Navigating Complex Search Tasks with AI Copilots: Challenges

26 Apr 2024

This paper is available on arXiv under a CC 4.0 license.

Authors:

(1) Ryen W. White, Microsoft Research, Redmond, WA, USA.

• Abstract and Taking Search to Task

• AI Copilots

• Challenges

• Opportunities

• The Undiscovered Country and References

3 CHALLENGES

Despite the promise of copilots, there are significant challenges that must be acknowledged and overcome. These include issues with the output that copilots show in response to searcher requests, the impacts that copilots can have on searchers, and shifts in the degree of agency that humans retain in the search process once copilots are introduced.

• Hallucinations: Searchers rely heavily on the answers from copilots, but those answers can be erroneous or nonsensical. So-called “hallucination” is a well-studied problem in foundation models [24]. Copilots can hallucinate for many reasons, one of the main ones being gaps in the training data. RAG, discussed earlier, helps address this by ensuring that the copilot has access to up-to-date, relevant information at inference time to ground its responses. Injecting knowledge from other external sources, such as knowledge graphs and Wikipedia, can also improve the accuracy of copilot responses. Related to copilots surfacing misinformation is toxicity (i.e., offensive or harmful content), which can also be present in copilot output and must be mitigated before answers are shown to searchers.
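To make the grounding idea concrete, the sketch below shows the basic RAG pattern: retrieve the best-matching passages for a query and prepend them to the prompt so the model answers from evidence rather than from memory. Everything here is illustrative, not from the paper: the two-document corpus, the naive term-overlap retriever, and the prompt template are assumptions; a production system would use a vector index and a foundation-model API.

```python
# Minimal RAG sketch (illustrative only; names and corpus are assumptions).
CORPUS = {
    "doc1": "The James Webb Space Telescope launched on 25 December 2021.",
    "doc2": "Retrieval-augmented generation grounds model output in sources.",
}

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank documents by naive term overlap with the query."""
    q_terms = set(query.lower().split())
    scored = sorted(
        CORPUS.values(),
        key=lambda d: len(q_terms & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str) -> str:
    """Prepend retrieved passages and instruct the model to answer only
    from them, reducing the chance of hallucinated facts."""
    passages = "\n".join(f"- {p}" for p in retrieve(query))
    return (
        "Answer using ONLY the sources below; say 'unknown' if they "
        f"do not contain the answer.\nSources:\n{passages}\n\nQ: {query}\nA:"
    )

print(build_prompt("When did the James Webb telescope launch?"))
```

The key design choice is that the retrieval step, not the model's parametric memory, supplies the facts; the instruction to answer "unknown" when the sources are silent is one common mitigation for gaps in coverage.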

Table 1: Anecdotal examples of high-level tasks from Bloom’s taxonomy [27] of varying complexities from searcher/copilot perspectives. Tasks such as ‘Find’ and ‘Analyze’ have similar complexities for both humans and machines. It is easier for machines to create content than for humans, but more difficult for machines to verify the correctness of information.

• Biases: Biases in the training data, e.g., social biases and stereotypes [31], affect the output of foundation models and hence the answers provided by copilots. Synthesizing content from different sources can amplify the biases in this data. As with hallucinations, this is a well-studied problem [6]. Copilots are also subject to biases from learning from their own or other AI-generated content (via feedback loops): biased historical sequences lead to biased downstream models. Copilots may also amplify existing cognitive biases, such as confirmation bias, by favoring responses that align with searchers’ existing beliefs and values, and by providing responses that keep searchers engaged with the copilot, regardless of the ramifications for the searcher.
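The feedback-loop risk can be made concrete with a toy simulation (entirely illustrative; the numbers and the stand-in "model" are assumptions, not from the paper): a generator reproduces the label frequencies seen in its training data with a small systematic tilt toward the majority, and its outputs are folded back into the next round of training data, so the majority share drifts upward generation after generation.

```python
# Toy simulation of bias amplification via a training feedback loop.
# Illustrative only: the 5% tilt and 55/45 split are assumptions.
import random

random.seed(0)

def train_and_generate(data: list[int], n_outputs: int) -> list[int]:
    """Stand-in 'model': emits label 1 at the rate seen in its training
    data, plus a small systematic tilt toward the majority label."""
    p = min(1.0, (sum(data) / len(data)) * 1.05)
    return [1 if random.random() < p else 0 for _ in range(n_outputs)]

data = [1] * 55 + [0] * 45  # slightly skewed human-authored data
for gen in range(5):
    outputs = train_and_generate(data, 1000)
    data = data + outputs  # AI output recycled as training data
    print(f"gen {gen}: majority-label share = {sum(data) / len(data):.2f}")
```

Even a small per-generation tilt compounds: after a few rounds the skew in the mixed human/AI corpus is visibly larger than in the original human data, which is the essence of the "biased historical sequences lead to biased downstream models" concern.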

• Human learning: Learning may be affected or interrupted by the use of AI copilots, since they remove the need for searchers to engage as fully with the search system and the information retrieved. Learning is already a core part of the search process [32, 37, 54]. Both exploratory search and search as learning involve considerable time and effort in finding and examining relevant content. While this could be viewed as a cost, this deep exposure to content also helps people learn. As mentioned earlier, copilot users can ask richer questions (allowing them to specify their tasks and goals more fully), but they then receive synthesized answers generated by the copilot, creating fewer, new, or simply different learning opportunities for humans; these effects must be understood.

• Human control: Supporting search requires considering the degree of searcher involvement in the search process, which varies depending on the search task [2]. Copilots enable more strategic, higher-order actions (higher up the “task tree” of Figure 1 than typical interactions with search systems). It is clear that searchers want control over the search process: they want to know what information is and is not being included, and why. This helps them understand and trust system output. As things stand, copilot users delegate full control of answer generation to the AI, while the rest is mixed, i.e., less control of search mechanics (queries, etc.) but more control of task specification (via natural language and dialog). There is more than just a basic tension between automation and control; in reality, it is not a zero-sum game. Designers of copilots need to ensure human control while increasing automation [44]. New frameworks for task completion are moving in this direction. For example, AutoGen [67] uses multiple specialized assistive AI copilots that engage with humans and with each other directly to help complete complex tasks, with humans staying informed and in control throughout.
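A hedged sketch of that pattern follows, using the pyautogen v0.2 API; the agent names, system message, and config values are assumptions for illustration, not from the paper. A specialized assistant plans the task while a UserProxyAgent with human_input_mode="ALWAYS" pauses for the searcher at every turn, so automation increases without the human surrendering control.

```python
# Sketch of a human-in-the-loop multi-agent setup in the style of
# AutoGen [67]. Agent names and config values are illustrative.
from autogen import AssistantAgent, UserProxyAgent

llm_config = {"config_list": [{"model": "gpt-4", "api_key": "..."}]}

planner = AssistantAgent(
    name="planner",
    system_message="Decompose the searcher's task into verifiable steps.",
    llm_config=llm_config,
)

# human_input_mode="ALWAYS" pauses for the human at every turn,
# keeping the searcher informed and in control throughout.
searcher = UserProxyAgent(
    name="searcher",
    human_input_mode="ALWAYS",
    code_execution_config=False,
)

searcher.initiate_chat(
    planner,
    message="Compare three conference venues and recommend one.",
)
```

The design point is that control is configurable rather than binary: the same agent topology can run fully autonomously (human_input_mode="NEVER") or with the human approving each step, letting designers tune the automation/control balance per task.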

Overall, these are just a few of the challenges that affect the viability of copilots. There are other challenges, such as deeply ingrained search habits that may be a barrier to the adoption of new search functionality, despite the clear benefits to searchers.