Intro
In the previous chapter, we explored the patterns of AI roles in products — the Guide, Companion, and Driver. While discussing the Driver, I briefly touched on how AI builds and reshapes context around it.
Now, let’s shift our focus to how AI interacts with information across different contexts, as product experiences ultimately revolve around the journey to find information. (If that sounds unclear, feel free to revisit the previous chapter.)
To clarify, I’ll break down the three levels of context we’re dealing with:
View-Specific Context: As the name suggests, this context is confined to the view that the user is currently looking at. It doesn’t involve any information outside of this view.
Cross-View Context (Product-Level Context): This is a step up from view-specific context. It includes all the information within the product that isn’t visible in the current view but exists in other views. Think of it as a collection of multiple view-specific contexts.
External Context: Unlike the previous two, external context is established independently of the product. It’s the information and context that exist outside the product environment.
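To make this taxonomy concrete before we go further, here is a minimal sketch of how the three levels might be modeled in code. Every name here, from the types to the example request, is hypothetical; it's just a way to hold the levels in your head.

```typescript
// Hypothetical model of the three context levels as a discriminated union.
type ContextLevel =
  | { kind: "view-specific"; viewId: string }   // only what's on screen
  | { kind: "cross-view"; viewIds: string[] }   // other views in the product
  | { kind: "external"; sources: string[] };    // outside the product entirely

// An AI request is rarely one level alone: it is a blend of levels.
interface ContextMix {
  primary: ContextLevel;
  secondary?: ContextLevel; // the "sprinkled" portion, discussed below
}

const catchUpRequest: ContextMix = {
  primary: { kind: "view-specific", viewId: "channel:proj-eagle" },
  secondary: { kind: "external", sources: ["model-world-knowledge"] },
};
```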
These three levels of context exist with or without AI. What we’re focusing on here is how AI interacts with these different levels of context to create AI-driven experiences within products.
Why Does Context Matter So Much in AI-Powered Products?
Context matters particularly in products with AI engines because, unlike traditional products, AI interactions involve not only the information rendered on the screen but also context outside the view — often to a significant extent. In products without AI, the focus is primarily on the information and context presented within the visible interface.
In AI-powered products, however, the situation is more complex. Sometimes, AI’s responses are “mainly” tied directly to what’s visible on the screen. Other times, AI processes information that’s “barely” linked to what the user is currently viewing, instead drawing heavily on external context beyond the current view.
Notice that I said “mainly” and “barely”. That’s because there’s no clear-cut division in AI products — there isn’t a scenario where something is purely view-specific or purely external. Instead, there’s always a continuous spectrum, a blend of both view-specific and external context.
In this discussion, we’ll group instances within this spectrum into three categories. We’ll explore how different combinations of view-specific and external context shape user experiences and how these contexts are communicated. Additionally, we’ll examine real-world examples for each case, critically evaluating them and considering possible improvements.
1. View-Specific Context-Based AI
2. Cross-View Context-Based AI
3. External Context (Out-of-Product)-Based AI
4. Recap and Challenges
1. View-Specific Context-Based AI
In this scenario, the AI is essentially “looking” at the screen alongside the user. The primary expectation from the user is for the AI to find information within the immediate context and possibly take some action based on that information. However, the crucial first step is always to “check” if the desired information is actually present.
To make this more concrete, let’s explore a few examples from real products.
Example — Slack Catchup AI
Let’s consider Slack’s AI-generated summary of a specific chat channel. (Unfortunately, Deloitte doesn’t use Slack, so I had to rely on a YouTube product demo for a glimpse of the screen — sad, I know!)
In the screenshot, the AI-generated summary on the right-hand side is based on the chat history on the left. The AI first scans the chat history to see what’s present in the view-specific context. After that, it “sprinkles” a bit of external context into the summary — for example, interpreting what something in the view-specific context “normally” means in a broader, common-sense scenario. This broader context isn’t derived from the view itself but from information beyond it.
However, it’s important to emphasize the “sprinkling” rather than “pouring” of external context. The main expectation from the user is, “Hey, does X exist here?” Overloading the view with too much external context could mislead the user into thinking that certain information exists within the view when it actually doesn’t.
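Here is a rough sketch of what "sprinkling, not pouring" could look like at the prompt level. This is my own illustration, not Slack's actual implementation: the summary is grounded in the visible messages, and external knowledge is explicitly fenced in.

```typescript
interface Message { author: string; text: string; }

// Ground the prompt in the view-specific context; allow external knowledge
// only as light interpretation, never as new "facts" about the channel.
function buildSummaryPrompt(visibleMessages: Message[]): string {
  const transcript = visibleMessages
    .map((m) => `${m.author}: ${m.text}`)
    .join("\n");
  return [
    "Summarize ONLY the conversation below.",
    "You may add brief common-sense interpretation of terms that appear,",
    "but do not introduce facts that are absent from the transcript.",
    "If asked about something not present, say that it is not here.",
    "---",
    transcript,
  ].join("\n");
}
```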
Communication of the context
So, how is this user expectation communicated in the interface? The short answer is through “juxtaposing” or “stacking” views.
Similar to Slack AI providing context about what’s visible in the chat, Amazon’s Rufus AI does the same for what’s visible on the screen — though in this case, it’s a book. Now, here’s a fun (and slightly extreme) example to illustrate the point:
Take a look at the screen. (I picked an extreme case to drive this home.) The screen shows that the author of the book is Meredith Davis, yet when the question "Who's the author?" is asked, Rufus AI responds with "Jessica Helfand." What went wrong? The Rufus chat window persists across different views in the Amazon app: context from a previous view was carried over and overlaid on the current one, producing an answer grounded in a view the user was no longer looking at.
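Before we get to the fix, here is a hedged reconstruction of the failure mode in code. All names are hypothetical, and this is certainly not Amazon's implementation; a naive guard that re-grounds the session on navigation already hints at the direction.

```typescript
interface ChatSession { contextId: string | null; history: string[]; }

// Stale context from the previous product page rides along into the new view.
const session: ChatSession = { contextId: "book:jessica-helfand", history: [] };

// Naive guard: re-ground the session whenever the user navigates, instead of
// letting the old view's context silently answer for the new one.
function onNavigate(session: ChatSession, newViewId: string): void {
  if (session.contextId !== newViewId) {
    session.contextId = newViewId; // re-ground to what the user now sees
    session.history = [];          // or: visibly mark the old thread as stale
  }
}
```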
So, how should this be handled? Let’s explore that next.
2. Cross-View Context-Based AI (Product-Level Context)
In this scenario, the context isn’t confined to just one specific view. Instead, it extends across multiple views within the product. Here, the challenge becomes how to visually communicate information that isn’t directly visible to the user at any given moment.
It might sound counterintuitive, but the product must find a way to visually represent these “invisible” elements to ensure the user understands the broader context.
Example — Asana AI (Generating Updates Report)
Consider Asana's AI feature that generates an update report. In the first screen, each item in the list represents its own view-specific context, but when you generate a report in the second view, the AI needs to gather information from across multiple views — this is where cross-view context comes into play.
Unlike the view-specific context where the AI focuses on just one view, here the AI needs to visit every relevant view within the product to gather information. Once it’s done, similar to the view-specific context, the AI may “sprinkle” some external context into the report to provide a more comprehensive overview.
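As an illustration, the gathering step might look like the sketch below. fetchView and buildReportContext are hypothetical stand-ins, not Asana's API; the point is simply that every relevant view is visited before anything external enters the report.

```typescript
// Hypothetical data-layer call standing in for reading one view's content.
async function fetchView(viewId: string): Promise<string> {
  return `contents of ${viewId}`;
}

// Consolidate the in-product information first; external context is
// sprinkled in only after this pass is complete.
async function buildReportContext(relevantViewIds: string[]): Promise<string> {
  const sections = await Promise.all(relevantViewIds.map(fetchView));
  return sections.join("\n\n");
}
```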
Communication of the context
The key challenge in cross-view context is that “no single view” can represent the entire context. This might seem obvious, but it’s often overlooked in design. Let’s revisit Slack for a clearer understanding.
Imagine a user trying to find out “who is working on project eagle?” by prompting the AI on a specific channel. Is the user trying to find if “project eagle” exists in that specific channel (view-specific context)? Or are they looking for it across all channels in Slack (cross-view context)?
Now, let’s examine a different screen to answer this question.
In the first screen, the user was searching within a specific channel, while in the second screen, they were searching across multiple channels. You might have figured this out easily, either because you have a keen eye for visual details or because you're quite familiar with Slack.
The potential for miscommunication here stems from the approach used in the view-specific context. The information displayed on the screen points to a specific section, while the actual search extends beyond that section. This broader context is only indicated by a small chip in the prompt window (red highlighted items). Although visually prominent, this might still be confusing because the user’s mental model is heavily grounded in the current view. Expecting users to intuitively shift their mental model to consider a cross-view or product-level context might be asking too much.
So, how should we effectively communicate the concept of cross-view or product-level context without tying it to one specific view? The answer lies in the fact that no single view can represent this broader context. If such a view doesn’t exist, then it’s best not to show it.
Consider an example that successfully communicates this concept. When dealing with cross-view context, the screen deliberately shows an “empty” state along with an effective call-to-action (CTA) to nudge users toward understanding not only what’s happening but also how things should proceed.
The system then asks the user which views should be considered when answering the prompt. This approach clearly signals to the user that they are dealing with cross-view context. Slack does a fantastic job with this, even though there was some unnecessary confusion in the earlier “project eagle” example.
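The pattern is tiny but deliberate. A hypothetical sketch: until the user has picked views, the system surfaces the scope question rather than guessing.

```typescript
interface CrossViewQuery { prompt: string; selectedViewIds: string[]; }

// The deliberate empty state: with no views selected, show the CTA
// ("Which channels should I search?") instead of assuming a scope.
function nextStep(query: CrossViewQuery): "ask-for-scope" | "run-query" {
  return query.selectedViewIds.length === 0 ? "ask-for-scope" : "run-query";
}
```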
Allowing users to choose or narrow down specific views helps them quickly grasp whether certain information exists in location A or B. However, does it always matter? For example, if you’re trying to understand the main revenue driver for Toyota Motors, does it matter whether the information comes from Toyota’s homepage or its annual report? It might have some significance in terms of information reliability, but often, when exploring information, you might not know exactly where it resides.
The difference between the cases — “I want to recap whatever information is in these channels” versus “I want to know the revenue driver of the company, regardless of where the information is” — illustrates how different types of context are communicated. This brings us to the next context type: External Context (Out-of-Product Context)-Based AI.
3. External Context (Out-of-Product)-Based AI
At the far right end of the context mix spectrum, we encounter external context-based AI. This type of AI interaction starts with an almost empty interface, as it has minimal or no pre-existing context within the product itself. Instead, the focus is on leveraging information from outside the product.
In this scenario, the view-specific context typically begins from zero. The AI and user gradually build up the context through interaction. While this might seem counterintuitive, it’s crucial to differentiate this from cross-view context.
Example — Amplyfi, ChatGPT, Perplexity
When a user first interacts with the product, there's no context provided within the product itself; the AI relies heavily on external information to generate responses.
Take ChatGPT as an example: the initial interaction happens on an empty interface, and as the user prompts the AI, the interaction gradually shapes the context, drawing from outside the product rather than from the interface itself.
Hacking credit card bonus offers is my favorite trick in ChatGPT. (It’s not really hacking — just a quick TL;DR to help me maximize those perks.)
In external context-based AI, the interaction process is about shaping context together. This differs from cross-view context, which starts with existing information and then incorporates external context.
Key Differences:
Cross-View Context: Focuses on the existence and location of information across multiple views before integrating external context.
External Context: Emphasizes gathering and integrating information from outside the product, often without a predefined scope.
For Cross-View Context, it’s important to clarify the boundaries of different views and specify the locality within the product. This ensures that users understand where information is coming from and how it relates to what they’re currently seeing. As seen in Slack’s example, this is crucial for effective communication.
On the other hand, External Context often involves gathering information from sources outside the product, without a specific view or scope. In this case, trying to confine the scope can be less effective and might complicate the user experience. Rather than focusing on predefined views, it’s often more beneficial to provide a broader exploration of external information, even if it means the context isn’t immediately visible.
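To pin the difference down, here is a small hypothetical sketch of the two answer contracts; neither branch reflects any particular product's implementation.

```typescript
type RetrievalMode =
  | { kind: "cross-view"; scope: string[] } // explicit, user-visible boundaries
  | { kind: "external" };                   // no predefined scope

// Cross-view answers can report *where* information lives; external answers
// discover their sources during the search and cite them afterwards.
function describeAnswerContract(mode: RetrievalMode): string {
  return mode.kind === "cross-view"
    ? `Answer states which of [${mode.scope.join(", ")}] held the information.`
    : "Answer attaches source links discovered during the open search.";
}
```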
Challenges in communicating a concept that conflicts with the mental model
Discussing external context often involves framing it in a linear or top-down manner, which can sound reasonable in theory. However, the actual user experience and mental processes are typically less linear and more exploratory. Users may not always be aware of all the context around them or know exactly where their information search will lead. This discrepancy between how we discuss external context and how it actually works in practice can make it challenging to communicate effectively.
Example — Amplyfi
When a user first encounters a product, it might ask them what scope their query should cover. This raises a crucial question: “Will users actually explore information this way?” For instance, searching for “Toyota Motor information from Toyota Motor Company Filings” seems logical, but why not just ask for “Toyota Motor information”?
Since the primary goal of an external-context experience is NOT to check whether something exists, scoping out the context doesn't add much value; rather, it creates unnecessary frustration.
Specifying a source might seem precise, but it can be problematic if users “aren’t aware of the exact locations” of relevant information. The issue arises when users “must remember which sources contain which” information, or when they are expected to search through predefined scopes.
This can lead to unnecessary frustration, especially if the information they seek isn’t in the selected scope but somewhere else. A better approach might be to provide source links along with the relevant information, ensuring users can verify the information themselves.
In essence, the goal of external context is not to verify the existence of information within specific scopes but to gather and integrate information from outside the product. Unlike view-specific context, which focuses on what is available in the current view, external context interactions often require users to explore without predefined boundaries. This approach might seem less structured, but it opens the door to a broader exploration.
Perplexity's Discover tab seems to strike this fine balance, demonstrating one possible solution for handling external context successfully.
Initially, the Discover tab presents information regardless of the user’s current context, acknowledging that there is no pre-existing context. This design encourages users to explore a wide range of information without being constrained by specific scopes.
Once the user clicks one of the items in the external context, they enter a specific context, and the experience shifts into more of a "view-specific" mode while remaining open to external information. One noteworthy detail in the screen below is that it overlays/juxtaposes the context with the prompt, which successfully conveys the idea; even after the dedicated view for the view-specific context opens, the base context stays visible at the top. (What a nice job.)
4. Recap and Challenges
The implications of these different levels of context, and the ways to communicate them, are actually what made me start this entire series; the sequence of the series is roughly the opposite of how the content was written. I'd like to recap what we covered in this chapter by sharing a design decision I had to make and how everything we've covered led me to the final call.
(1) Project Background
The product is designed to provide market information in a dashboard format, but it also enables AI-powered chat on the interface so that users can ask follow-up questions or explore information that isn't provided in the default view (drilling down further into the context).
(2) Design Decision Moment
The team proposed adding a "history" feature to the chat so that users could revisit previous interactions not tied to the current context.
(3) Challenges
To justify this feature, I needed to explain how context functions differently in AI-powered products. As illustrated, the chat interface we designed operates within a ‘view-specific’ context. Here, the interactions are based on the current view, visually communicated by ‘overlaying’ or ‘juxtaposing’ views.
Conversely, ChatGPT uses an external-based context, dynamically shaped by prompts rather than existing views. As shown in the screenshot, selecting a previous chat history reshapes the entire context to reflect the chosen history: the information displayed on screen pertains solely to that history, with no trace of the prior view context and no elements pulled in from other views. This differs from the product I was designing, where a view-driven context is predefined by the screen before any AI interaction takes place.
The context that the chat is based on (below, in yellow) and the context displayed in the current view (above, in pink) can conflict, leading to unnecessary confusion, similar to issues seen with Amazon Rufus.
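Here is a minimal sketch of the guard this reasoning led me toward, with hypothetical names; read it as a design note in code, not the shipped implementation.

```typescript
interface SavedChat { groundedViewId: string; messages: string[]; }

// If the restored conversation was grounded in a different view than the one
// on screen, surface the mismatch instead of silently overlaying the two.
function restoreChat(
  saved: SavedChat,
  currentViewId: string
): { messages: string[]; warning?: string } {
  if (saved.groundedViewId !== currentViewId) {
    return {
      messages: saved.messages,
      warning: `This conversation was about "${saved.groundedViewId}", not the view you are looking at now.`,
    };
  }
  return { messages: saved.messages };
}
```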
Now that we’ve explored how different types of context are communicated through interfaces, our next series will focus on the different response types in prompts. We’ll categorize these response types and examine how they are visualized to set appropriate user expectations.