Streaming In 'Next Question Suggestions' With Vercel's AI SDK
One extremely common pattern for AI-powered apps is to provide suggested next questions for the user to ask. This lets users who are not familiar with these interfaces get started quickly.
Let's look at a basic implementation: once a conversation is underway, suggested next questions stream in after each response.
High-Level Overview
The implementation involves a POST request to /api/chat, which receives UI messages from the request body. These UI messages get converted into model messages before the main processing begins.
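As a rough sketch, the start of that route handler might look like this, assuming the AI SDK's convertToModelMessages helper (the article doesn't show this wiring itself):

```ts
import { convertToModelMessages, type ModelMessage } from 'ai';

export const POST = async (req: Request) => {
  // Receive the UI messages from the request body
  const { messages }: { messages: MyMessage[] } =
    await req.json();

  // Convert them into model messages before the
  // main processing begins
  const modelMessages: ModelMessage[] =
    convertToModelMessages(messages);

  // ...main processing, described below
};
```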
The flow consists of two streams combined into one parent stream:
- The first stream for our initial response
- The second stream for follow-up suggestions
To compose these streams together, we create a parent stream with createUIMessageStream, which exposes a writer:
```ts
const stream = createUIMessageStream<MyMessage>({
  execute: async ({ writer }) => {
    // Stream initial response
    const messagesFromResponse = await streamInitialResponse(
      modelMessages,
      writer,
    );

    // Generate follow-up suggestions
    const followupSuggestions = generateFollowupSuggestions([
      ...modelMessages,
      ...messagesFromResponse,
    ]);

    // Stream the suggestions to frontend
    await streamFollowupSuggestionsToFrontend(
      followupSuggestions,
      writer,
    );
  },
});
```
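To send this parent stream back to the client, the handler would then wrap it in a response. A minimal sketch, assuming the AI SDK's createUIMessageStreamResponse helper (this step isn't shown in the article):

```ts
import { createUIMessageStreamResponse } from 'ai';

export const POST = async (req: Request) => {
  // ...build `stream` with createUIMessageStream as above

  // Wrap the combined UI message stream in an HTTP
  // response that the frontend can consume
  return createUIMessageStreamResponse({ stream });
};
```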
Streaming the Initial Response
Here's how the initial response is streamed:
```ts
const streamInitialResponse = async (
  modelMessages: ModelMessage[],
  writer: UIMessageStreamWriter<MyMessage>,
) => {
  // 1. Stream the initial response - can be any
  // streamText call with tool calls, etc.
  const streamTextResult = streamText({
    model: mainModel,
    messages: modelMessages,
  });

  // 2. Merge the stream into the UIMessageStream
  writer.merge(streamTextResult.toUIMessageStream());

  // 3. Consume the stream - this waits until the
  // stream is complete
  await streamTextResult.consumeStream();

  // 4. Return the messages from the response, to
  // be used in the followup suggestions
  return (await streamTextResult.response).messages;
};
```
This function calls streamText using the Gemini 2.0 Flash model (though you could use any model with the AI SDK). The stream is merged into the UI message stream using writer.merge, and we wait for the stream to complete with streamTextResult.consumeStream().
Finally, we return the messages produced by the call, which will be used in the next function.
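For completeness, mainModel isn't defined in the snippet above. Assuming the @ai-sdk/google provider (the article mentions Gemini 2.0 Flash), the models might be set up like this:

```ts
import { google } from '@ai-sdk/google';

// The model for the main response - any AI SDK
// model would work here
const mainModel = google('gemini-2.0-flash');

// The model for follow-up suggestions - this could
// be the same model or a smaller, cheaper one
const suggestionsModel = google('gemini-2.0-flash');
```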
Generating Follow-Up Suggestions
After getting the initial response, we pass both the original messages and the response messages to generateFollowupSuggestions:
```ts
const generateFollowupSuggestions = (
  modelMessages: ModelMessage[],
) =>
  // 1. Call streamObject, which allows us to stream
  // structured outputs to the frontend
  streamObject({
    model: suggestionsModel,
    // 2. Pass in the full message history
    messages: [
      ...modelMessages,
      // 3. And append a request for followup suggestions
      {
        role: 'user',
        content:
          'What question should I ask next? Return an array of suggested questions.',
      },
    ],
    // 4. These suggestions are made type-safe by
    // this Zod schema
    schema: z.object({
      suggestions: z.array(z.string()),
    }),
  });
```
This function uses streamObject with a Zod schema that defines an array of strings for suggestions. We could add a system prompt and further context engineering here (a sketch follows below), but this simple approach works well enough.
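If you did want to steer the suggestions, one option is a system prompt. A sketch, with hypothetical prompt wording:

```ts
import { streamObject } from 'ai';
import { z } from 'zod';

streamObject({
  model: suggestionsModel,
  // Hypothetical system prompt to shape the style
  // and number of suggestions
  system:
    'Suggest up to three short, concrete follow-up ' +
    'questions the user could ask next.',
  messages: [
    // ...the same message history as before
  ],
  schema: z.object({
    suggestions: z.array(z.string()),
  }),
});
```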
Streaming Suggestions to the Frontend
The follow-up suggestions get piped into streamFollowupSuggestionsToFrontend:
```ts
const streamFollowupSuggestionsToFrontend = async (
  // 1. This receives the streamObject result from
  // generateFollowupSuggestions
  followupSuggestionsResult: ReturnType<
    typeof generateFollowupSuggestions
  >,
  writer: UIMessageStreamWriter<MyMessage>,
) => {
  // 2. Create a data part ID for the suggestions - this
  // ensures that only ONE data-suggestions part will
  // be visible in the frontend
  const dataPartId = crypto.randomUUID();

  // 3. Read the suggestions from the stream
  for await (const chunk of followupSuggestionsResult.partialObjectStream) {
    // 4. Write the suggestions to the UIMessageStream
    writer.write({
      id: dataPartId,
      type: 'data-suggestions',
      data:
        chunk.suggestions?.filter(
          // 5. Because of some AI SDK type weirdness,
          // we need to filter out undefined suggestions
          (suggestion) => suggestion !== undefined,
        ) ?? [],
    });
  }
};
```
The suggestions are treated as a custom data part of the message. We define the type of this message by specifying a UIMessage, passing never as the first type argument (the message metadata) and an object containing suggestions: string[] as the second (the custom data parts).
Type Safety for Custom Message Parts
We declare our custom message type to ensure type safety:
```ts
export type MyMessage = UIMessage<
  never,
  {
    suggestions: string[];
  }
>;
```
This makes our code type-safe when writing to streams - we can only pass a string array to the data-suggestions part.
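In other words, a write with the wrong data shape fails to compile. For example:

```ts
// OK: data is a string[]
writer.write({
  id: dataPartId,
  type: 'data-suggestions',
  data: ['What is the capital of Spain?'],
});

// Type error: number[] is not assignable to string[]
writer.write({
  id: dataPartId,
  type: 'data-suggestions',
  data: [1, 2, 3],
});
```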
Frontend Implementation
In the frontend, we use the useChat hook with our custom message type:
```ts
const { messages, sendMessage } = useChat<MyMessage>({});

const [input, setInput] = useState('');

const latestSuggestions = messages[
  messages.length - 1
]?.parts.find(
  (part) => part.type === 'data-suggestions',
)?.data;
```
We extract the latest suggestions from the most recent message's parts. These might be undefined if we have no messages yet or if suggestions haven't started streaming.
The suggestions are then rendered as buttons:
```tsx
<ChatInput
  suggestions={
    messages.length === 0
      ? [
          'What is the capital of France?',
          'What is the capital of Germany?',
        ]
      : latestSuggestions
  }
  input={input}
  onChange={(text) => setInput(text)}
  onSubmit={(e) => {
    e.preventDefault();
    sendMessage({
      text: input,
    });
    setInput('');
  }}
/>
```
We also provide default suggestions if there are no messages yet. When a user clicks a suggestion button, it populates the input field.
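ChatInput itself is a custom component the article assumes exists. A minimal sketch of how it might render the suggestion buttons (hypothetical props and markup):

```tsx
import type { FormEvent } from 'react';

const ChatInput = (props: {
  suggestions: string[] | undefined;
  input: string;
  onChange: (text: string) => void;
  onSubmit: (e: FormEvent) => void;
}) => (
  <form onSubmit={props.onSubmit}>
    {props.suggestions?.map((suggestion) => (
      <button
        key={suggestion}
        type="button"
        // Clicking a suggestion populates the input field
        onClick={() => props.onChange(suggestion)}
      >
        {suggestion}
      </button>
    ))}
    <input
      value={props.input}
      onChange={(e) => props.onChange(e.target.value)}
    />
  </form>
);
```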
Summary
This pattern allows us to stream suggestions to the frontend from the same API endpoint as the rest of our content, creating a seamless experience for users. The suggestions update in real time as they become available, helping users navigate the conversation more easily.
The key components are:
- A unified stream combining initial response and suggestions
- Type-safe message parts for structured data
- Real-time streaming of suggestions to the frontend
This approach creates a more guided, user-friendly experience for AI conversations.