Vercel’s AI SDK v5 is a popular framework designed to help teams build with AI. It is an abstraction that allows users to switch between different LLM providers and offers a pluggable way to define tools, agents, and transports.
This guide is only for those using AI SDK’s useChat abstraction. If you are not using
useChat, you do not need to follow this guide.
How Streamstraight integrates with AI SDK
Streamstraight extends AI SDK’s native resume stream functionality by providing a custom transport. By using Streamstraight, you will be able to:
- Automatically resume an in-flight stream created through useChat if the client reconnects to your server after a disconnect
- Process your LLM generation stream on the server in addition to sending it to the client
With Streamstraight, you will be able to retrieve the entire contents of the stream even after the stream has ended. AI SDK’s resumable-stream implementation returns no data if the client reconnects after the stream has ended.
Implementation
1. Enable stream resumption on the client
We reuse AI SDK’s resume option in useChat to enable stream resumption. If useChat is mounted when an existing stream is active, Streamstraight will replay and tail that stream from the beginning. If it is mounted when the existing stream has ended or is not active, nothing will happen.
import { useChat } from "@ai-sdk/react";
import { StreamstraightChatTransport } from "@streamstraight/client";
import type { UIMessage } from "ai";
import { useState } from "react";

async function fetchStreamstraightToken(): Promise<string> {
  const response = await fetch("/api/streamstraight-token", { method: "POST" });
  if (!response.ok) throw new Error("Failed to fetch Streamstraight token");
  const { jwtToken } = (await response.json()) as { jwtToken: string };
  return jwtToken;
}

export function ChatComponent({
  chatData,
}: {
  chatData: { id: string; messages: UIMessage[]; currentlyStreaming: boolean };
}) {
  const [textInput, setTextInput] = useState<string>("");

  const { messages, status, error, sendMessage, resumeStream } = useChat({
    id: chatData.id,
    messages: chatData.messages,
    transport: new StreamstraightChatTransport({
      fetchToken: fetchStreamstraightToken,
      api: "/api/chat", // Or wherever your server route handler is located
    }),
    // Set this to true. When useChat is mounted, it will try to resume
    // the stream if one is in-progress for this chatId.
    resume: true,
  });

  return (
    <div>
      {/* Your chat UI */}
      <input value={textInput} onChange={(e) => setTextInput(e.target.value)} />
      <button onClick={() => sendMessage({ text: textInput })}>Send</button>
    </div>
  );
}
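The transport above expects a /api/streamstraight-token route that returns a JSON body shaped like { jwtToken: string }. How you mint that token depends on Streamstraight's server-side auth and is not covered in this guide; the sketch below only illustrates the response shape the client expects, with mintStreamstraightJwt standing in as a hypothetical helper.

app/api/streamstraight-token/route.ts
// Sketch only: mintStreamstraightJwt is a hypothetical placeholder for however
// you mint a client token with Streamstraight on your server.
import { mintStreamstraightJwt } from "@/lib/streamstraight-auth";

export async function POST() {
  const jwtToken = await mintStreamstraightJwt();
  // fetchStreamstraightToken on the client expects exactly this shape.
  return Response.json({ jwtToken });
}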
2. Set up your server route handler
There are two ways to implement the server route handler. Choose the approach that best fits your needs:
- Approach 1 (Recommended): Return an SSE stream directly from your POST handler
- Approach 2: Return a JSON response and stream to Streamstraight in an async job
Approach 1: SSE Response (Recommended)
useChat calls a POST server route handler, which calls the streamText function. By default, this handler is located at /api/chat and returns a message stream response.

app/api/chat/route.ts
import { openai } from "@ai-sdk/openai";
import { streamstraightServer } from "@streamstraight/server";
import { readChat, saveChat } from "@util/chat-store";
import { convertToModelMessages, generateId, streamText, type UIMessage } from "ai";

// Defined in https://github.com/vercel/ai/blob/61a16427ad1838d274fe2718ace3e3afc5f83d58/packages/ai/src/ui/http-chat-transport.ts#L183
interface ApiChatData {
  /** Unique identifier for the chat session */
  id: string;
  /** Array of UI messages representing the conversation history */
  messages: Array<UIMessage>;
  /** ID of the message to regenerate, or undefined for new messages */
  messageId?: string;
  /** The type of message submission - either new message or regeneration */
  trigger: "submit-message" | "regenerate-message";
  /** Only present when you update prepareSendMessagesRequest */
  message?: UIMessage;
}

export async function POST(req: Request) {
  const requestData = (await req.json()) as ApiChatData;
  if (!requestData.messages) {
    return new Response(JSON.stringify({ error: "Invalid request payload" }), {
      status: 400,
      headers: { "Content-Type": "application/json" },
    });
  }

  const chat = await readChat(requestData.id);
  // By default the transport sends the full message history, so the last
  // entry is the newly submitted user message.
  const userMessage =
    requestData.message ?? requestData.messages[requestData.messages.length - 1];
  const messages = [...chat.messages, userMessage];

  // Save the user message and mark the stream as active
  saveChat({ id: requestData.id, messages, currentlyStreaming: true });

  const result = streamText({
    model: openai("gpt-4.1-mini"),
    messages: convertToModelMessages(requestData.messages),
  });

  return result.toUIMessageStreamResponse({
    // Provide the existing history so onFinish receives the full updated message list
    originalMessages: messages,
    generateMessageId: generateId,
    onFinish: ({ isAborted, isContinuation, messages, responseMessage }) => {
      // Save to chat however you do so, and mark the stream as finished
      saveChat({ id: requestData.id, messages, currentlyStreaming: false });
    },
    // This method allows you to consume the SSE stream however you want, in addition
    // to returning it from this POST endpoint to the client. Here we consume the
    // SSE stream by sending it to Streamstraight.
    async consumeSseStream({ stream }) {
      const server = await streamstraightServer(
        // API_ORIGIN is your Streamstraight base URL, defined elsewhere in your app
        { apiKey: process.env.STREAMSTRAIGHT_API_KEY!, baseUrl: API_ORIGIN },
        {
          // Use chatId as the unique streamId. This means that Streamstraight will keep
          // at most a single stream for each chat. Subsequent streams will overwrite
          // previous ones in the same chat.
          // We must do it this way because AI SDK does not generate a unique messageId
          // until after the message has been fully generated.
          streamId: requestData.id,
          // This must be true for AI SDK, because we are unique on chatId
          overwriteExistingStream: true,
        },
      );
      await server.stream(stream);
    },
  });
}
This approach is simpler and recommended for most use cases. The stream is returned directly to the client via SSE while simultaneously being sent to Streamstraight for use during resumptions.
Approach 2: JSON Response with Async Job
This approach returns a JSON response immediately and streams to Streamstraight in a background job. This is useful when you want to return metadata immediately and initiate streaming from an async process.
First, update your transport configuration on the client to always read from Streamstraight:
interface MyJsonResponse { ... }

const { messages, status, error, sendMessage, resumeStream } = useChat({
  id: chatData.id,
  messages: chatData.messages,
  transport: new StreamstraightChatTransport<UIMessage, MyJsonResponse>({
    fetchToken: fetchStreamstraightToken,
    api: "/api/chat",
    initialStreamSource: "streamstraight",
    initialHttpResponseType: "json",
    onInitialHttpJsonResponse: (json: MyJsonResponse) => {
      // optional; do something with the HTTP response here
    },
  }),
  // Set this to true. When useChat is mounted, it will try to resume
  // the stream if one is in-progress for this chatId.
  resume: true,
});
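For reference, MyJsonResponse should mirror whatever JSON your POST handler returns. If you use the Approach 2 handler below, which responds with a status and stream ID, it could be as simple as:

interface MyJsonResponse {
  status: "streaming";
  streamId: string;
}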
Then implement your server route handler:

app/api/chat/route.ts
import { openai } from "@ai-sdk/openai";
import { streamstraightServer } from "@streamstraight/server";
import { readChat, saveChat } from "@util/chat-store";
import { convertToModelMessages, generateId, streamText, type UIMessage } from "ai";
import { after } from "next/server";

interface ApiChatData {
  id: string;
  messages: Array<UIMessage>;
  messageId?: string;
  trigger: "submit-message" | "regenerate-message";
  // Only present when you update prepareSendMessagesRequest
  message?: UIMessage;
}

export async function POST(req: Request) {
  const requestData = (await req.json()) as ApiChatData;
  if (!requestData.messages) {
    return new Response(JSON.stringify({ error: "Invalid request payload" }), {
      status: 400,
      headers: { "Content-Type": "application/json" },
    });
  }

  const chat = await readChat(requestData.id);
  const userMessage =
    requestData.message ?? requestData.messages[requestData.messages.length - 1];
  const messages = [...chat.messages, userMessage];

  // Save the user message and mark the stream as active
  saveChat({ id: requestData.id, messages, currentlyStreaming: true });

  const result = streamText({
    model: openai("gpt-4.1-mini"),
    messages: convertToModelMessages(requestData.messages),
  });

  const uiMessageStream = result.toUIMessageStream({
    // Provide the existing history so onFinish receives the full updated message list
    originalMessages: messages,
    generateMessageId: () => `message-server-${generateId()}`,
    onFinish: ({ messages: finalMessages }) => {
      // Persist the finished conversation and mark the stream as ended
      saveChat({ id: requestData.id, messages: finalMessages, currentlyStreaming: false });
    },
  });

  // Stream to Streamstraight in a background job
  // Use your framework's async job handler (e.g., Next.js's after(), BullMQ, etc.)
  after(async () => {
    try {
      const server = await streamstraightServer(
        // API_ORIGIN is your Streamstraight base URL, defined elsewhere in your app
        { apiKey: process.env.STREAMSTRAIGHT_API_KEY!, baseUrl: API_ORIGIN },
        {
          streamId: requestData.id,
          overwriteExistingStream: true,
        },
      );
      await server.stream(uiMessageStream);
    } catch (error) {
      console.error("[chat] failed to forward stream", { error });
    }
  });

  // Return JSON response immediately
  return Response.json(
    { status: "streaming", streamId: requestData.id },
    { status: 202 },
  );
}
This approach allows you to return metadata immediately while streaming happens in the background. The client will receive all tokens through Streamstraight.
Note that you can transform the LLM generation into any custom data type, and stream it with Streamstraight.
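For example, assuming server.stream accepts any ReadableStream (the handlers above pass it both an SSE stream and a UI message stream), a minimal sketch that wraps each chunk in a custom envelope before streaming might look like this; the envelope shape is purely illustrative:

// Inside the background job from Approach 2, before calling server.stream:
const customStream = uiMessageStream.pipeThrough(
  new TransformStream({
    transform(chunk, controller) {
      // Wrap each AI SDK chunk in your own data type
      controller.enqueue({ kind: "ai-sdk-chunk", receivedAt: Date.now(), chunk });
    },
  }),
);
await server.stream(customStream);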
How it works
The streaming flow works differently depending on which approach you choose:
Approach 1 (SSE Response): The server LLM stream is split into two—one is returned through Server-Sent Events to the client, and the other is passed to Streamstraight. The client initially relies on SSE to receive the stream.
Approach 2 (JSON Response): The server returns a JSON response immediately and streams entirely through Streamstraight. The client receives all tokens from Streamstraight from the start.
In both approaches, when the client reconnects after an interruption, useChat will automatically try to reconnect to an existing stream if resume: true. If there is a message stream currently in-progress, Streamstraight will provide that stream to the client from the beginning.
Note that AI SDK’s implementation does not allow for fetching a stream that has already completed. Your server must detect when the LLM generation has finished, and let the client know on mount that there is no active stream.
The diagram above shows Approach 1 (SSE Response). For Approach 2 (JSON
Response), the flow is similar except the server returns a JSON response immediately
(instead of an SSE stream) and the client receives all tokens through Streamstraight
from the start.
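One way to let the client know that no stream is active is to reuse the currentlyStreaming flag that the route handlers above persist with saveChat and pass it down when the page loads. A sketch, assuming your loader provides chatData.currentlyStreaming as in the client example above:

const { messages, status, error, sendMessage } = useChat({
  id: chatData.id,
  messages: chatData.messages,
  transport: new StreamstraightChatTransport({
    fetchToken: fetchStreamstraightToken,
    api: "/api/chat",
  }),
  // Only attempt to resume when the server last marked a stream as active.
  resume: chatData.currentlyStreaming,
});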
Limitations with AI SDK
While AI SDK is flexible, there are a few limitations you should be aware of if you decide to rely on AI SDK as your LLM abstraction:
Without Streamstraight, streams resume twice in development mode
Without Streamstraight, useChat({ resume: true }) will resume streams twice in development mode. This is because useChat uses a useEffect to reconnect to the stream on mount when resume: true, but does not properly clean up the side effect. As a result, when run in React Strict mode (on by default in development mode), the stream is resumed twice.
Unlike most frameworks, Next.js only turns on React Strict Mode by default starting in
13.5.1.
If you’re not seeing this issue without Streamstraight’s transport, try setting
reactStrictMode to true in next.config.ts.
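For example, in next.config.ts:

import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  // Double-mounts components in development, which surfaces the
  // double-resume behavior described here.
  reactStrictMode: true,
};

export default nextConfig;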
React’s useEffect requires that side effects be cleaned up when a component unmounts. To help catch missing cleanup, React Strict Mode mounts each component twice in development. Because useChat does not clean up its resume side effect, the stream is resumed twice, but only in development.
Streamstraight fixes this by tracking the active stream per chat ID. When Strict
Mode replays the effect, our transport closes the first socket before opening a second
one, so only a single stream remains in-flight.
Unique streams are not uniquely identified
You may have noticed that in this implementation, we use the chat ID as the unique stream ID. This is because AI SDK generates a unique ID for each LLM generation only after the generation has fully finished. Because Streamstraight requires a unique stream ID before the stream begins, we must use the chat ID as the unique stream ID. There is very little downside to this approach as it works with useChat, but it does mean that streams cannot be replayed after the stream has ended.
This issue is not caused by Streamstraight, but by AI SDK’s current implementation (v5).
We do have a few customers using our AI SDK integration with unique stream IDs. If you’re interested in joining the beta, please let us know!