Vercel’s AI SDK v5 is a popular framework designed to help teams build with AI. It is an abstraction that allows users to switch between different LLM providers and offers a pluggable way to define tools, agents, and transports.
This guide is only for those using AI SDK’s useChat abstraction. If you are not using useChat, you do not need to follow this guide.

How Streamstraight integrates with AI SDK

Streamstraight extends AI SDK’s native resume stream functionality by providing a custom transport. By using Streamstraight, you will be able to:
  • Automatically resume an in-flight stream created through useChat if the client reconnects to your server after a disconnect
  • Process your LLM generation stream on the server in addition to sending it to the client
With Streamstraight, you will be able to retrieve the entire contents of the stream even after the stream has ended. AI SDK’s resumable-stream implementation returns no data if the client reconnects after the stream has ended.

Implementation

1. Enable stream resumption on the client

We reuse AI SDK’s resume option in useChat to enable stream resumption. If useChat is mounted when an existing stream is active, Streamstraight will replay and tail that stream from the beginning. If it is mounted when the existing stream has ended or is not active, nothing will happen.
React
import { useChat } from "@ai-sdk/react";
import { StreamstraightChatTransport } from "@streamstraight/client";
import type { UIMessage } from "ai";
import { useState } from "react";

async function fetchStreamstraightToken(): Promise<string> {
  const response = await fetch("/api/streamstraight-token", { method: "POST" });
  if (!response.ok) throw new Error("Failed to fetch Streamstraight token");

  const { jwtToken } = (await response.json()) as { jwtToken: string };
  return jwtToken;
}

export function ChatComponent({
  chatData,
}: {
  chatData: { id: string; messages: UIMessage[]; currentlyStreaming: boolean };
}) {
  const [textInput, setTextInput] = useState<string>("");

  const { messages, status, error, sendMessage, resumeStream } = useChat({
    id: chatData.id,
    messages: chatData.messages,
    transport: new StreamstraightChatTransport({
      fetchToken: fetchStreamstraightToken,
      api: "/api/chat", // Or wherever your server route handler is located
    }),
    // Set this to true. When useChat is mounted, it will try to resume
    // the stream if one is in-progress for this chatId.
    resume: true,
  });

  return (
    <div>
      {/* Your chat UI */}
      <input value={textInput} onChange={(e) => setTextInput(e.target.value)} />
      <button onClick={() => sendMessage({ text: textInput })}>Send</button>
    </div>
  );
}
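
The client above expects a /api/streamstraight-token route that authenticates the caller and returns a short-lived Streamstraight JWT as { jwtToken }. How you mint that token depends on the @streamstraight/server API and your auth setup; the sketch below uses a hypothetical mintClientToken stub as a stand-in, not a real export.
app/api/streamstraight-token/route.ts
// Sketch only: replace `mintClientToken` with the token-minting call provided
// by your Streamstraight server SDK. This stub exists to keep the sketch self-contained.
declare function mintClientToken(apiKey: string): Promise<string>;

export async function POST() {
  // Authenticate the caller here (session, cookie, etc.) before issuing a token.
  const jwtToken = await mintClientToken(process.env.STREAMSTRAIGHT_API_KEY!);

  // The response shape must match what fetchStreamstraightToken expects on the client.
  return Response.json({ jwtToken });
}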

2. Set up your server route handler

There are two ways to implement the server route handler. Choose the approach that best fits your needs:
  • Approach 1 (Recommended): Return an SSE stream directly from your POST handler
  • Approach 2: Return a JSON response and stream to Streamstraight in an async job
Approach 1: SSE Stream Response (Recommended)

useChat calls a POST server route handler, which calls the streamText function. By default, this handler is located at /api/chat and returns a message stream response.
app/api/chat/route.ts
import { openai } from "@ai-sdk/openai";
import { streamstraightServer } from "@streamstraight/server";
import { readChat, saveChat } from "@util/chat-store";
import { convertToModelMessages, generateId, streamText, type UIMessage } from "ai";

// Defined in https://github.com/vercel/ai/blob/61a16427ad1838d274fe2718ace3e3afc5f83d58/packages/ai/src/ui/http-chat-transport.ts#L183
interface ApiChatData {
  /** Unique identifier for the chat session */
  id: string;
  /** Array of UI messages representing the conversation history */
  messages: Array<UIMessage>;
  /** ID of the message to regenerate, or undefined for new messages */
  messageId?: string;
  /** The type of message submission - either new message or regeneration */
  trigger: "submit-message" | "regenerate-message";
  // Only present when you update prepareSendMessagesRequest
  message?: UIMessage;
}

export async function POST(req: Request) {
  const requestData = (await req.json()) as ApiChatData;
  if (!requestData.messages) {
    return new Response(JSON.stringify({ error: "Invalid request payload" }), {
      status: 400,
      headers: { "Content-Type": "application/json" },
    });
  }

  const chat = await readChat(requestData.id);
  // requestData.message is only present when you configure prepareSendMessagesRequest;
  // otherwise fall back to the full message list sent by the client.
  const messages = requestData.message
    ? [...chat.messages, requestData.message]
    : requestData.messages;

  // Save the user message and mark the stream as active
  await saveChat({ id: requestData.id, messages, currentlyStreaming: true });

  const result = streamText({
    model: openai("gpt-4.1-mini"),
    messages: convertToModelMessages(messages),
  });

  return result.toUIMessageStreamResponse({
    generateMessageId: generateId,
    onFinish: async ({ isAborted, isContinuation, messages, responseMessage }) => {
      // Persist the final messages however you store chats and mark the stream finished
      await saveChat({ id: requestData.id, messages, currentlyStreaming: false });
    },

    // This method allows you to consume the SSE stream however you want, in addition
    // to returning it from this POST endpoint to the client. Here we consume the
    // SSE stream by sending it to Streamstraight.
    async consumeSseStream({ stream }) {
      const server = await streamstraightServer(
        { apiKey: process.env.STREAMSTRAIGHT_API_KEY!, baseUrl: API_ORIGIN },
        {
          // Use chatId as the unique streamId. This means that Streamstraight will keep
          // at most a single stream for each chat. Subsequent streams will overwrite
          // previous ones in the same chat.
          // We must do it this way because AI SDK does not generate a unique messageId
          // until after the message has been fully generated.
          streamId: requestData.id,
          // This must be true for AI SDK, because we are unique on chatId
          overwriteExistingStream: true,
        },
      );

      await server.stream(stream);
    },
  });
}
This approach is simpler and recommended for most use cases. The stream is returned directly to the client via SSE while simultaneously being sent to Streamstraight for use during resumptions.

Approach 2: JSON Response with Async Job

This approach returns a JSON response immediately and streams to Streamstraight in a background job. This is useful when you want to return metadata immediately and initiate streaming from an async process. First, update your transport configuration on the client to always read from Streamstraight:
React
interface MyJsonResponse { ... }

const { messages, status, error, sendMessage, resumeStream } = useChat({
  id: chatData.id,
  messages: chatData.messages,
  transport: new StreamstraightChatTransport<UIMessage, MyJsonResponse>({
    fetchToken: fetchStreamstraightToken,
    api: "/api/chat",
    initialStreamSource: "streamstraight",
    initialHttpResponseType: "json",
    onInitialHttpJsonResponse: (json: MyJsonResponse) => {
      // optional; do something with the HTTP response here
    },
  }),
  // Set this to true. When useChat is mounted, it will try to resume
  // the stream if one is in-progress for this chatId.
  resume: true,
});
Then implement your server route handler:
app/api/chat/route.ts
import { openai } from "@ai-sdk/openai";
import { streamstraightServer } from "@streamstraight/server";
import { readChat, saveChat } from "@util/chat-store";
import { convertToModelMessages, generateId, streamText, type UIMessage } from "ai";
import { after } from "next/server";

interface ApiChatData {
  id: string;
  messages: Array<UIMessage>;
  messageId?: string;
  trigger: "submit-message" | "regenerate-message";
  // Only present when you update prepareSendMessagesRequest
  message?: UIMessage;
}

export async function POST(req: Request) {
  const requestData = (await req.json()) as ApiChatData;
  if (!requestData.messages) {
    return new Response(JSON.stringify({ error: "Invalid request payload" }), {
      status: 400,
      headers: { "Content-Type": "application/json" },
    });
  }

  const chat = await readChat(requestData.id);
  // requestData.message is only present when you configure prepareSendMessagesRequest;
  // otherwise fall back to the full message list sent by the client.
  const messages = requestData.message
    ? [...chat.messages, requestData.message]
    : requestData.messages;

  // Save the user message and mark the stream as active
  await saveChat({ id: requestData.id, messages, currentlyStreaming: true });

  const result = streamText({
    model: openai("gpt-4.1-mini"),
    messages: convertToModelMessages(messages),
  });

  const uiMessageStream = result.toUIMessageStream({
    generateMessageId: () => `message-server-${generateId()}`,
    onFinish: async ({ responseMessage }) => {
      // Include the generated assistant message when marking the stream finished
      await saveChat({
        id: requestData.id,
        messages: [...messages, responseMessage],
        currentlyStreaming: false,
      });
    },
  });

  // Stream to Streamstraight in a background job
  // Use your framework's async job handler (e.g., Next.js's after(), BullMQ, etc.)
  after(async () => {
    try {
      const server = await streamstraightServer(
        { apiKey: process.env.STREAMSTRAIGHT_API_KEY!, baseUrl: API_ORIGIN },
        {
          streamId: requestData.id,
          overwriteExistingStream: true,
        },
      );

      await server.stream(uiMessageStream);
    } catch (error) {
      console.error("[chat] failed to forward stream", { chatId: requestData.id, error });
    }
  });

  // Return JSON response immediately
  return Response.json(
    { status: "streaming", streamId: requestData.id },
    { status: 202 },
  );
}
This approach allows you to return metadata immediately while streaming happens in the background. The client will receive all tokens through Streamstraight. Note that you can transform the LLM generation into any custom data type, and stream it with Streamstraight.
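For example, you could wrap each UI message chunk in a custom envelope before handing the stream to Streamstraight inside the background job. This is only a sketch: uiMessageStream and server refer to the variables in the Approach 2 handler above, and the envelope shape is arbitrary, not a required format.
// Sketch: transform UI message chunks into a custom payload before streaming.
// `uiMessageStream` and `server` come from the Approach 2 handler above.
const customStream = uiMessageStream.pipeThrough(
  new TransformStream({
    transform(chunk, controller) {
      controller.enqueue({ type: "chat-delta", receivedAt: Date.now(), chunk });
    },
  }),
);

await server.stream(customStream);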

How it works

The streaming flow works differently depending on which approach you choose:
  • Approach 1 (SSE Response): The server LLM stream is split into two copies: one is returned to the client through Server-Sent Events, and the other is passed to Streamstraight. The client initially relies on SSE to receive the stream.
  • Approach 2 (JSON Response): The server returns a JSON response immediately and streams entirely through Streamstraight. The client receives all tokens from Streamstraight from the start.
In both approaches, when the client reconnects after an interruption, useChat will automatically try to reconnect to an existing stream if resume: true. If there is a message stream currently in progress, Streamstraight will provide that stream to the client from the beginning. Note that AI SDK’s implementation does not allow for fetching a stream that has already completed: your server must detect when the LLM generation has finished and let the client know on mount that there is no active stream.
The diagram above shows Approach 1 (SSE Response). For Approach 2 (JSON Response), the flow is similar except the server returns a JSON response immediately (instead of an SSE stream) and the client receives all tokens through Streamstraight from the start.
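One way to let the client know there is no active stream is to drive useChat’s resume option from the currentlyStreaming flag that the route handlers above persist with saveChat, so the client only attempts a resume when the server still considers the generation active. A minimal sketch, assuming your page loads chatData from the same store:
React
const { messages, status, error, sendMessage } = useChat({
  id: chatData.id,
  messages: chatData.messages,
  transport: new StreamstraightChatTransport({
    fetchToken: fetchStreamstraightToken,
    api: "/api/chat",
  }),
  // Only attempt a resume when the server marked this chat as actively streaming.
  resume: chatData.currentlyStreaming,
});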

Limitations with AI SDK

While AI SDK is flexible, there are a few limitations you should be aware of if you decide to rely on AI SDK as your LLM abstraction:

Without Streamstraight, streams resume twice in development mode

Without Streamstraight, useChat({ resume: true }) will resume streams twice in development mode. This is because useChat uses a useEffect to reconnect to the stream on mount when resume: true, but does not properly clean up the side effect. As a result, when run in React Strict mode (on by default in development mode), the stream is resumed twice.
Unlike most frameworks, Next.js only turns on React Strict Mode by default starting in 13.5.1. If you’re not seeing this issue without Streamstraight’s transport, try setting reactStrictMode to true in next.config.ts.
React’s useEffect requires that side effects be cleaned up on unmount, and to surface missing cleanups, Strict Mode mounts each component twice in development. Because useChat does not clean up this effect, it reconnects to the stream on both mounts. Streamstraight fixes this by tracking the active stream per chat ID: when Strict Mode replays the effect, our transport closes the first socket before opening a second one, so only a single stream remains in flight.
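As an illustration of the general pattern (this is not Streamstraight’s internal code), the difference comes down to whether the effect returns a cleanup; the connect function and Connection type below are hypothetical:
React
import { useEffect } from "react";

// Hypothetical connection type used only for this illustration.
type Connection = { close: () => void };

function useResumeOnMount(chatId: string, connect: (chatId: string) => Connection) {
  useEffect(() => {
    // In development, Strict Mode mounts, unmounts, and remounts the component,
    // so this effect runs twice. Without the cleanup below, both connections
    // stay open and the stream is consumed twice.
    const connection = connect(chatId);

    // Returning a cleanup lets React close the first connection before the
    // replayed effect opens the second, so only one stream stays in flight.
    return () => connection.close();
  }, [chatId, connect]);
}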

Unique streams are not uniquely identified

You may have noticed that in this implementation, we use the chat ID as the unique stream ID. This is because AI SDK generates a unique ID for each LLM generation only after the generation has fully finished. Because Streamstraight requires a unique stream ID before the stream begins, we must use the chat ID instead. There is very little downside to this approach as it works with useChat, but it does mean that a stream cannot be replayed after it has ended.
This issue is not caused by Streamstraight, but by AI SDK’s current implementation (v5).
We do have a few customers using our AI SDK integration with unique stream IDs. If you’re interested in joining the beta, please let us know!