Streaming without blocking the HTTP response

In most AI chat implementations, the user triggers an AI response via a POST request to a server/route handler. With Server-Sent Events, the HTTP response itself is the stream. With Streamstraight, the response stream is handed to Streamstraight entirely within the server route handler. By default, the route handler will not return the HTTP response until the entire stream has completed. To return the HTTP response immediately and continue the stream in the background, you must process the stream asynchronously or in a background job.
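
For reference, this is roughly what the default (blocking) pattern looks like; the sections below show how to avoid it. A minimal sketch, assuming the same streamstraightServer API and OpenAI client setup as the examples later on this page:

// Default (blocking) pattern: awaiting the stream in the handler means the
// HTTP response is not sent until streaming has finished.
export async function POST(request: Request) {
  const { messageId } = await request.json(); // illustrative; however you derive the stream ID

  const llmResponseStream = await openai.responses.stream({ ... });

  const ssServer = await streamstraightServer(
    { apiKey: process.env.STREAMSTRAIGHT_API_KEY! },
    { streamId: messageId },
  );

  // This await blocks the HTTP response until the entire stream completes
  await ssServer.stream(llmResponseStream);

  return Response.json({ messageId });
}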

Processing the response stream in a background job

It’s generally good practice to move long-running work to a background job to avoid timeouts and improve the user experience, and Streamstraight encourages this pattern. Depending on your programming language, web framework, and hosting provider, you can use any task queue system that supports background processing to handle the streaming asynchronously, for example Trigger.dev, Inngest, BullMQ (Node.js), or Celery (Python).
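
As a sketch of what this can look like, here is one possible shape using BullMQ; the queue name, job payload, and worker wiring are illustrative and not part of Streamstraight:

import { Queue, Worker } from "bullmq";

const connection = { host: "localhost", port: 6379 };
const streamQueue = new Queue("llm-streams", { connection });

// Route handler: enqueue the work and return immediately
export async function POST(request: Request) {
  const { messageId, prompt } = await request.json();
  await streamQueue.add("stream-llm-response", { messageId, prompt });
  return Response.json({ messageId });
}

// Worker (separate process): run the LLM call and pipe the stream to Streamstraight
new Worker(
  "llm-streams",
  async (job) => {
    const { messageId, prompt } = job.data;

    // Build the LLM request from the job payload
    const llmResponseStream = await openai.responses.stream({ ... });

    const ssServer = await streamstraightServer(
      { apiKey: process.env.STREAMSTRAIGHT_API_KEY! },
      { streamId: messageId },
    );

    await ssServer.stream(llmResponseStream);
  },
  { connection },
);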

Processing the response stream asynchronously in your server

Most web frameworks support continuing async work after the route handler has sent the HTTP response. How you accomplish this depends on the web framework you use and on whether you run a long-running server or serverless functions. Here are some examples, starting with Next.js:
import { after } from 'next/server' // Next 15+

export async function POST(request: Request) {
  // ...

  const llmResponseStream = await openai.responses.stream({ ... });

  const ssServer = await streamstraightServer(
    { apiKey: process.env.STREAMSTRAIGHT_API_KEY },
    { streamId: messageId },
  );

  // Next.js requires you to wrap async work in `after` to
  // continue running it after the response is sent
  after(async () => {
    await ssServer.stream(llmResponseStream);
  });

  return Response.json({ messageId });
}
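
On a long-running server (for example Express), you typically don’t need a helper like after: start the streaming promise and send the response without awaiting it. A sketch, assuming the same streamstraightServer API as above:

import express from "express";

const app = express();
app.use(express.json());

app.post("/api/chat", async (req, res) => {
  const { messageId } = req.body; // illustrative; however you derive the stream ID

  const llmResponseStream = await openai.responses.stream({ ... });

  const ssServer = await streamstraightServer(
    { apiKey: process.env.STREAMSTRAIGHT_API_KEY! },
    { streamId: messageId },
  );

  // Fire and forget: don't await, but handle failures so they aren't silent
  ssServer.stream(llmResponseStream).catch((err) => {
    console.error("Streaming failed", err);
  });

  // The HTTP response returns immediately while the stream continues
  res.json({ messageId });
});

app.listen(3000);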

Typing chunks on the client

The Streamstraight client SDK lets you specify the type of each chunk streamed from the server. Simply pass the type as the generic parameter:
await connectStreamstraightClient<MyChunkTypeHere>(
  { fetchToken: async () => jwtToken },
  { streamId: "your-stream-id" },
);
If you’re streaming directly from the OpenAI Responses API, the chunks will have type OpenAI.Responses.ResponseStreamEvent. Alternatively, process the LLM stream into a custom format on your server and pass that type in here instead.
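
A sketch of the custom-format approach, assuming ssServer.stream accepts any async iterable of chunks (the chunk type and mapping here are hypothetical):

import OpenAI from "openai";

// A custom chunk shape; any JSON-serializable type works
type MyChunk =
  | { kind: "text-delta"; text: string }
  | { kind: "done" };

// Server: map OpenAI stream events into MyChunk before handing them to Streamstraight
async function* toMyChunks(
  events: AsyncIterable<OpenAI.Responses.ResponseStreamEvent>,
): AsyncGenerator<MyChunk> {
  for await (const event of events) {
    if (event.type === "response.output_text.delta") {
      yield { kind: "text-delta", text: event.delta };
    }
  }
  yield { kind: "done" };
}

await ssServer.stream(toMyChunks(llmResponseStream));

// Client: chunks now arrive typed as MyChunk
await connectStreamstraightClient<MyChunk>(
  { fetchToken: async () => jwtToken },
  { streamId: "your-stream-id" },
);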

Custom chunk encoders/decoders

Streamstraight JSON-encodes each chunk before sending it over WebSockets and storing it, then decodes it on the client. You can pass in custom encoder and decoder functions; just make sure they encode to and decode from a string.
const ssServer = await streamstraightServer(
  { apiKey: process.env.STREAMSTRAIGHT_API_KEY! },
  {
    streamId: "your-stream-id",
    encoder: (chunk) => JSON.stringify(chunk),
  },
);
await ssServer.stream(stream);
A chunk will not be created if the encoder returns an empty string.
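
For example, you can use this to filter out chunks you don’t want to store or forward, and mirror a custom encoder with a decoder on the client. A sketch; the client-side decoder option is an assumption mirroring the server-side encoder, and the event-type filter assumes raw OpenAI stream events:

// Server: only keep text deltas; returning "" skips the chunk entirely
const ssServer = await streamstraightServer(
  { apiKey: process.env.STREAMSTRAIGHT_API_KEY! },
  {
    streamId: "your-stream-id",
    encoder: (chunk) =>
      chunk.type === "response.output_text.delta" ? JSON.stringify(chunk) : "",
  },
);
await ssServer.stream(stream);

// Client: decode each string back into an object before it reaches your handlers
await connectStreamstraightClient<OpenAI.Responses.ResponseStreamEvent>(
  { fetchToken: async () => jwtToken },
  {
    streamId: "your-stream-id",
    decoder: (encoded) => JSON.parse(encoded), // assumed counterpart to the server-side encoder
  },
);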