Lambda Response Streaming Grows Up: 200 MB Payloads and What That Means for Serverless APIs

If downloading a whole album before hearing the first note feels outdated, buffering an entire HTTP response before sending a single byte does too. That’s why response streaming in AWS Lambda has quietly become one of the most useful patterns in serverless: you can start delivering bytes to a client as soon as they’re ready—no more waiting for the full payload to be built. The big recent change? As of July 31, 2025, Lambda’s response streaming now supports up to 200 MB per response, a 10x increase over the prior limit. That upgrade expands what you can feasibly serve straight from a function without detouring through S3 or another store. (aws.amazon.com)

What changed, exactly?

In short: the cap on a single streamed response rose from 20 MB to 200 MB. The streaming programming model itself is unchanged; what changed is how much you can push through it. (aws.amazon.com)

Why this matters for modern serverless

The upshot: more use cases can live entirely in Lambda again, reducing architectural glue and cutting trips to intermediary storage services. (aws.amazon.com)

A minimal Node.js streaming handler

Lambda's managed Node.js runtimes expose a global helper, awslambda.streamifyResponse. It wraps your handler and hands it a writable stream you can push bytes into. The safest pattern is to feed that stream with pipeline so backpressure is handled and you don't overwhelm downstream consumers.

// index.mjs (Node.js 18+)
import { pipeline } from 'node:stream/promises';
import { Readable } from 'node:stream';

/* global awslambda */
export const handler = awslambda.streamifyResponse(async (event, responseStream, _context) => {
  // Turn something into a readable stream — here, we just echo the event:
  const input = Readable.from(Buffer.from(JSON.stringify(event)));

  // Pipe it straight to the client; pipeline() handles backpressure for you.
  await pipeline(input, responseStream);
});

This mirrors AWS’s recommended approach: the responseStream is a standard Node writable stream, and pipeline handles backpressure correctly. (docs.aws.amazon.com)
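The same pipeline pattern extends naturally to incremental output, for example Server-Sent Events frames produced by an async generator. A minimal local sketch, where a PassThrough sink stands in for Lambda's responseStream and toSse is our own illustrative helper, not a Lambda API:

```javascript
// Local sketch of the pipeline pattern with incremental chunks.
import { Readable, PassThrough } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// Format one message as a Server-Sent Events 'data:' frame.
export function toSse(message) {
  return `data: ${JSON.stringify(message)}\n\n`;
}

// Each value yielded by the generator becomes one chunk on the wire.
async function* tokens() {
  for (const word of ['hello', 'streaming', 'world']) {
    yield toSse({ token: word });
  }
}

const sink = new PassThrough(); // in Lambda this would be responseStream
let body = '';
sink.on('data', (buf) => { body += buf; });

await pipeline(Readable.from(tokens()), sink);
console.log(body.split('\n\n').filter(Boolean).length); // 3 frames
```

In a real handler you would replace the sink with the responseStream that streamifyResponse provides; everything else stays the same.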

Enabling streaming on a Function URL

To stream over plain HTTPS, attach a Function URL and set its invoke mode to RESPONSE_STREAM. Here’s the relevant CloudFormation/SAM snippet:

Resources:
  StreamingFunction:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: nodejs18.x
      Handler: index.handler
      Role: !GetAtt StreamingFunctionRole.Arn  # execution role resource not shown
      Code: ./dist
      MemorySize: 1024
      Timeout: 30

  StreamingFunctionUrl:
    Type: AWS::Lambda::Url
    Properties:
      TargetFunctionArn: !Ref StreamingFunction
      AuthType: AWS_IAM         # Prefer IAM or CloudFront protection
      InvokeMode: RESPONSE_STREAM

RESPONSE_STREAM configures the Function URL to call InvokeWithResponseStream under the hood, enabling progressive delivery and the 200 MB streaming limit. (docs.aws.amazon.com)
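One practical wrinkle in RESPONSE_STREAM mode: the status code and headers must be attached before the first body byte, via awslambda.HttpResponseStream.from(). A sketch, where the optional-chaining guard exists only so the module also loads outside the Lambda runtime (the awslambda global is provided by the managed Node.js runtime):

```javascript
// Sketch: set status and headers before streaming the body.
import { pipeline } from 'node:stream/promises';
import { Readable } from 'node:stream';

export function sseMetadata() {
  // Metadata must be attached before the first body byte is written.
  return {
    statusCode: 200,
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
    },
  };
}

export const handler = globalThis.awslambda?.streamifyResponse(
  async (event, responseStream) => {
    // Wrap the raw stream so the status line and headers go out first.
    const wrapped = awslambda.HttpResponseStream.from(
      responseStream,
      sseMetadata(),
    );
    await pipeline(Readable.from(['data: ready\n\n']), wrapped);
  },
);
```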

A simple browser client to consume the stream

Most modern HTTP clients can read streamed bodies incrementally. In the browser, use the Web Streams API:

async function readStream(url, awsSigv4Headers) {
  const res = await fetch(url, { headers: awsSigv4Headers });
  const reader = res.body.getReader();
  const decoder = new TextDecoder();

  let buffered = '';
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    buffered += decoder.decode(value, { stream: true });
    // For Server-Sent Events, you could parse 'data:' lines here and update the UI
    console.log('Received so far:', buffered.length, 'bytes');
  }
  buffered += decoder.decode(); // flush any trailing multi-byte sequence
  return buffered;
}

If your HTTP client buffers until the connection closes, you won’t see the benefits; pick a client that surfaces data incrementally. (aws.amazon.com)
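To act on those 'data:' lines as chunks arrive, rather than after the loop finishes, a small incremental parser helps. A sketch, where makeSseParser is our own helper (not part of any SDK) that buffers partial frames between calls:

```javascript
// Feed decoded chunks in; get back complete SSE 'data:' payloads,
// keeping any incomplete trailing frame buffered for the next call.
export function makeSseParser() {
  let tail = '';
  return function push(chunk) {
    tail += chunk;
    const events = [];
    let idx;
    // Each SSE event ends with a blank line ('\n\n').
    while ((idx = tail.indexOf('\n\n')) !== -1) {
      const frame = tail.slice(0, idx);
      tail = tail.slice(idx + 2);
      for (const line of frame.split('\n')) {
        if (line.startsWith('data:')) events.push(line.slice(5).trim());
      }
    }
    return events;
  };
}

const push = makeSseParser();
console.log(push('data: hel'));             // [] (incomplete frame stays buffered)
console.log(push('lo\n\ndata: world\n\n')); // ['hello', 'world']
```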

Production patterns that work well

Performance notes (and a reality check)

Where streaming fits—and where it doesn’t

Great fits:

- AI/LLM chat responses, where tokens should reach the user as they are generated
- Progressive rendering: sending the head of a page while the rest is still being built
- Biggish downloads (reports, exports, media) that now fit under the 200 MB ceiling

Not ideal:

- Responses larger than 200 MB, which still belong in S3 or another store
- Sustained high-throughput transfers, since bandwidth is throttled to 2 MB/s after the first 6 MB

Putting it together: a pragmatic recipe

1) Start with a Node.js streaming handler using streamifyResponse and pipeline. Keep each chunk meaningful (e.g., SSE lines, JSON Lines). (docs.aws.amazon.com)
2) Attach a Function URL, set InvokeMode: RESPONSE_STREAM. Test locally with curl or a browser to confirm you see incremental chunks. (docs.aws.amazon.com)
3) Put CloudFront in front for custom domains, caching, WAF, and to keep streaming behavior intact. Use Origin Access Control or other restrictions to limit direct access to the Function URL. (aws.amazon.com)
4) If your Lambda must be in a VPC, call it via InvokeWithResponseStream using the AWS SDK through a VPC endpoint—Function URLs won’t stream from inside the VPC. (docs.aws.amazon.com)
5) Monitor costs and user-perceived latency. Streaming is about faster “feels fast” more than faster total bytes delivered; make sure your UI shows progress as chunks arrive. (docs.aws.amazon.com)
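For step 4, the SDK-side call can be sketched as below. The function name and region are placeholders, and @aws-sdk/client-lambda is loaded lazily so the pure decodeChunk helper stays importable without the SDK installed:

```javascript
// Consume a streamed invoke from Node via the AWS SDK v3,
// e.g. over a Lambda VPC interface endpoint.
export function decodeChunk(event, decoder = new TextDecoder()) {
  // PayloadChunk events carry the raw response bytes; other events
  // (such as InvokeComplete) carry no body.
  return event.PayloadChunk
    ? decoder.decode(event.PayloadChunk.Payload, { stream: true })
    : '';
}

export async function streamInvoke(functionName, payload, onChunk) {
  const { LambdaClient, InvokeWithResponseStreamCommand } =
    await import('@aws-sdk/client-lambda');
  const client = new LambdaClient({ region: 'us-east-1' }); // placeholder region
  const { EventStream } = await client.send(
    new InvokeWithResponseStreamCommand({
      FunctionName: functionName, // placeholder
      Payload: JSON.stringify(payload),
    }),
  );
  const decoder = new TextDecoder(); // one decoder so split multi-byte chars survive
  for await (const event of EventStream) {
    const text = decodeChunk(event, decoder);
    if (text) onChunk(text); // deliver each chunk as it arrives
  }
}
```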

The bottom line

Lambda’s bump to 200 MB for response streaming meaningfully stretches the surface area of what you can serve straight from a function. It keeps your architecture simple for AI chats, progressive rendering, and biggish downloads, without detouring through extra storage and orchestration. Know your limits (bandwidth is uncapped for the first 6 MB, then throttled to 2 MB/s), pick the right front door (Function URL, often behind CloudFront), and lean on Node’s stream patterns to keep backpressure in check. With those pieces in place, your serverless app can “hit play” sooner—and feel snappier—without overcomplicating the stack. (aws.amazon.com)
