Overview
Streaming is crucial for enhancing the responsiveness of applications built on LLMs. By displaying output progressively, even before a complete response is ready, streaming significantly improves user experience (UX), particularly when dealing with the latency of LLMs.

Stream from an agent
LangChain’s streaming system lets you surface live feedback from agent runs to your application. There are three main categories of data you can stream:
- Agent progress — get state updates after each agent step.
- LLM tokens — stream language model tokens as they’re generated.
- Custom updates — emit user-defined signals (e.g., “Fetched 10/100 records”).
Agent progress
To stream agent progress, use the `stream()` or `astream()` methods with `stream_mode="updates"`. This emits an event after every agent step.
For example, if you have an agent that calls a tool once, you should see the following updates:
- LLM node: AI message with tool call requests
- Tool node: Tool message with execution result
- LLM node: Final AI response
Streaming agent progress
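A minimal sketch of what this looks like, assuming an agent built with `create_agent` and a hypothetical `get_weather` tool (the model identifier is illustrative):

```python
from langchain.agents import create_agent


def get_weather(city: str) -> str:
    """Get the weather for a given city."""
    return f"It's always sunny in {city}!"


agent = create_agent("openai:gpt-4o-mini", tools=[get_weather])

for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What's the weather in SF?"}]},
    stream_mode="updates",
):
    # Each chunk maps the node that just ran to its state update,
    # e.g. {"model": {"messages": [...]}} or {"tools": {"messages": [...]}}.
    print(chunk)
```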
LLM tokens
To stream tokens as they are produced by the LLM, use `stream_mode="messages"`. This streams both tool-call tokens and the final response as the model generates them.
Streaming LLM tokens
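A minimal sketch under the same assumptions as above; with `stream_mode="messages"`, each item in the stream is a `(message_chunk, metadata)` tuple:

```python
from langchain.agents import create_agent


def get_weather(city: str) -> str:
    """Get the weather for a given city."""
    return f"It's always sunny in {city}!"


agent = create_agent("openai:gpt-4o-mini", tools=[get_weather])

for token, metadata in agent.stream(
    {"messages": [{"role": "user", "content": "What's the weather in SF?"}]},
    stream_mode="messages",
):
    # `token` is a message chunk; `metadata` records which node produced it.
    if token.content:
        print(token.content, end="", flush=True)
```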
Custom updates
To stream updates from tools as they are executed, you can use `get_stream_writer`.
Streaming custom updates
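A minimal sketch under the same assumptions; `get_stream_writer` comes from `langgraph.config`, and whatever you pass to the writer is surfaced on the `"custom"` stream:

```python
from langchain.agents import create_agent
from langgraph.config import get_stream_writer


def get_weather(city: str) -> str:
    """Get the weather for a given city."""
    writer = get_stream_writer()
    # Emit user-defined progress updates while the tool runs.
    writer(f"Looking up data for city: {city}")
    writer(f"Acquired data for city: {city}")
    return f"It's always sunny in {city}!"


agent = create_agent("openai:gpt-4o-mini", tools=[get_weather])

for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What's the weather in SF?"}]},
    stream_mode="custom",
):
    print(chunk)
```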
If you add `get_stream_writer` inside your tool, you won’t be able to invoke the tool outside of a LangGraph execution context.

Stream multiple modes
You can specify multiple streaming modes by passing them as a list: `stream_mode=["updates", "custom"]`.
Streaming multiple modes
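A minimal sketch under the same assumptions; when `stream_mode` is a list, each streamed item is a `(mode, payload)` tuple so you can tell the modes apart:

```python
from langchain.agents import create_agent
from langgraph.config import get_stream_writer


def get_weather(city: str) -> str:
    """Get the weather for a given city."""
    get_stream_writer()(f"Looking up data for city: {city}")
    return f"It's always sunny in {city}!"


agent = create_agent("openai:gpt-4o-mini", tools=[get_weather])

for mode, chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What's the weather in SF?"}]},
    stream_mode=["updates", "custom"],
):
    # `mode` is "updates" or "custom", letting you route each payload.
    print(f"[{mode}] {chunk}")
```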