How do I configure checkpointing in LangGraph?

Last updated: October 9, 2025

Context

To manage thread state persistence and control data retention in LangGraph, you need to know which checkpointing options are available for each deployment model and how to configure Time-to-Live (TTL) settings.

Answer

LangGraph provides different checkpointing options depending on your deployment model:

Cloud SaaS / Managed Platform

For the managed LangGraph platform:

Checkpointing is automatically handled by the platform's managed checkpointer
Custom checkpointers are not supported
Configure TTL settings in langgraph.json to manage data retention
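As a rough illustration of the TTL setting, the sketch below builds a TTL block in Python and prints it as JSON that could be pasted into langgraph.json. The key names (`checkpointer.ttl`, `strategy`, `sweep_interval_minutes`, `default_ttl`), the 30-day value, and the `base` file contents are assumptions made for illustration; check them against the configuration reference for your platform version before relying on them.

```python
import json

# Assumption: default_ttl is expressed in minutes; 30 days is an example value.
THIRTY_DAYS_IN_MINUTES = 30 * 24 * 60

# Assumed key names for the checkpointer TTL block in langgraph.json.
ttl_config = {
    "checkpointer": {
        "ttl": {
            "strategy": "delete",          # delete expired checkpoints
            "sweep_interval_minutes": 60,  # how often expired data is swept
            "default_ttl": THIRTY_DAYS_IN_MINUTES,
        }
    }
}


def merge_ttl(existing: dict, ttl: dict) -> dict:
    """Return a copy of an existing langgraph.json mapping with the TTL block added."""
    merged = dict(existing)
    merged.update(ttl)
    return merged


# Hypothetical existing langgraph.json content.
base = {"dependencies": ["."], "graphs": {"agent": "./agent.py:graph"}}

langgraph_json = merge_ttl(base, ttl_config)
print(json.dumps(langgraph_json, indent=2))
```

Generating the block this way is only a convenience for keeping the TTL value readable; editing langgraph.json by hand works equally well.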

Self-Hosted Container

When self-hosting LangGraph:

The checkpointer is managed by the platform's API server
No need to manually configure store or checkpointer in your graph code
Configure the database connection using the POSTGRES_CHECKPOINTER_URI environment variable
If you do construct a PostgresSaver yourself, use a ConnectionPool instead of a single raw Connection so that long runs are less likely to hit connection timeouts:

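from psycopg.rows import dict_row
from psycopg_pool import ConnectionPool
from langgraph.checkpoint.postgres import PostgresSaver

# conn_string is your Postgres connection URI, defined elsewhere
pool = ConnectionPool(
    conn_string,
    max_size=10,
    # PostgresSaver needs autocommit connections that return dict rows
    kwargs={"autocommit": True, "row_factory": dict_row},
)
checkpointer = PostgresSaver(pool)
checkpointer.setup()  # Creates the checkpoint tables; run on first use

# your_graph is a StateGraph builder defined elsewhere
graph = your_graph.compile(checkpointer=checkpointer)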

Subgraph Checkpointing Configuration

When using subgraphs, only the parent graph should have a checkpointer to avoid duplicate storage and state persistence issues:

from langgraph.graph import StateGraph

# Correct: only the parent graph has a checkpointer
def create_plan_subgraph():
    builder = StateGraph(YourState)
    # ... add nodes ...
    return builder.compile()  # No checkpointer

def create_execute_subgraph():
    builder = StateGraph(YourState)
    # ... add nodes ...
    return builder.compile()  # No checkpointer

# Parent graph with a single checkpointer
builder = StateGraph(YourState)
builder.add_node("plan_subgraph", create_plan_subgraph())
builder.add_node("execute_subgraph", create_execute_subgraph())
# ... add edges ...
graph = builder.compile(checkpointer=checkpointer)  # Single source of truth

If you create the connection pool yourself, tune the ConnectionPool parameters to your workload:

pool = ConnectionPool(
    conn_string,
    min_size=2,           # Minimum connections to keep open
    max_size=10,          # Maximum connections in the pool
    max_idle=300.0,       # Seconds before an idle connection is closed
    max_lifetime=3600.0,  # Maximum lifetime of a connection (seconds)
    kwargs={
        "autocommit": True,
        "row_factory": dict_row,
        # 0 prepares each statement on first use;
        # set to None to disable prepared statements if needed
        "prepare_threshold": 0,
    },
)

Important Notes

Thread state and checkpoints are automatically deleted when TTL expires
Associated traces in LangSmith will disappear when checkpoints are deleted
For high-volume deployments, consider using blob storage to prevent database bloat
TTL policies should be mirrored in blob storage lifecycle rules if using blob storage
MongoDB checkpointers have a 16MB document size limit per checkpoint, while PostgreSQL supports up to 1GB per field. Consider PostgreSQL for applications with large state objects.
Checkpoint blobs have a practical limit of approximately 1GB before memory errors occur, regardless of the checkpointer backend
Large data (images, PDFs, etc.) stored in graph state can cause memory issues and database performance problems since the entire state is checkpointed
For large payloads, store only references in state and keep the actual data in external storage (e.g., S3) or use the LangGraph store for cross-thread accessibility
Large data in state will also appear in LangSmith traces, potentially impacting performance
When using subgraphs, compile only the parent graph with a checkpointer. Compiling subgraphs with their own checkpointers creates separate checkpoint namespaces, which leads to document size bloat and state persistence issues on resume.
Each checkpoint document stores the complete graph state, RunnableConfig, and execution metadata at that point in time
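
The advice on large payloads can be sketched as follows. This is a minimal, standard-library illustration rather than LangGraph API usage: `external_store` and `put_blob` are hypothetical stand-ins for external storage such as S3, and `approx_state_size` uses JSON as a rough proxy for checkpoint size, since real checkpointers use their own serializer.

```python
import json
import uuid

# 16MB per-checkpoint document limit noted above for MongoDB checkpointers.
MONGODB_DOC_LIMIT_BYTES = 16 * 1024 * 1024

# Hypothetical stand-in for external storage such as S3.
external_store: dict[str, bytes] = {}


def put_blob(data: bytes) -> str:
    """Store a payload externally and return a small reference key."""
    key = f"blobs/{uuid.uuid4()}"
    external_store[key] = data
    return key


def approx_state_size(state: dict) -> int:
    """Approximate the serialized size of a state dict in bytes (JSON proxy)."""
    return len(json.dumps(state).encode("utf-8"))


# Roughly 1MB of fake PDF data for the comparison.
pdf_bytes = b"%PDF-1.7" + b"\x00" * (1024 * 1024)

# Anti-pattern: the whole payload lives in state and is written at every checkpoint.
heavy_state = {"question": "Summarize this file", "document": pdf_bytes.hex()}

# Preferred: state carries only a reference; the payload stays in external storage.
light_state = {"question": "Summarize this file", "document_ref": put_blob(pdf_bytes)}

for name, state in (("heavy", heavy_state), ("light", light_state)):
    size = approx_state_size(state)
    print(f"{name}: {size} bytes, under 16MB limit: {size < MONGODB_DOC_LIMIT_BYTES}")
```

Because the complete state is stored at each checkpoint, the saving from keeping only a reference applies to every checkpoint written for the thread, not just once.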

Note: While you can configure a custom BaseStore for long-term memory storage, the checkpointer itself cannot be customized in the managed platform.

For more information, refer to: