How do I configure checkpointing in LangGraph?

Last updated: October 9, 2025

Context

When using LangGraph, customers need to understand how to properly configure checkpointing to manage thread state persistence and control data retention. This includes understanding the available checkpointing options across different deployment models and how to configure Time-to-Live (TTL) settings.

Answer

LangGraph provides different checkpointing options depending on your deployment model:

Cloud SaaS / Managed Platform

For the managed LangGraph platform:

  • Checkpointing is automatically handled by the platform's managed checkpointer

  • Custom checkpointers are not supported

  • Configure TTL settings in langgraph.json to manage data retention
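
The TTL settings themselves are not shown in this article. The sketch below renders what such a fragment of langgraph.json might look like. The key names (checkpointer.ttl with strategy, sweep_interval_minutes, and default_ttl in minutes) and the values are assumptions based on the LangGraph Platform TTL documentation; verify them against the documentation for your version before use.

```python
import json

# Assumed key names for the TTL section of langgraph.json.
# Values are illustrative only; merge the result into your existing file.
ttl_config = {
    "checkpointer": {
        "ttl": {
            "strategy": "delete",          # delete checkpoints once they expire
            "sweep_interval_minutes": 60,  # how often expired data is swept
            "default_ttl": 43200,          # lifetime in minutes (30 days)
        }
    }
}


def render_ttl_fragment(config: dict) -> str:
    """Return the TTL settings as pretty-printed JSON text."""
    return json.dumps(config, indent=2)


print(render_ttl_fragment(ttl_config))
```

The printed JSON is the block to merge into langgraph.json alongside your existing keys.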

Self-Hosted Container

When self-hosting LangGraph:

  • The checkpointer is managed by the platform's API server

  • No need to manually configure store or checkpointer in your graph code

  • Configure the database connection using the POSTGRES_CHECKPOINTER_URI environment variable

  • If you create a PostgresSaver yourself (for example, when running the graph outside the API server), use a ConnectionPool instead of a single raw Connection to prevent connection timeouts during long runs:

from psycopg.rows import dict_row
from psycopg_pool import ConnectionPool
from langgraph.checkpoint.postgres import PostgresSaver

# conn_string is your PostgreSQL connection URI
pool = ConnectionPool(
    conn_string,
    max_size=10,
    kwargs={"autocommit": True, "row_factory": dict_row},
)
checkpointer = PostgresSaver(pool)
checkpointer.setup()  # Creates the checkpoint tables on first run

graph = your_graph.compile(checkpointer=checkpointer)

Subgraph Checkpointing Configuration

When using subgraphs, only the parent graph should have a checkpointer to avoid duplicate storage and state persistence issues:

from langgraph.graph import StateGraph

# Correct: only the parent graph has a checkpointer
def create_plan_subgraph():
    builder = StateGraph(YourState)
    # ... add nodes ...
    return builder.compile()  # No checkpointer

def create_execute_subgraph():
    builder = StateGraph(YourState)
    # ... add nodes ...
    return builder.compile()  # No checkpointer

# Parent graph with single checkpointer
builder = StateGraph(YourState)
builder.add_node("plan_subgraph", create_plan_subgraph())
builder.add_node("execute_subgraph", create_execute_subgraph())
graph = builder.compile(checkpointer=checkpointer)  # Single source of truth

If you create your own ConnectionPool for PostgresSaver, tune its parameters to your workload:

pool = ConnectionPool(
    conn_string,
    min_size=2,           # Minimum connections to keep open
    max_size=10,          # Maximum connections in the pool
    max_idle=300.0,       # Seconds before an idle connection is closed
    max_lifetime=3600.0,  # Maximum lifetime of a connection, in seconds
    kwargs={
        "autocommit": True,
        "row_factory": dict_row,
        # 0 prepares each query on first use; set None to disable prepared statements
        "prepare_threshold": 0,
    },
)

Important Notes

  • Thread state and checkpoints are automatically deleted when TTL expires

  • Associated traces in LangSmith will disappear when checkpoints are deleted

  • For high-volume deployments, consider using blob storage to prevent database bloat

  • TTL policies should be mirrored in blob storage lifecycle rules if using blob storage

  • MongoDB checkpointers have a 16MB document size limit per checkpoint, while PostgreSQL supports up to 1GB per field. Consider PostgreSQL for applications with large state objects

  • Checkpoint blobs have a practical limit of approximately 1GB before memory errors occur, regardless of the checkpointer backend

  • Large data (images, PDFs, etc.) stored in graph state can cause memory issues and database performance problems since the entire state is checkpointed

  • For large payloads, store only references in state and keep the actual data in external storage (e.g., S3) or use the LangGraph store for cross-thread accessibility

  • Large data in state will also appear in LangSmith traces, potentially impacting performance

  • When using subgraphs, only compile the parent graph with a checkpointer. Compiling subgraphs with their own checkpointers creates separate checkpoint namespaces, leading to document size bloat and state persistence issues on resume

  • Each checkpoint document stores the complete graph state, RunnableConfig, and execution metadata at that point in time
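
The advice above about large payloads can be sketched as follows: the node writes the bytes to external storage and keeps only a short key in state, so each checkpoint stays small. This is a minimal illustration, not a LangGraph API. EXTERNAL_STORE is a hypothetical in-memory stand-in for S3 or similar, and put_blob, get_blob, ingest_node, summarize_node, and DocState are made-up names.

```python
import hashlib
from typing import TypedDict

# Hypothetical stand-in for external storage such as S3 (assumption).
EXTERNAL_STORE: dict[str, bytes] = {}


class DocState(TypedDict):
    doc_ref: str   # small reference; this is what gets checkpointed
    summary: str


def put_blob(data: bytes) -> str:
    """Store the payload externally and return a short content-addressed key."""
    key = "blobs/" + hashlib.sha256(data).hexdigest()
    EXTERNAL_STORE[key] = data
    return key


def get_blob(key: str) -> bytes:
    """Fetch the payload back when a node actually needs it."""
    return EXTERNAL_STORE[key]


def ingest_node(pdf_bytes: bytes) -> DocState:
    # Only the reference enters graph state, never the raw bytes.
    return {"doc_ref": put_blob(pdf_bytes), "summary": ""}


def summarize_node(state: DocState) -> dict:
    # Load the payload on demand and return a small state update.
    data = get_blob(state["doc_ref"])
    return {"summary": f"{len(data)} bytes"}
```

With this shape, a 1 MB document adds roughly 70 characters to each checkpoint instead of the full payload.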

Note: While you can configure a custom BaseStore for long-term memory storage, the checkpointer itself cannot be customized in the managed platform.
