What are the differences between internal and external Redis in LangSmith deployments?

Last updated: April 20, 2026

Context

When deploying LangSmith, you can choose between using an internal Redis (bundled/in-cluster) or an external Redis (managed or separately operated). Understanding the differences between these options is important for making the right architectural decision for your deployment, especially when considering autoscaling, cost, and performance requirements.

Answer

Functional Differences

From LangSmith's perspective, there is no functional difference between internal and external Redis. LangSmith connects to Redis through a single connection URI (REDIS_DATABASE_URI), so the application behavior is identical regardless of where Redis is running.
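
As a sketch, the only deployment-facing difference is the value of that one URI; the hostnames below are placeholders, not real endpoints:

```yaml
# REDIS_DATABASE_URI is the single connection point LangSmith uses.
# In-cluster (bundled StatefulSet service; service name is illustrative):
REDIS_DATABASE_URI: "redis://langsmith-redis:6379"
# External managed instance (placeholder hostname):
# REDIS_DATABASE_URI: "redis://my-cache.example.cache.amazonaws.com:6379"
```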

Redis is used in LangSmith for:

  • Queuing: Background job processing using SAQ (Simple Async Queue)

  • Caching: Temporary trace run data caching

  • Rate limiting: API rate-limit counters

  • Task queues: Background job execution (exports, rules, upgrades, etc.)

All data stored in Redis has a configured TTL and is ephemeral; no durable user data or trace payloads are permanently stored there.

Architectural Differences

Internal Redis

  • Deployed as a StatefulSet with a single replica inside the cluster

  • Uses the bundled Redis image with an 8Gi PersistentVolumeClaim (PVC) for persistence

  • Default resources: 2 CPU / 4Gi memory requests, 4 CPU / 8Gi memory limits

  • Standalone mode only (no clustering)
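
The defaults above could be expressed in Helm values roughly as follows; the exact key paths are an assumption and may differ between chart versions:

```yaml
redis:
  statefulSet:
    resources:
      requests:
        cpu: "2"
        memory: 4Gi
      limits:
        cpu: "4"
        memory: 8Gi
  persistence:
    size: 8Gi   # the bundled PVC
```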

External Redis

  • Configured by setting redis.external.enabled: true and providing a connection URL

  • Supports Redis Cluster mode (redis.external.cluster.enabled: true)

  • Supports IAM/Workload Identity authentication for AWS, GCP, and Azure

  • Supports mTLS client certificates

  • Supports Azure EnterpriseCluster safe mode

  • Minimum supported version: Redis >= 5

  • Valkey is officially supported as a drop-in replacement (LangSmith v0.14+)
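
A minimal external-Redis values sketch using the keys named above; the connectionUrl key name and hostname are assumptions to check against your chart's values schema:

```yaml
redis:
  external:
    enabled: true
    connectionUrl: "redis://my-managed-redis.example.com:6379"
    cluster:
      enabled: true   # only when pointing at a Redis Cluster endpoint
```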

Autoscaling Considerations

The internal Redis StatefulSet is not affected by HPA or KEDA - it always runs as a fixed single replica. KEDA in the LangSmith chart only targets queue and ingest-queue services based on Redis queue backlog size.
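
To illustrate the distinction, a KEDA ScaledObject for the queue service might look like the sketch below: it targets the queue Deployment (never the Redis StatefulSet), and the resource names, backlog key, and threshold are hypothetical:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: langsmith-queue            # hypothetical name
spec:
  scaleTargetRef:
    name: langsmith-queue          # the queue Deployment, not Redis itself
  triggers:
    - type: redis
      metadata:
        address: langsmith-redis:6379   # illustrative service address
        listName: saq:queue             # hypothetical backlog key
        listLength: "100"               # scale out when backlog exceeds 100
```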

When LangSmith app pods scale up/down, consider:

  • Connection count: More app pod replicas means more concurrent connections to Redis

  • Queue depth: Higher write throughput deepens the Redis-backed queues, so the instance needs enough memory to absorb the larger backlog
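
On the connection-count side, the Agent Server exposes a REDIS_MAX_CONNECTIONS environment variable to cap the per-pod connection pool; the value below is illustrative, not a recommendation:

```yaml
# Per-pod Redis connection pool cap for the Agent Server
REDIS_MAX_CONNECTIONS: "50"
```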

Cost and Performance Tradeoffs

Internal Redis (in-cluster)

  • Pros: Zero additional infrastructure cost, simplest setup

  • Cons: Single node (no HA), no automatic failover, competes for node resources

  • Best for: Dev/staging and low-to-medium production workloads

External Managed Redis

  • Pros: HA/replication, automatic failover, managed backups, independent scaling, better monitoring

  • Cons: Additional cost for the managed service

  • Best for: Production deployments

  • Recommended providers: AWS ElastiCache (OSS Redis or Valkey), Azure Cache for Redis, Google Cloud Memorystore

Requirements and Recommendations

  • LangSmith requires the maxmemory-policy to be set to noeviction

  • No additional Redis modules are needed (RediSearch, RedisJSON not required)

  • Each LangSmith installation must use its own dedicated Redis instance

  • Baseline recommendation for external Redis: at least 2 vCPUs and 8GB of memory

  • Production recommendation: Use an external managed Redis service
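
For a self-managed Redis, the eviction-policy requirement maps to a single redis.conf line (managed services expose the same setting through their parameter groups):

```
# redis.conf: required by LangSmith so queued jobs are never
# evicted under memory pressure
maxmemory-policy noeviction
```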

Dataplane (LangGraph Platform) Redis

The dataplane Helm chart follows the same Redis architecture with these differences:

  • External Redis options are simpler (no Redis Cluster, IAM, or mTLS configuration)

  • Redis serves a narrower purpose: worker communication and ephemeral metadata only

  • Multiple deployments can share the same Redis instance but must use different database numbers
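
Sharing one host across dataplane deployments could then look like the following REDIS_URI_CUSTOM values, where the hostname is a placeholder and each deployment selects a distinct database number:

```yaml
# Deployment A
REDIS_URI_CUSTOM: "redis://shared-redis.example.com:6379/1"
# Deployment B
REDIS_URI_CUSTOM: "redis://shared-redis.example.com:6379/2"
```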

Sources:

https://docs.langchain.com/langsmith/self-host-external-redis (external Redis setup and sizing baseline)

https://docs.langchain.com/langsmith/self-host-scale#keda-autoscaling-for-langsmith-queues

https://support.langchain.com/articles/8478795211-productionizing-and-scaling-self-hosted-langsmith-best-practices (HPA examples for backend, platform, queue)

https://docs.langchain.com/langsmith/env-var (REDIS_MAX_CONNECTIONS for Agent Server)

https://docs.langchain.com/langsmith/self-host-scale (sizing table and config examples)

https://docs.langchain.com/langsmith/kubernetes (cluster requirements and external DB recommendation)

https://support.langchain.com/articles/5142502915-best-practices-for-scaling-self-hosted-langsmith

https://docs.langchain.com/langsmith/data-plane#redis (dataplane Redis architecture)

https://docs.langchain.com/langsmith/data-plane#custom-redis (REDIS_URI_CUSTOM setup)

https://support.langchain.com/articles/3842219806-streaming-failures-when-using-shared-custom-redis-instances

https://docs.langchain.com/langsmith/agent-server-scale (Agent Server scaling config)