Configuring Timeout for init_chat_model in LangChain

Last updated: January 5, 2026

Problem

When using init_chat_model with a timeout parameter, the actual wait before a timeout error is raised can be much longer than the configured value. For example, with timeout=30 you might wait 90 seconds or more before the error appears.

Causes

Three factors can contribute to this:

Issue	Cause	Effect
Retry multiplication	`max_retries=2` (default)	3× timeout
Incomplete timeout config	Float only sets read timeout (OpenAI)	Connection hangs
DNS multi-IP	N A records tried sequentially	N× timeout

Cause 1: Retries Multiply the Timeout

Both OpenAI and Anthropic SDKs default to max_retries=2, meaning 3 total attempts. Each attempt waits for the timeout duration before retrying.

Example: timeout=30 with default retries = 3 attempts × 30s = 90s worst case, plus any backoff delay the SDK adds between attempts
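The arithmetic above can be sketched as a small function. This is only an illustration: worst_case_wait and its backoff parameter are hypothetical names, and the fixed backoff is a simplification (real SDKs typically use a growing, jittered delay between attempts).

```python
def worst_case_wait(timeout: float, max_retries: int = 2, backoff: float = 0.0) -> float:
    """Upper bound on the wall-clock wait when every attempt times out.

    backoff is a hypothetical fixed pause between attempts, used here
    only to show that retries add delay on top of the timeouts.
    """
    attempts = max_retries + 1  # the first try plus each retry
    return attempts * timeout + max_retries * backoff


print(worst_case_wait(30.0))                 # default retries: 3 attempts
print(worst_case_wait(30.0, max_retries=0))  # retries disabled: 1 attempt
```

With retries disabled, the bound collapses to the configured timeout, which is why the solution below sets max_retries=0.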

Solution: Set max_retries=0:

from langchain.chat_models import init_chat_model

llm = init_chat_model(
    "gpt-4o",
    model_provider="openai",
    timeout=30.0,
    max_retries=0,  # Disable retries
)

Cause 2: Float Timeout May Not Set Connection Timeout (OpenAI)

For OpenAI-based models, a simple float like timeout=30 sets the read timeout, not the connection timeout. This means requests to unreachable endpoints may not time out as expected.

Solution: Use httpx.Timeout for fine-grained control:

import httpx
from langchain.chat_models import init_chat_model

llm = init_chat_model(
    "gpt-4o",
    model_provider="openai",
    timeout=httpx.Timeout(
        connect=30.0,  # Connection timeout (like curl --connect-timeout)
        read=60.0,     # Time to wait for response data
        write=30.0,    # Time to wait for sending data
        pool=30.0      # Time to wait for connection from pool
    ),
    max_retries=0,
)

For Anthropic Provider

A float timeout also covers the connection timeout here, but you should still set max_retries=0:

from langchain.chat_models import init_chat_model

llm = init_chat_model(
    "claude-sonnet-4-20250514",
    model_provider="anthropic",
    timeout=30.0,
    max_retries=0,
)

Note: Anthropic does not support httpx.Timeout - only float values are accepted.

Cause 3: DNS Multi-IP Timeout Multiplication

Even with max_retries=0 and proper httpx.Timeout configuration, timeouts may still be longer than expected if your endpoint's hostname resolves to multiple IP addresses (A records).

Problem

When a hostname resolves to multiple IP addresses, httpx tries each IP sequentially, applying the full connect timeout to each attempt:

3 IPs × 10s timeout = 30s actual timeout

This happens because httpx doesn't implement Happy Eyeballs (RFC 8305), which would race connections in parallel like browsers do.

How to Check Your DNS

# Check how many A records your endpoint has
dig +short your-endpoint.com A

Or in Python:

import socket
_, _, ips = socket.gethostbyname_ex("your-endpoint.com")
print(f"Found {len(ips)} A records: {ips}")
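Note that socket.gethostbyname_ex only reports IPv4 addresses. If the endpoint also publishes IPv6 (AAAA) records, the client may have more candidate addresses to try than the A record count suggests. The sketch below uses socket.getaddrinfo to group unique addresses by family. The function name count_addresses and its resolver parameter are hypothetical, added so the logic can be exercised without a network.

```python
import socket
from typing import Callable, Dict, List


def count_addresses(hostname: str, port: int = 443,
                    resolver: Callable = socket.getaddrinfo) -> Dict[str, List[str]]:
    """Group the unique addresses a hostname resolves to by address family."""
    results = resolver(hostname, port, type=socket.SOCK_STREAM)
    found = {"IPv4": set(), "IPv6": set()}
    for family, _type, _proto, _canonname, sockaddr in results:
        if family == socket.AF_INET:
            found["IPv4"].add(sockaddr[0])   # sockaddr is (host, port)
        elif family == socket.AF_INET6:
            found["IPv6"].add(sockaddr[0])   # sockaddr is (host, port, flow, scope)
    return {name: sorted(addrs) for name, addrs in found.items()}
```

Calling count_addresses("your-endpoint.com") returns a dictionary with one list per family. Summing the list lengths gives the number of candidate addresses, assuming the HTTP client considers both families.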

Solution: Dynamic Timeout Based on DNS

Calculate the per-IP timeout by dividing your desired total timeout by the number of DNS A records:

import socket
import httpx
from langchain.chat_models import init_chat_model

hostname = "your-endpoint.com"
desired_timeout = 10.0

# Get number of A records
_, _, ips = socket.gethostbyname_ex(hostname)
per_ip_timeout = desired_timeout / len(ips)

print(f"Found {len(ips)} IPs, using {per_ip_timeout:.2f}s per IP")

llm = init_chat_model(
    "gpt-4o",
    model_provider="openai",
    base_url=f"https://{hostname}",
    timeout=httpx.Timeout(
        connect=per_ip_timeout,
        read=60.0,
        write=10.0,
        pool=10.0
    ),
    max_retries=0,
)

Helper Function

import socket
import httpx
from langchain.chat_models import init_chat_model

def get_timeout_for_hostname(hostname: str, desired_timeout: float) -> httpx.Timeout:
    """
    Calculate timeout accounting for multiple DNS A records.

    Args:
        hostname: The endpoint hostname (without https://)
        desired_timeout: Total desired timeout in seconds

    Returns:
        httpx.Timeout configured for predictable behavior
    """
    try:
        _, _, ips = socket.gethostbyname_ex(hostname)
        per_ip_timeout = desired_timeout / max(len(ips), 1)
        print(f"DNS has {len(ips)} A records, using {per_ip_timeout:.1f}s connect timeout")
    except socket.gaierror:
        per_ip_timeout = desired_timeout

    return httpx.Timeout(
        connect=per_ip_timeout,
        read=60.0,
        write=10.0,
        pool=10.0
    )

# Usage
hostname = "your-endpoint.com"
timeout = get_timeout_for_hostname(hostname, desired_timeout=10.0)

llm = init_chat_model(
    "gpt-4o",
    model_provider="openai",
    base_url=f"https://{hostname}",
    timeout=timeout,
    max_retries=0,
)

Complete Example

Combining all solutions for predictable timeout behavior:

import socket
import httpx
from langchain.chat_models import init_chat_model

# Configuration
hostname = "your-endpoint.com"
desired_connect_timeout = 10.0
read_timeout = 60.0

# Account for multiple DNS A records
try:
    _, _, ips = socket.gethostbyname_ex(hostname)
    per_ip_timeout = desired_connect_timeout / len(ips)
    print(f"Endpoint has {len(ips)} A records, using {per_ip_timeout:.1f}s per IP")
except socket.gaierror as e:
    print(f"DNS resolution failed: {e}")
    per_ip_timeout = desired_connect_timeout

# Create LLM with predictable timeout
llm = init_chat_model(
    "gpt-4o",
    model_provider="openai",
    base_url=f"https://{hostname}",
    timeout=httpx.Timeout(
        connect=per_ip_timeout,
        read=read_timeout,
        write=10.0,
        pool=10.0
    ),
    max_retries=0,
)
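To confirm that a configuration behaves as intended, it can help to measure how long a failing call actually takes. The following is a minimal sketch: timed_call and slow_failure are hypothetical helpers, and slow_failure merely simulates a timeout so the example runs without a network or an API key.

```python
import time
from typing import Any, Callable, Optional, Tuple


def timed_call(fn: Callable[[], Any]) -> Tuple[float, Optional[Exception]]:
    """Run fn() and return (elapsed_seconds, exception or None)."""
    start = time.monotonic()
    try:
        fn()
        return time.monotonic() - start, None
    except Exception as exc:  # report the failure instead of raising it
        return time.monotonic() - start, exc


def slow_failure() -> None:
    """Hypothetical stand-in for a model call against a dead endpoint."""
    time.sleep(0.05)
    raise TimeoutError("simulated timeout")


elapsed, error = timed_call(slow_failure)
print(f"failed after {elapsed:.2f}s with {type(error).__name__}")
```

In practice you would pass something like lambda: llm.invoke("ping") instead of slow_failure and compare the elapsed time against the timeout you configured.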

Summary

Provider	Timeout Type	`max_retries`	DNS Handling	Behavior
OpenAI	`float`	default (2)	None	Unpredictable
OpenAI	`httpx.Timeout`	`0`	None	Predictable (single IP)
OpenAI	`httpx.Timeout`	`0`	Dynamic	Predictable (multi IP)
Anthropic	`float`	`0`	N/A	Predictable

Key Takeaways

Always set max_retries=0 when you need predictable timeout behavior
Use httpx.Timeout for OpenAI to control connection timeout
Check DNS A records if the timeout is still longer than expected, and divide the connect timeout by the IP count

Technical Background

Why httpx doesn't race connections

httpx uses httpcore as its transport layer, which calls anyio for async networking.

While anyio has supported Happy Eyeballs via the happy_eyeballs_delay parameter since v1.2.0, httpcore doesn't pass this parameter through. This means connections are tried sequentially rather than in parallel.

Happy Eyeballs (RFC 8305)

Happy Eyeballs is an algorithm that races connection attempts to multiple IP addresses in parallel, using the first successful connection. This is what browsers and curl do, resulting in faster and more predictable connection times.

Python's asyncio supports this natively since 3.8:

import asyncio

async def connect():
    loop = asyncio.get_running_loop()
    return await loop.create_connection(
        asyncio.Protocol,  # protocol factory
        host='example.com',
        port=443,
        happy_eyeballs_delay=0.25,  # 250ms between attempts
    )

But httpx/httpcore don't use this feature yet.
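The difference between the two strategies can be illustrated with a simplified worst-case model. It assumes every address is unreachable, each attempt gives up after the same connect timeout, and the stagger delay is shorter than that timeout. Both function names are hypothetical, and the model ignores details of any real implementation.

```python
def sequential_worst_case(n_ips: int, connect_timeout: float) -> float:
    """Each address is tried only after the previous attempt has timed out."""
    return n_ips * connect_timeout


def staggered_worst_case(n_ips: int, connect_timeout: float, delay: float = 0.25) -> float:
    """Attempt k starts at k * delay, so attempts overlap.

    The last attempt starts at (n_ips - 1) * delay and gives up
    connect_timeout seconds later.
    """
    return (n_ips - 1) * delay + connect_timeout


print(sequential_worst_case(3, 10.0))       # 3 IPs tried one after another
print(staggered_worst_case(3, 10.0, 0.25))  # 3 IPs raced with a 250ms stagger
```

Under these assumptions the sequential strategy scales with the number of addresses, while the staggered one stays close to a single connect timeout.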