Using a Local LLM and the LangChain Anonymizer for PII Anonymization

Last updated: January 7, 2026

Overview

LangSmith's anonymizer feature supports custom functions, enabling you to integrate a private/local LLM for intelligent PII detection and redaction. This approach provides more sophisticated anonymization than regex patterns alone, while keeping sensitive data within your network.

Prerequisites

  • LangSmith Python SDK >= 0.1.81

  • A local LLM server (e.g., LM Studio, Ollama, vLLM)

Implementation

from langsmith import Client
from langsmith.anonymizer import create_anonymizer
from openai import OpenAI

# Local LLM - PII never leaves your network
local_llm = OpenAI(base_url="http://127.0.0.1:1234/v1", api_key="not-needed")

def llm_anonymizer(value: str, path: list) -> str:
    """Use a local LLM to detect and redact PII."""
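    # Pass through non-string values and strings under 10 characters
    # without calling the LLM (short values are returned unredacted).
    if not isinstance(value, str) or len(value) < 10:
        return value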

    response = local_llm.chat.completions.create(
        model="gpt-oss-safeguard-20b",
        messages=[{
            "role": "system",
            "content": "Replace PII: names→[NAME], emails→[EMAIL], phones→[PHONE], addresses→[ADDRESS], companies→[COMPANY]. Return only the text."
        }, {"role": "user", "content": value}],
        temperature=0,  # Minimizes sampling randomness
    )
    return response.choices[0].message.content

client = Client(anonymizer=create_anonymizer(llm_anonymizer))
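
The function above raises if the local server is unreachable, and the model's response content can be empty. One possible hardening, sketched here under assumptions, is to fail closed: on an error or an empty response, emit a placeholder instead of the original text. `make_fail_closed` and the `[REDACTED]` placeholder are hypothetical names for illustration, not part of the LangSmith SDK; the `redact` argument stands in for a function such as `llm_anonymizer`.

```python
from typing import Callable, List, Union

Path = List[Union[str, int]]
Redactor = Callable[[str, Path], str]


def make_fail_closed(redact: Redactor, placeholder: str = "[REDACTED]") -> Redactor:
    """Wrap a redactor so that failures never pass the raw value through."""

    def wrapped(value: str, path: Path) -> str:
        # Nothing to protect in non-string or blank values; return them untouched.
        if not isinstance(value, str) or not value.strip():
            return value
        try:
            result = redact(value, path)
        except Exception:
            # Server down, timeout, etc.: do not leak the original text.
            return placeholder
        # Treat a missing or blank response as a failure as well.
        if not isinstance(result, str) or not result.strip():
            return placeholder
        return result

    return wrapped
```

Assuming the setup above, the wrapped function could be passed in the same way, for example `create_anonymizer(make_fail_closed(llm_anonymizer))`. The tradeoff is that a failed call loses the whole field rather than only the PII in it.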

Tradeoffs

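  • Latency: LLM calls add roughly 100-500 ms each; because the anonymizer function is invoked for each string value, a trace with many string fields incurs multiple calls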

  • Non-determinism: LLM output can vary between runs; use temperature=0 to reduce (not eliminate) the variation
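
One way to soften the latency cost is to avoid repeated LLM calls for identical strings. The following is a minimal sketch using the standard library's `functools.lru_cache`. `cached_redactor` and `fake_redact` are hypothetical names, and `fake_redact` stands in for the real LLM call so that the example runs without a server.

```python
from functools import lru_cache
from typing import Callable, List, Union


def cached_redactor(redact_text: Callable[[str], str], maxsize: int = 4096):
    """Return a (value, path) anonymizer that caches results per input string."""
    cached = lru_cache(maxsize=maxsize)(redact_text)

    def anonymize(value: str, path: List[Union[str, int]]) -> str:
        # The path is ignored for caching: identical text gets identical output.
        return cached(value)

    return anonymize


calls = []  # Records each time the (stand-in) LLM is actually invoked.


def fake_redact(text: str) -> str:
    # Stand-in for the local LLM call; replaces one hard-coded name.
    calls.append(text)
    return text.replace("Jane Doe", "[NAME]")


anonymize = cached_redactor(fake_redact)
first = anonymize("Contact Jane Doe today", ["inputs", "text"])
second = anonymize("Contact Jane Doe today", ["outputs", "text"])
```

Here the second call is served from the cache, so the stand-in LLM runs only once. Caching also makes repeated inputs redact consistently within a process, but note that the cache keeps the original strings in memory, which may or may not be acceptable for your data.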