TrueFoundry
TrueFoundry provides an enterprise-ready AI Gateway that adds governance and observability to agentic frameworks like LangChain. The TrueFoundry AI Gateway serves as a unified interface for LLM access, offering:
- Unified API Access: Connect to 250+ LLMs (OpenAI, Claude, Gemini, Groq, Mistral) through one API
- Low Latency: Sub-3ms internal latency with intelligent routing and load balancing
- Enterprise Security: SOC 2, HIPAA, GDPR compliance with RBAC and audit logging
- Quota and cost management: Token-based quotas, rate limiting, and comprehensive usage tracking
- Observability: Full request/response logging, metrics, and traces with customizable retention
Prerequisites
Before integrating LangChain with TrueFoundry, ensure you have:
- TrueFoundry Account: A TrueFoundry account with at least one model provider configured (see the TrueFoundry quick start guide)
- Personal Access Token: Generate a token by following the TrueFoundry token generation guide
Quickstart
You can connect to TrueFoundry's unified LLM gateway through the `ChatOpenAI` interface:
- Set the `base_url` to your TrueFoundry gateway endpoint (explained below)
- Set the `api_key` to your TrueFoundry PAT (Personal Access Token)
- Use the same `model` name as shown in the gateway's unified code snippet
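
The snippets below assume your token and gateway URL are available as Python variables. A minimal sketch, assuming you keep them in environment variables (the variable names here are illustrative, not required by TrueFoundry):

```python
import os

# Illustrative variable names; load your token and gateway URL however you prefer.
TRUEFOUNDRY_API_KEY = os.environ["TRUEFOUNDRY_API_KEY"]
TRUEFOUNDRY_GATEWAY_BASE_URL = os.environ["TRUEFOUNDRY_GATEWAY_BASE_URL"]
```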
Installation
```bash
pip install langchain-openai
```
Basic Setup
Connect to TrueFoundry by updating the `ChatOpenAI` model in LangChain:
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    api_key=TRUEFOUNDRY_API_KEY,  # your TrueFoundry Personal Access Token
    base_url=TRUEFOUNDRY_GATEWAY_BASE_URL,  # your TrueFoundry gateway base URL
    model="openai-main/gpt-4o",  # similarly, you can call any model from any configured provider
)

llm.invoke("What is the meaning of life, universe and everything?")
```
API Reference: ChatOpenAI
The request is routed through your TrueFoundry gateway to the specified model provider. TrueFoundry automatically handles rate limiting, load balancing, and observability.
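
Because the gateway exposes one OpenAI-compatible API for every configured provider, switching providers is just a change to the model string. A minimal sketch; the Anthropic model ID below is illustrative, so copy the exact name from your gateway:

```python
# Same client and credentials, different provider: only the model string changes.
claude = ChatOpenAI(
    api_key=TRUEFOUNDRY_API_KEY,
    base_url=TRUEFOUNDRY_GATEWAY_BASE_URL,
    model="anthropic-main/claude-3-5-sonnet",  # illustrative ID; copy the exact name from your gateway
)
claude.invoke("Summarize the benefits of an AI gateway in one sentence.")
```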
LangGraph Integration
```python
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, MessagesState

# Define your LangGraph workflow
def call_model(state: MessagesState):
    model = ChatOpenAI(
        api_key=TRUEFOUNDRY_API_KEY,
        base_url=TRUEFOUNDRY_GATEWAY_BASE_URL,
        model="openai-main/gpt-4o",  # copy the exact model name from the gateway
    )
    response = model.invoke(state["messages"])
    return {"messages": [response]}

# Build the workflow
workflow = StateGraph(MessagesState)
workflow.add_node("agent", call_model)
workflow.set_entry_point("agent")
workflow.set_finish_point("agent")
app = workflow.compile()

# Run the agent through TrueFoundry
result = app.invoke({"messages": [HumanMessage(content="Hello!")]})
```
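
The compiled graph returns the accumulated message state, with the model's reply as the final message. For example:

```python
# The gateway-routed response is the last message in the returned state.
print(result["messages"][-1].content)
```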
Observability and Governance
With the Metrics Dashboard, you can monitor and analyze:
- Performance Metrics: Track key latency metrics like Request Latency, Time to First Token (TTFT), and Inter-Token Latency (ITL) at the P99, P90, and P50 percentiles
- Cost and Token Usage: Gain visibility into your application's costs with detailed breakdowns of input/output tokens and the associated expenses for each model
- Usage Patterns: Understand how your application is being used with detailed analytics on user activity, model distribution, and team-based usage
- Rate Limiting & Load Balancing: Configure limits, distribute traffic across models, and set up fallbacks
Support
For questions, issues, or support:
- Email: support@truefoundry.com
- Documentation: https://docs.truefoundry.com/