
TrueFoundry

TrueFoundry offers an enterprise-ready AI Gateway that adds governance and observability to agentic frameworks like LangChain. The TrueFoundry AI Gateway serves as a unified interface for LLM access, providing:

  • Unified API Access: Connect to 250+ LLMs (OpenAI, Claude, Gemini, Groq, Mistral) through one API
  • Low Latency: Sub-3ms internal latency with intelligent routing and load balancing
  • Enterprise Security: SOC 2, HIPAA, GDPR compliance with RBAC and audit logging
  • Quota and cost management: Token-based quotas, rate limiting, and comprehensive usage tracking
  • Observability: Full request/response logging, metrics, and traces with customizable retention

Prerequisites

Before integrating LangChain with TrueFoundry, ensure you have:

  1. TrueFoundry Account: A TrueFoundry account with at least one model provider configured. Follow the TrueFoundry quick start guide to set this up
  2. Personal Access Token: Generate a token by following the TrueFoundry token generation guide

Quickstart

You can connect to TrueFoundry's unified LLM gateway through the ChatOpenAI interface.

  • Set the base_url to your TrueFoundry endpoint (explained below)
  • Set the api_key to your TrueFoundry PAT (Personal Access Token); a sketch of loading both values follows this list
  • Use the model name exactly as shown in your gateway's unified code snippet
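A common pattern is to keep both values in environment variables. A minimal sketch — the variable names here are illustrative, not required by TrueFoundry:

import os

# Illustrative environment variable names; use whatever convention your project follows.
TRUEFOUNDRY_API_KEY = os.environ["TRUEFOUNDRY_API_KEY"]
TRUEFOUNDRY_GATEWAY_BASE_URL = os.environ["TRUEFOUNDRY_GATEWAY_BASE_URL"]

The snippets below assume these two names are defined.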


Installation

pip install langchain-openai
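
The LangGraph example further below additionally requires the langgraph package:

pip install langgraph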

Basic Setup

Connect to TrueFoundry by pointing LangChain's ChatOpenAI at the gateway:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    api_key=TRUEFOUNDRY_API_KEY,
    base_url=TRUEFOUNDRY_GATEWAY_BASE_URL,
    model="openai-main/gpt-4o",  # You can call any model from any configured provider the same way
)

llm.invoke("What is the meaning of life, universe and everything?")
API Reference: ChatOpenAI

The request is routed through your TrueFoundry gateway to the specified model provider. TrueFoundry automatically handles rate limiting, load balancing, and observability.
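
Because the gateway exposes an OpenAI-compatible API, the rest of the ChatOpenAI surface works unchanged. For example, streaming tokens — a minimal sketch reusing the llm defined above:

# Stream the response token by token instead of waiting for the full completion
for chunk in llm.stream("Tell me a short joke"):
    print(chunk.content, end="", flush=True)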

LangGraph Integration

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, MessagesState
from langchain_core.messages import HumanMessage

# Define your LangGraph workflow
def call_model(state: MessagesState):
    model = ChatOpenAI(
        api_key=TRUEFOUNDRY_API_KEY,
        base_url=TRUEFOUNDRY_GATEWAY_BASE_URL,
        model="openai-main/gpt-4o",  # Copy the exact model name from the gateway
    )
    response = model.invoke(state["messages"])
    return {"messages": [response]}

# Build the workflow: a single node that calls the model
workflow = StateGraph(MessagesState)
workflow.add_node("agent", call_model)
workflow.set_entry_point("agent")
workflow.set_finish_point("agent")

app = workflow.compile()

# Run the agent through TrueFoundry
result = app.invoke({"messages": [HumanMessage(content="Hello!")]})
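
The compiled graph returns the accumulated message state; the model's reply is the last message:

# Print the assistant's reply from the final state
print(result["messages"][-1].content)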

Observability and Governance

[Screenshot: TrueFoundry metrics dashboard]

With the Metrics Dashboard, you can monitor and analyze:

  • Performance Metrics: Track key latency metrics like Request Latency, Time to First Token (TTFT), and Inter-Token Latency (ITL) with P99, P90, and P50 percentiles
  • Cost and Token Usage: Gain visibility into your application's costs with detailed breakdowns of input/output tokens and the associated expenses for each model
  • Usage Patterns: Understand how your application is being used with detailed analytics on user activity, model distribution, and team-based usage
  • Rate Limiting & Load Balancing: Configure limits, distribute traffic across models, and set up fallbacks on the gateway (see the client-side sketch after this list)
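
Fallbacks and load balancing are configured on the gateway itself, so no client-side changes are required. Purely as an illustration of what a client-side fallback would look like against the same interface, here is a hedged sketch; the model names are assumptions — substitute the exact names configured in your gateway:

from langchain_openai import ChatOpenAI

# Hypothetical model names; use the exact names shown in your gateway.
for model_name in ["openai-main/gpt-4o", "anthropic-main/claude-3-5-sonnet"]:
    try:
        llm = ChatOpenAI(
            api_key=TRUEFOUNDRY_API_KEY,
            base_url=TRUEFOUNDRY_GATEWAY_BASE_URL,
            model=model_name,
        )
        print(llm.invoke("Hello!").content)
        break  # Stop at the first model that answers
    except Exception:
        continue  # Try the next model in the list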

Support

For questions, issues, or support, reach out to the TrueFoundry team.