What is Helicone?
Helicone is an open-source platform that combines LLM observability with an AI Gateway to help developers build reliable, cost-effective AI applications. We solve the core challenges of production LLM development: provider outages, unpredictable costs, and complex debugging.

AI Gateway
Access 100+ LLM models through one unified OpenAI-compatible API with automatic fallbacks and 0% markup
Full Observability
Monitor every request with detailed traces, costs, latency, and errors across your entire LLM stack
Prompt Management
Version, test, and deploy prompts without code changes using production data
Enterprise Ready
SOC 2 compliant, GDPR ready, with self-hosting options for complete data control
Key Features
Helicone provides everything you need to build production-grade LLM applications:

Unified AI Gateway
Access 100+ models from OpenAI, Anthropic, Google Gemini, AWS Bedrock, Groq, and more through a single OpenAI-compatible API. Switch providers by changing the model name—no code refactoring needed.
- OpenAI (GPT-4o, GPT-4o-mini, o1, o3)
- Anthropic (Claude Sonnet 4.5, Claude Opus, Claude Haiku)
- Google (Gemini 2.0 Flash, Gemini Pro)
- AWS Bedrock, Azure OpenAI, Groq, Together AI, Anyscale, and 20+ more
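Because every model sits behind the same OpenAI-compatible API, switching providers really is a one-field change. A minimal sketch of the idea (the model identifiers here are illustrative; check the model list in the docs for exact names):

```python
def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat payload; only `model` changes per provider."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Same request shape, three different providers -- no refactoring:
openai_req = build_chat_request("gpt-4o-mini", "Hello!")
anthropic_req = build_chat_request("claude-3-5-haiku", "Hello!")
gemini_req = build_chat_request("gemini-2.0-flash", "Hello!")
```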
Intelligent Routing & Reliability
Automatic Fallbacks: Configure fallback chains to route to backup providers when failures occur. Keep your app online during outages.
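The gateway performs this routing server-side; the sketch below only illustrates the fallback-chain logic with stand-in provider callables (it is not Helicone's actual implementation):

```python
def call_with_fallbacks(providers, prompt):
    """Try each (name, callable) provider in order; return the first success.

    Each callable raises on failure (timeout, 5xx, rate limit), which
    triggers a fall-through to the next provider in the chain.
    """
    errors = {}
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # in practice: catch specific transport errors
            errors[name] = exc
    raise RuntimeError(f"All providers failed: {errors}")

# Simulated outage: the primary raises, the backup answers.
def primary(prompt):
    raise TimeoutError("provider outage")

def backup(prompt):
    return f"echo: {prompt}"

used, answer = call_with_fallbacks([("openai", primary), ("anthropic", backup)], "hi")
# used == "anthropic", answer == "echo: hi"
```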
Rate Limit Management: Set custom rate limits per user, endpoint, or any dimension to prevent abuse and control costs.
Response Caching: Cache identical requests to reduce latency and costs by up to 99% with semantic or exact matching.
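Exact matching works roughly like content-addressing the request body: identical payloads produce identical cache keys. Helicone handles this server-side (typically opted into via a request header such as `Helicone-Cache-Enabled`; confirm the exact header in the docs), but the core idea can be sketched as:

```python
import hashlib
import json

def exact_cache_key(payload: dict) -> str:
    """Exact matching: identical request bodies hash to the same key."""
    canonical = json.dumps(payload, sort_keys=True)  # key order must not matter
    return hashlib.sha256(canonical.encode()).hexdigest()

a = exact_cache_key({"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "hi"}]})
b = exact_cache_key({"messages": [{"role": "user", "content": "hi"}], "model": "gpt-4o-mini"})
# a == b: the second request is a cache hit despite different key order
```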
LLM Security: Built-in prompt injection detection, PII redaction, and content moderation.
Complete Observability
What you get out of the box:

- Request/response body inspection
- Cost tracking across all providers with our 300+ model pricing database
- Latency metrics (total time, time to first token, tokens per second)
- Error tracking and debugging
- Custom properties for filtering by user, feature, environment, etc.
- Session tracing for multi-step AI workflows
- Token usage analytics
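Custom properties are attached per request, commonly as `Helicone-Property-*` request headers (verify the header convention against the docs). A sketch of tagging a request for later filtering:

```python
def helicone_property_headers(props: dict) -> dict:
    """Tag a request with custom properties via Helicone-Property-* headers."""
    return {f"Helicone-Property-{key}": str(value) for key, value in props.items()}

headers = helicone_property_headers({
    "User": "user-123",          # filter the dashboard by user
    "Feature": "summarize",      # ...or by product feature
    "Environment": "prod",       # ...or by deploy environment
})
```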
Advanced Debugging Tools
Session Trees: Visualize complex AI agent workflows across multiple LLM calls. Trace the exact path of execution to find where things break.

Request Search: Find any request in milliseconds with powerful filters across 10+ dimensions including cost, latency, model, user, and custom properties.

Playground: Test prompts, replay requests, and compare model outputs side-by-side directly in the UI.
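Session trees are built from per-request session metadata: every call in a workflow shares a session ID, and a path string places it in the tree. The header names below follow Helicone's session convention but should be verified in the docs; a sketch:

```python
import uuid

def session_headers(session_id: str, path: str, name: str) -> dict:
    """Group multi-step agent calls into one session tree.

    The path (e.g. "/planner/search") nests this call under its parent step.
    """
    return {
        "Helicone-Session-Id": session_id,
        "Helicone-Session-Path": path,
        "Helicone-Session-Name": name,
    }

sid = str(uuid.uuid4())
step1 = session_headers(sid, "/planner", "trip-agent")          # parent step
step2 = session_headers(sid, "/planner/search", "trip-agent")   # child step
```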
Prompt Management
Deploy prompt changes instantly without code deployments:

- Version control for prompts with full history
- A/B testing and experimentation
- Environment-based deployments (dev, staging, prod)
- Template variables and dynamic prompt compilation
- Rollback to any previous version instantly
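Template variables and dynamic compilation boil down to filling placeholders with runtime values at request time. The `{{variable}}` delimiter syntax below is illustrative, not necessarily Helicone's exact template grammar:

```python
import re

def compile_prompt(template: str, variables: dict) -> str:
    """Fill {{variable}} placeholders; raise if a variable is missing."""
    def substitute(match):
        key = match.group(1)
        if key not in variables:
            raise KeyError(f"missing template variable: {key}")
        return str(variables[key])
    return re.sub(r"\{\{(\w+)\}\}", substitute, template)

prompt = compile_prompt(
    "Summarize the following {{doc_type}} in {{language}}:",
    {"doc_type": "contract", "language": "English"},
)
# prompt == "Summarize the following contract in English:"
```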
How It Works
Helicone operates as a transparent proxy between your application and LLM providers. Two integration options:

- AI Gateway with Credits (Recommended): Add credits to Helicone and access 100+ models instantly. We manage provider API keys for you at 0% markup.
- Bring Your Own Keys: Connect your own provider API keys for observability-only mode with direct billing.
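In bring-your-own-keys mode, integration usually amounts to pointing your existing OpenAI-compatible client at Helicone's proxy and adding an auth header. A dependency-free sketch of the request shape (the proxy URL and header names reflect commonly documented usage but should be verified against the quickstart; both keys are placeholders):

```python
import json
import urllib.request

def helicone_chat_request(provider_key: str, helicone_key: str, payload: dict) -> urllib.request.Request:
    """Build (but don't send) a chat request routed through Helicone's proxy."""
    return urllib.request.Request(
        url="https://oai.helicone.ai/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {provider_key}",  # your provider API key
            "Helicone-Auth": f"Bearer {helicone_key}",  # your Helicone API key
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = helicone_chat_request(
    "sk-provider-placeholder",
    "sk-helicone-placeholder",
    {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "hi"}]},
)
# urllib.request.urlopen(req) would send it; omitted here.
```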
Architecture
Helicone is built on five main components.

Integration Time: 2 Minutes
Get started in three lines of code.

Why Developers Choose Helicone
0% Markup: Pay exactly what providers charge. No hidden fees, no games.
50ms Overhead: Edge deployment keeps latency invisible to your users.
Open Source: Apache 2.0 licensed. Self-host with Docker or Helm.
Unlimited Logs: Never pay per request. Store unlimited history.
Drop-in Integration: Works with existing OpenAI SDK code. No refactoring.
Active Community: Join 2,000+ developers building with Helicone.
Ready to Start?
Get Started in 2 Minutes
Follow our quickstart guide to send your first request and see it logged in the dashboard.