What is Helicone?

Helicone is an open-source platform that combines LLM observability with an AI Gateway to help developers build reliable, cost-effective AI applications. We solve the core challenges of production LLM development: provider outages, unpredictable costs, and complex debugging.

AI Gateway

Access 100+ LLM models through one unified OpenAI-compatible API with automatic fallbacks and 0% markup

Full Observability

Monitor every request with detailed traces, costs, latency, and errors across your entire LLM stack

Prompt Management

Version, test, and deploy prompts without code changes using production data

Enterprise Ready

SOC 2 compliant, GDPR ready, with self-hosting options for complete data control

Key Features

Helicone provides everything you need to build production-grade LLM applications:

Unified AI Gateway

Access 100+ models from OpenAI, Anthropic, Google Gemini, AWS Bedrock, Groq, and more through a single OpenAI-compatible API. Switch providers by changing the model name—no code refactoring needed.
Supported Providers:
  • OpenAI (GPT-4o, GPT-4o-mini, o1, o3)
  • Anthropic (Claude 4.5 Sonnet, Claude Opus, Claude Haiku)
  • Google (Gemini 2.0 Flash, Gemini Pro)
  • AWS Bedrock, Azure OpenAI, Groq, Together AI, Anyscale, and 20+ more
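
Because every provider sits behind the same OpenAI-compatible API, "switching providers" really is just changing the model string. A minimal sketch (the model identifiers here are illustrative; check the model list for exact names):

```typescript
// The request shape stays identical across providers;
// only the model string changes.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

function buildChatRequest(model: string, messages: ChatMessage[]) {
  return { model, messages };
}

const messages: ChatMessage[] = [{ role: "user", content: "Hello!" }];

// Same messages, two different providers -- no other code changes.
const openaiRequest = buildChatRequest("gpt-4o-mini", messages);
const anthropicRequest = buildChatRequest("claude-3-5-haiku", messages);
```

Either request object can be passed to `client.chat.completions.create(...)` unchanged.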

Intelligent Routing & Reliability

Automatic Fallbacks

Configure fallback chains that route to backup providers when failures occur, keeping your app online during outages.

Rate Limit Management

Set custom rate limits per user, endpoint, or any other dimension to prevent abuse and control costs.

Response Caching

Cache identical requests with exact or semantic matching to cut latency and reduce costs by up to 99%.

LLM Security

Built-in prompt injection detection, PII redaction, and content moderation.
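
To make the fallback idea concrete, here is a client-side sketch of a fallback chain: try each model in order and fall through to the next on failure. `callModel` is a placeholder standing in for a real gateway call, and the model names are illustrative:

```typescript
// Placeholder for a real gateway call; here it simulates an outage
// on the primary model so the fallback path is exercised.
async function callModel(model: string, prompt: string): Promise<string> {
  if (model === "primary-model") throw new Error("provider outage");
  return `${model}: response to "${prompt}"`;
}

// Try each model in order; rethrow the last error if all fail.
async function withFallbacks(models: string[], prompt: string): Promise<string> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return await callModel(model, prompt);
    } catch (err) {
      lastError = err; // this provider failed; try the next in the chain
    }
  }
  throw lastError;
}
```

With the gateway, this loop happens server-side: `withFallbacks(["primary-model", "backup-model"], "Hello")` resolves from the backup when the primary is down, and your application code never sees the outage.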

Complete Observability

Every request through Helicone is automatically logged with full context—no additional instrumentation required.
What you get out of the box:
  • Request/response body inspection
  • Cost tracking across all providers with our 300+ model pricing database
  • Latency metrics (total time, time to first token, tokens per second)
  • Error tracking and debugging
  • Custom properties for filtering by user, feature, environment, etc.
  • Session tracing for multi-step AI workflows
  • Token usage analytics
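
Custom properties are typically attached as request headers so every logged request can be filtered by user, feature, or environment. A sketch of building such headers; the `Helicone-Property-*` naming convention is an assumption here, so confirm the exact header format in the docs:

```typescript
// Turn a map of custom properties into per-request headers,
// one "Helicone-Property-<Name>" header per property.
function heliconePropertyHeaders(
  props: Record<string, string>
): Record<string, string> {
  const headers: Record<string, string> = {};
  for (const [key, value] of Object.entries(props)) {
    headers[`Helicone-Property-${key}`] = value;
  }
  return headers;
}

const headers = heliconePropertyHeaders({
  UserId: "user-123",
  Feature: "chat",
  Environment: "prod",
});
```

These headers can be passed per-request or set as default headers on the client, and the values then show up as filterable dimensions in the dashboard.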

Advanced Debugging Tools

Session Trees: Visualize complex AI agent workflows across multiple LLM calls. Trace the exact path of execution to find where things break.

Request Search: Find any request in milliseconds with powerful filters across 10+ dimensions, including cost, latency, model, user, and custom properties.

Playground: Test prompts, replay requests, and compare model outputs side by side directly in the UI.

Prompt Management

Deploy prompt changes instantly without code deployments:
  • Version control for prompts with full history
  • A/B testing and experimentation
  • Environment-based deployments (dev, staging, prod)
  • Template variables and dynamic prompt compilation
  • Rollback to any previous version instantly
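
The core of template variables is simple string compilation at request time. A sketch, assuming a `{{name}}` placeholder syntax (illustrative, not necessarily Helicone's exact template format):

```typescript
// Replace {{name}} placeholders with runtime values.
// Unknown placeholders are left untouched rather than dropped.
function compilePrompt(
  template: string,
  vars: Record<string, string>
): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match, name) =>
    name in vars ? vars[name] : match
  );
}

const compiled = compilePrompt(
  "Summarize the following for {{audience}}: {{text}}",
  { audience: "executives", text: "Q3 results" }
);
```

Because the template lives on the platform rather than in your codebase, editing it (or rolling it back) changes what gets compiled without a code deployment.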

How It Works

Helicone operates as a transparent proxy between your application and LLM providers. There are two integration options:
  1. AI Gateway with Credits (Recommended): Add credits to Helicone and access 100+ models instantly. We manage provider API keys for you at 0% markup.
  2. Bring Your Own Keys: Connect your own provider API keys for observability-only mode with direct billing.
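
The two options differ only in client configuration. A sketch of each; the BYOK proxy hostname and `Helicone-Auth` header are assumptions based on the proxy pattern described above, so check the provider-specific integration docs for exact values:

```typescript
// Option 1: AI Gateway with credits -- one Helicone key,
// Helicone manages the provider keys.
const gatewayConfig = {
  baseURL: "https://ai-gateway.helicone.ai",
  apiKey: process.env.HELICONE_API_KEY ?? "",
};

// Option 2: Bring your own keys -- your provider key for billing,
// plus a Helicone-Auth header so requests are logged.
const byokConfig = {
  baseURL: "https://oai.helicone.ai/v1", // assumed proxy hostname
  apiKey: process.env.OPENAI_API_KEY ?? "",
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY ?? ""}`,
  },
};
```

Either object can be passed straight to the OpenAI SDK constructor (`new OpenAI(gatewayConfig)`), which is what makes the integration a drop-in change.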

Architecture

Helicone is built on five main components:
  1. Web (Next.js): Frontend dashboard for visualizing metrics, debugging requests, and managing prompts
  2. Worker (Cloudflare): Edge-deployed proxy that logs requests with less than 50ms of latency overhead
  3. Jawn (Express): API server for collecting logs, serving queries, and managing platform features
  4. Supabase: Authentication and application database for user data and configuration
  5. ClickHouse: High-performance analytics database for querying millions of requests in milliseconds

Integration Time: 2 Minutes

Get started with just a few lines of code:
import { OpenAI } from "openai";

const client = new OpenAI({
  baseURL: "https://ai-gateway.helicone.ai",
  apiKey: process.env.HELICONE_API_KEY,
});

const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }]
});

Why Developers Choose Helicone

0% Markup

Pay exactly what providers charge. No hidden fees, no games.

50ms Overhead

Edge deployment keeps latency invisible to your users.

Open Source

Apache 2.0 licensed. Self-host with Docker or Helm.

Unlimited Logs

Never pay per request. Store unlimited history.

Drop-in Integration

Works with existing OpenAI SDK code. No refactoring.

Active Community

Join 2,000+ developers building with Helicone.

Ready to Start?

Get Started in 2 Minutes

Follow our quickstart guide to send your first request and see it logged in the dashboard

Questions?

Join our Discord community or contact help@helicone.ai for support.