Learn how to track costs across users, features, and environments to understand your AI application’s unit economics and identify optimization opportunities.

What You’ll Learn

How to:
  • Track costs per user and feature
  • Set up cost alerts before budget overruns
  • Enable caching to reduce redundant API costs
  • Analyze cost trends over time

Prerequisites

  • Helicone API key (get one here)
  • An LLM application with API calls
  • 5 minutes to implement tracking

Step 1: Add Cost Tracking Headers

Start by tagging your requests with metadata for cost segmentation.
import { OpenAI } from "openai";

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://oai.helicone.ai/v1",
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
  },
});

// Track cost by user and feature
const response = await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Hello!" }],
  },
  {
    headers: {
      "Helicone-User-Id": "user-123",
      "Helicone-Property-Feature": "chat",
      "Helicone-Property-Environment": "production",
      "Helicone-Property-UserTier": "premium",
    },
  }
);
Key Headers:
  • Helicone-User-Id: Track costs per user for unit economics
  • Helicone-Property-Feature: Identify which features drive costs
  • Helicone-Property-Environment: Separate dev/staging/production costs
  • Helicone-Property-UserTier: Compare free vs. paid user costs
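To avoid repeating these headers by hand on every call, the tagging can be wrapped in a small helper. The sketch below is one possible shape, not part of any Helicone SDK: `CostContext` and `buildCostHeaders` are hypothetical names, and the helper assumes that any extra custom properties should be sent as `Helicone-Property-[Name]` headers with string values.

```typescript
// Hypothetical helper (not a Helicone API): builds the cost tracking headers
// described above from a single context object.
interface CostContext {
  userId: string;
  feature: string;
  environment: string;
  // Optional extra custom properties, e.g. { UserTier: "premium" }
  properties?: Record<string, string | number | boolean>;
}

function buildCostHeaders(ctx: CostContext): Record<string, string> {
  const headers: Record<string, string> = {
    "Helicone-User-Id": ctx.userId,
    "Helicone-Property-Feature": ctx.feature,
    "Helicone-Property-Environment": ctx.environment,
  };
  const extra = ctx.properties ?? {};
  for (const key of Object.keys(extra)) {
    // Header values are strings, so coerce numbers and booleans explicitly.
    headers[`Helicone-Property-${key}`] = String(extra[key]);
  }
  return headers;
}

// Example: the same headers used in the request above.
const exampleCostHeaders = buildCostHeaders({
  userId: "user-123",
  feature: "chat",
  environment: "production",
  properties: { UserTier: "premium" },
});
```

The result can be passed as the `headers` option of `client.chat.completions.create`, which keeps property names consistent across features.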

Step 2: Organize Multi-Step Workflows

For complex workflows (like AI agents), use sessions to track the total cost of completing a task.
import { randomUUID } from "crypto";

const sessionId = randomUUID();

// Initial question
await client.chat.completions.create(
  {
    model: "gpt-4o",
    messages: [{ role: "user", content: "Summarize this document..." }],
  },
  {
    headers: {
      "Helicone-Session-Id": sessionId,
      "Helicone-Session-Name": "Document Analysis",
      "Helicone-Session-Path": "/analyze",
      "Helicone-User-Id": "user-123",
      "Helicone-Property-Feature": "document-analysis",
    },
  }
);

// Follow-up analysis
await client.chat.completions.create(
  {
    model: "gpt-4o",
    messages: [{ role: "user", content: "Extract key points..." }],
  },
  {
    headers: {
      "Helicone-Session-Id": sessionId, // Same session ID
      "Helicone-Session-Name": "Document Analysis",
      "Helicone-Session-Path": "/analyze/extract",
      "Helicone-User-Id": "user-123",
      "Helicone-Property-Feature": "document-analysis",
    },
  }
);
Sessions roll up every request in a workflow, so you see the total cost of completing a task. This surfaces insights like “document analysis costs $0.45 on average” instead of a list of individual API calls.
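To make the "average cost per session" idea concrete, here is a minimal sketch of the arithmetic, assuming you already have per-request cost records. The `RequestCost` shape and the `averageSessionCost` function are hypothetical and only illustrate the roll-up; the dashboard performs the equivalent aggregation for you.

```typescript
// Hypothetical record shape: one entry per logged request.
interface RequestCost {
  sessionId: string;
  sessionName: string;
  costUsd: number;
}

// Sums cost per session, then averages those totals per session name.
function averageSessionCost(requests: RequestCost[]): Record<string, number> {
  const sessionTotals: Record<string, { name: string; total: number }> = {};
  for (const r of requests) {
    const entry = sessionTotals[r.sessionId] ?? { name: r.sessionName, total: 0 };
    entry.total += r.costUsd;
    sessionTotals[r.sessionId] = entry;
  }

  const byName: Record<string, { sum: number; count: number }> = {};
  for (const id of Object.keys(sessionTotals)) {
    const session = sessionTotals[id];
    const agg = byName[session.name] ?? { sum: 0, count: 0 };
    agg.sum += session.total;
    agg.count += 1;
    byName[session.name] = agg;
  }

  const averages: Record<string, number> = {};
  for (const name of Object.keys(byName)) {
    averages[name] = byName[name].sum / byName[name].count;
  }
  return averages;
}
```

Two sessions named "Document Analysis" costing $0.50 and $0.40 in total would average to $0.45, regardless of how many individual requests each session made.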

Step 3: View Cost Analytics

1. Dashboard Overview

Navigate to your Helicone dashboard to see:
  • Total costs (today, this week, this month)
  • Cost trends over time
  • Top cost-driving models and features
  • Cost per user breakdown
2. Filter by Properties

Use the filters to segment costs:
Filter by Property: Feature = "document-analysis"
Result: $127 spent on document analysis this week

Filter by Property: Environment = "development"
Result: $43 spent on development testing

Filter by Property: UserTier = "premium"
Result: Premium users generate $1,200 in value vs. $380 in costs
3. Session Cost Analysis

View Sessions to see:
  • Average cost per workflow type
  • Cost distribution across steps
  • Expensive outliers to investigate

Step 4: Set Up Cost Alerts

Prevent budget overruns before they happen.
1. Navigate to Alerts

Go to Settings → Alerts in your dashboard.
2. Create Cost Alert

  1. Click “Create Alert”
  2. Select Cost as the metric
  3. Set your threshold (e.g., $100/day)
  4. Choose time window (e.g., 1 day)
  5. Add filters (optional):
    • Environment = “production” (exclude dev costs)
    • Feature = “document-analysis” (monitor specific features)
3. Configure Notifications

Add notification channels:
  • Email: finance@company.com
  • Slack: #alerts channel
Recommended alert structure:
  • Daily alert at 80% of budget (warning)
  • Daily alert at 100% of budget (critical)
  • Separate alerts for production vs. development
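Alert thresholds are entered as dollar amounts, so it can help to derive them from a single budget figure. The sketch below assumes a daily budget and the 80% / 100% split recommended above; `alertThresholds` is a hypothetical helper, and the "warning" / "critical" labels are only for your own bookkeeping.

```typescript
interface AlertThreshold {
  label: "warning" | "critical";
  percent: number;
  amountUsd: number;
}

// Converts budget percentages into dollar thresholds, rounded to the cent.
function alertThresholds(
  dailyBudgetUsd: number,
  percents: number[] = [80, 100]
): AlertThreshold[] {
  return percents.map((percent): AlertThreshold => ({
    label: percent >= 100 ? "critical" : "warning",
    percent,
    // budget * percent is the amount in cents; divide by 100 for dollars.
    amountUsd: Math.round(dailyBudgetUsd * percent) / 100,
  }));
}

// A $100/day budget yields an $80 warning and a $100 critical threshold.
const exampleThresholds = alertThresholds(100);
```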

Step 5: Enable Caching for Cost Reduction

Cache repetitive requests to eliminate redundant API costs.
// Enable caching for FAQ responses
await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [
      { role: "system", content: "Answer FAQ questions" },
      { role: "user", content: "What are your business hours?" }
    ],
  },
  {
    headers: {
      "Helicone-Cache-Enabled": "true",
      "Cache-Control": "max-age=86400", // 24 hours
      "Helicone-Property-Feature": "faq",
    },
  }
);

// Second identical request = $0 cost (cached)
await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [
      { role: "system", content: "Answer FAQ questions" },
      { role: "user", content: "What are your business hours?" }
    ],
  },
  {
    headers: {
      "Helicone-Cache-Enabled": "true",
      "Cache-Control": "max-age=86400",
      "Helicone-Property-Feature": "faq",
    },
  }
);
Best caching opportunities:
  • FAQ and support responses
  • Static content generation
  • Development/testing environments
  • Repeated queries with identical inputs
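Before enabling caching for a feature, you can estimate how much of its traffic repeats. The sketch below counts exact duplicates in a sample of past requests. It is an offline approximation under stated assumptions: `estimateCacheHitRate` is a hypothetical function, the key is simply the serialized model and messages (not Helicone's actual cache key), and cache expiry is ignored.

```typescript
interface ChatRequest {
  model: string;
  messages: { role: string; content: string }[];
}

// Returns the fraction of requests that exactly repeat an earlier request.
function estimateCacheHitRate(requests: ChatRequest[]): number {
  if (requests.length === 0) return 0;
  const seen: Record<string, boolean> = {};
  let repeats = 0;
  for (const req of requests) {
    // Simplified key: identical model and identical message list.
    const key = JSON.stringify([req.model, req.messages]);
    if (seen[key]) {
      repeats += 1;
    } else {
      seen[key] = true;
    }
  }
  return repeats / requests.length;
}
```

A feature where this estimate is close to zero (for example, free-form chat) is unlikely to benefit from caching, while FAQ-style traffic usually scores much higher.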

Expected Results

After implementing cost tracking:

Week 1

Total Costs: $487
├── Production: $423 (87%)
│   ├── chat: $245 (58%)
│   ├── document-analysis: $127 (30%)
│   └── search: $51 (12%)
└── Development: $64 (13%)

Top 5 Users by Cost:
1. user-789: $42.50 (premium tier)
2. user-456: $38.20 (premium tier)
3. user-123: $31.80 (free tier)
4. user-234: $28.90 (free tier)
5. user-567: $24.10 (premium tier)

Cache Performance:
- Hit rate: 23%
- Savings: $112

Insights

  • Premium users cost $35/month average, generate $120 value (3.4x ROI)
  • Free users cost $28/month, which is unsustainable without limits
  • Document analysis is the most expensive feature at $0.45/session
  • Caching FAQ responses saved $112 (23% hit rate)
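The percentages and the ROI figure in this example report follow from simple ratios. As a worked check, the sketch below reproduces them from the dollar amounts shown above; `percentOf` and `roi` are hypothetical helper names.

```typescript
// Share of a total, rounded to a whole percent.
function percentOf(partUsd: number, totalUsd: number): number {
  return Math.round((partUsd / totalUsd) * 100);
}

// Value generated per dollar of cost, rounded to one decimal place.
function roi(valueUsd: number, costUsd: number): number {
  return Math.round((valueUsd / costUsd) * 10) / 10;
}

console.log(percentOf(423, 487)); // 87 (production share of total)
console.log(percentOf(245, 423)); // 58 (chat share of production)
console.log(percentOf(64, 487));  // 13 (development share of total)
console.log(roi(120, 35));        // 3.4 (premium value vs. cost)
```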

Step 6: Analyze and Optimize

1. Identify Cost Drivers

Look for:
  • High-cost users to potentially upgrade or limit
  • Features with poor cost-to-value ratios
  • Unexpected development environment costs
  • Cache opportunities (repeated similar requests)
2. Take Action

Based on insights:
// Add rate limiting for free tier users.
// userTier and monthlyCost are placeholders for values from your own user and usage records.
if (userTier === "free" && monthlyCost > 25) {
  throw new Error("Monthly limit reached. Upgrade to premium.");
}
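If you need limits for more than one tier, the same check can be driven by a lookup table. This is a minimal sketch under stated assumptions: the `UserTier` type, the `MONTHLY_CAP_USD` values, and both function names are hypothetical, and `monthlyCostUsd` is assumed to come from your own usage records.

```typescript
type UserTier = "free" | "premium";

// Hypothetical monthly caps in USD; null means the tier has no cap.
const MONTHLY_CAP_USD: Record<UserTier, number | null> = {
  free: 25,
  premium: null,
};

// True when the user's spend this month exceeds the cap for their tier.
function isOverMonthlyCap(userTier: UserTier, monthlyCostUsd: number): boolean {
  const cap = MONTHLY_CAP_USD[userTier];
  return cap !== null && monthlyCostUsd > cap;
}

// Call before making an LLM request; throws when the cap is exceeded.
function assertWithinCap(userTier: UserTier, monthlyCostUsd: number): void {
  if (isOverMonthlyCap(userTier, monthlyCostUsd)) {
    throw new Error("Monthly limit reached. Upgrade to premium.");
  }
}
```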
3. Monitor Impact

Track changes over time:
  • Did rate limiting reduce free tier costs?
  • Is model switching maintaining quality?
  • What’s the new cache hit rate?

Advanced: Query Costs Programmatically

Use the Helicone API to build custom cost dashboards:
const response = await fetch(
  "https://api.helicone.ai/v1/request/query-clickhouse",
  {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.HELICONE_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      filter: {
        request_response_rmt: {
          request_created_at: {
            gte: "2024-01-01T00:00:00Z"
          },
          properties: {
            UserTier: { equals: "premium" }
          }
        }
      },
      limit: 1000,
    }),
  }
);

const data = await response.json();
const totalCost = data.data.reduce(
  (sum, req) => sum + (req.cost_usd || 0), 
  0
);

console.log(`Premium user costs: $${totalCost.toFixed(2)}`);
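Once the rows are fetched, the same data can feed a "top users by cost" view like the one in the example report. The sketch below assumes each row carries a `user_id` field next to the `cost_usd` field used above; that field name and the `topUsersByCost` helper are assumptions for illustration, so check them against the actual response before relying on them.

```typescript
// Assumed row shape: only the fields this sketch needs.
interface CostRow {
  user_id: string;
  cost_usd?: number | null;
}

// Sums cost per user and returns the n most expensive users, highest first.
function topUsersByCost(
  rows: CostRow[],
  n: number
): { userId: string; totalUsd: number }[] {
  const totals: Record<string, number> = {};
  for (const row of rows) {
    totals[row.user_id] = (totals[row.user_id] ?? 0) + (row.cost_usd ?? 0);
  }
  return Object.keys(totals)
    .map((userId) => ({ userId, totalUsd: totals[userId] ?? 0 }))
    .sort((a, b) => b.totalUsd - a.totalUsd)
    .slice(0, n);
}
```

Keep in mind that the query above is limited to 1,000 rows, so totals computed this way only cover the rows that were returned.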

Best Practices

  • Start with high-level tracking: Add User ID, Feature, and Environment headers to all requests
  • Use sessions for complex workflows: Group related requests to see true unit costs
  • Set graduated alerts: 50%, 80%, 95% of budget to catch issues early
  • Don’t over-optimize prematurely: Track for 1-2 weeks to understand patterns before making changes

Troubleshooting

Helicone calculates costs based on model detection:
  • Using AI Gateway: 100% accurate costs
  • Direct integration: Best-effort based on 300+ model pricing
If your model isn’t supported, contact help@helicone.ai to add it.
Properties take a few minutes to appear in filters after first use. Ensure:
  • Header format: Helicone-Property-[Name]
  • Values are strings (not numbers or booleans)
  • Requests are successfully logging (check dashboard)
If an alert is not firing, check:
  • Alert threshold and time window
  • Minimum request count (low traffic may not trigger)
  • Filters (too restrictive may exclude all requests)
  • Notification channels are configured correctly

Next Steps

  • Cost Tracking Guide: In-depth cost optimization strategies
  • User Metrics: Track per-user usage and costs
  • Sessions: Group requests to understand workflow costs
  • Alerts: Configure cost and error alerts