Proper environment tracking helps you separate dev/staging/production traffic, compare performance across environments, and prevent production issues.
## The Problem
Without environment tracking:
- Development requests pollute production metrics
- Can’t compare staging vs. production performance
- Difficult to test changes before production rollout
- Cost analytics include test/dev spending
- Alerts trigger on development errors
## The Solution
Use Helicone’s custom properties to track environments and segment your data.
## Implementation

### 1. Tag Every Request with Its Environment

Tag every request with its environment:
```typescript
import { OpenAI } from "openai";

const ENV = process.env.NODE_ENV || "development"; // development, staging, production

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://oai.helicone.ai/v1",
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
    // Essential: track the environment
    "Helicone-Property-Environment": ENV,
  },
});

// All requests are automatically tagged with the environment
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});
```
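NODE_ENV values tend to drift across services ("prod", "dev", unset), which would split one environment into several property values. A small normalizer keeps the property consistent before it reaches Helicone; this is a sketch, and the helper name and accepted aliases are illustrative:

```typescript
// normalizeEnv: map loose NODE_ENV spellings onto the three canonical names
// used throughout this guide. Anything unrecognized falls back to development.
type Env = "development" | "staging" | "production";

function normalizeEnv(raw: string | undefined): Env {
  const v = (raw || "").trim().toLowerCase();
  if (v === "prod" || v === "production") return "production";
  if (v === "stage" || v === "staging") return "staging";
  return "development";
}

const ENV = normalizeEnv(process.env.NODE_ENV);
```

Falling back to `development` (rather than throwing) keeps local scripts working while ensuring unknown values never pollute production metrics.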
### 2. Add Version Tracking
Track code version to identify issues:
```typescript
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://oai.helicone.ai/v1",
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
    "Helicone-Property-Environment": ENV,
    // Track version for debugging
    "Helicone-Property-Version": process.env.APP_VERSION || "unknown",
    // Optional: track git commit
    "Helicone-Property-Commit": process.env.GIT_COMMIT || "unknown",
  },
});
```
Version tracking helps identify when issues started and compare performance between releases.
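To keep every client in the codebase tagging requests identically, the tracking headers can be assembled in one place. A sketch; the function name is illustrative, while the header names follow the `Helicone-Property-*` convention used above:

```typescript
// buildTrackingHeaders: one source of truth for the Helicone tracking headers.
function buildTrackingHeaders(opts: {
  heliconeApiKey: string;
  env: string;
  version?: string;
  commit?: string;
}): Record<string, string> {
  return {
    "Helicone-Auth": `Bearer ${opts.heliconeApiKey}`,
    "Helicone-Property-Environment": opts.env,
    "Helicone-Property-Version": opts.version || "unknown",
    "Helicone-Property-Commit": opts.commit || "unknown",
  };
}
```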
### 3. Environment-Specific Configuration
Different settings per environment:
```typescript
interface EnvironmentConfig {
  caching: boolean;
  model: string;
  maxTokens: number;
  alertThreshold: number;
}

const CONFIG: Record<string, EnvironmentConfig> = {
  development: {
    caching: true, // Always cache in dev
    model: "gpt-4o-mini", // Cheaper model for dev
    maxTokens: 500, // Smaller responses
    alertThreshold: 0, // No alerts in dev
  },
  staging: {
    caching: true,
    model: "gpt-4o", // Production model
    maxTokens: 1000,
    alertThreshold: 10, // Relaxed alerts
  },
  production: {
    caching: false, // Fresh responses
    model: "gpt-4o",
    maxTokens: 2000,
    alertThreshold: 5, // Strict alerts
  },
};

const config = CONFIG[ENV] || CONFIG.development;

const response = await client.chat.completions.create(
  {
    model: config.model,
    messages: [...],
    max_tokens: config.maxTokens,
  },
  {
    headers: {
      ...(config.caching && {
        "Helicone-Cache-Enabled": "true",
        "Cache-Control": "max-age=86400",
      }),
    },
  }
);
```
## Dashboard Usage

### Filter by Environment

Apply the property filter `Environment = production`. All metrics now show only production data:
- Total requests
- Cost
- Error rate
- Latency
### Compare Environments

Open multiple browser tabs:
- Tab 1: filter by `Environment = production`
- Tab 2: filter by `Environment = staging`
- Tab 3: filter by `Environment = development`

Compare metrics side by side.
## Environment-Specific Alerts

Create separate alerts per environment.

**Production Alert**
- Name: Production Error Rate
- Metric: Error Rate
- Threshold: > 2% (strict)
- Time Window: 5 minutes
- Filter: `Environment = production`
- Notifications: Slack #production-alerts, PagerDuty

**Staging Alert**
- Name: Staging Error Rate
- Metric: Error Rate
- Threshold: > 10% (relaxed)
- Time Window: 15 minutes
- Filter: `Environment = staging`
- Notifications: Slack #staging-alerts

**Development Alert**
- Name: Development Cost Spike
- Metric: Cost
- Threshold: > $50/day
- Time Window: 1 day
- Filter: `Environment = development`
- Notifications: Email team@company.com
Development usually doesn’t need error alerts, but cost alerts prevent runaway test scripts.
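The per-environment thresholds above reduce to a small check that is easy to unit-test before wiring it into any notifier. A sketch; the table mirrors the alert configs, and `null` means "no error alerts for this environment":

```typescript
// Error-rate thresholds in percent; null disables error alerts (development).
const ERROR_THRESHOLDS: Record<string, number | null> = {
  production: 2,
  staging: 10,
  development: null,
};

// shouldAlert: fire only when the environment has a threshold and exceeds it.
function shouldAlert(env: string, errorRatePct: number): boolean {
  const threshold = ERROR_THRESHOLDS[env];
  return threshold != null && errorRatePct > threshold;
}
```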
## Use Cases

### 1. Pre-Production Testing
Test prompt changes in staging before production:
```typescript
// Deploy the new prompt to staging first
const SYSTEM_PROMPT = ENV === "production"
  ? "You are a helpful assistant." // Old prompt
  : "You are a helpful and concise assistant."; // New prompt

await client.chat.completions.create(
  {
    model: "gpt-4o",
    messages: [
      { role: "system", content: SYSTEM_PROMPT },
      { role: "user", content: "..." },
    ],
  },
  {
    headers: {
      "Helicone-Property-Environment": ENV,
      "Helicone-Property-PromptVersion": ENV === "production" ? "v1" : "v2",
    },
  }
);
```
Compare in Helicone:
- Filter 1: `Environment = staging`, `PromptVersion = v2`
- Filter 2: `Environment = production`, `PromptVersion = v1`
Metrics to compare:
- Response length (tokens)
- User satisfaction (feedback)
- Cost per request
- Latency
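For the latency comparison, a nearest-rank percentile over raw latency samples is enough to compute p50/p95 per filter yourself. A sketch, not Helicone's own aggregation:

```typescript
// percentile: nearest-rank percentile (p in 0..100) over a list of samples.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.min(sorted.length - 1, Math.max(0, rank - 1))];
}
```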
### 2. Cost Tracking by Environment
See where money is being spent:
```typescript
// Query the last 30 days of requests from the Helicone API
const response = await fetch(
  "https://api.helicone.ai/v1/request/query-clickhouse",
  {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.HELICONE_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      filter: {
        request_response_rmt: {
          request_created_at: {
            gte: new Date(Date.now() - 30 * 24 * 60 * 60 * 1000).toISOString(),
          },
        },
      },
    }),
  }
);

const data = await response.json();

// Sum cost per environment using the Environment custom property
const costByEnv = data.data.reduce((acc: any, req: any) => {
  const env = req.properties?.Environment || "unknown";
  acc[env] = (acc[env] || 0) + (req.cost_usd || 0);
  return acc;
}, {});

console.log("Last 30 days:");
console.log(`Production: $${costByEnv.production?.toFixed(2) || 0}`);
console.log(`Staging: $${costByEnv.staging?.toFixed(2) || 0}`);
console.log(`Development: $${costByEnv.development?.toFixed(2) || 0}`);
```
Example output:

```
Last 30 days:
Production: $1,247.50
Staging: $83.20
Development: $156.30
```
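Those totals can feed a quick sanity check on non-production spend. A sketch; the helper is illustrative, not a Helicone feature:

```typescript
// nonProdSpendRatio: staging + development cost relative to production cost.
// Returns Infinity when there is no production spend to compare against.
function nonProdSpendRatio(costByEnv: Record<string, number>): number {
  const prod = costByEnv.production || 0;
  const nonProd = (costByEnv.staging || 0) + (costByEnv.development || 0);
  return prod > 0 ? nonProd / prod : Infinity;
}

const ratio = nonProdSpendRatio({
  production: 1247.5,
  staging: 83.2,
  development: 156.3,
});
// With the example numbers above, non-production spend is roughly 19% of production.
```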
### 3. Debugging Production Issues
When production has issues:
1. **Filter to production.** Set `Environment = production` and the date range to when the issue started.
2. **Check recent deployments.** Compare `Version = v2.5.0` (newly deployed) against `Version = v2.4.0` (the previous version), both filtered to `Environment = production`.
3. **Test the fix in staging.** Deploy the fix with `Environment = staging`, `Version = v2.5.1`, then verify that the error rate drops, no new issues appear, and performance is acceptable.
4. **Roll out to production.** After staging validation, deploy with `Environment = production`, `Version = v2.5.1`, and monitor error rate, latency, and user feedback.
### 4. A/B Testing Across Environments
Test different approaches:
```typescript
// Staging: test the new model
await client.chat.completions.create(
  {
    model: "gpt-4o-mini", // New, cheaper model
    messages: [...],
  },
  {
    headers: {
      "Helicone-Property-Environment": "staging",
      "Helicone-Property-ModelTest": "gpt-4o-mini",
    },
  }
);

// Production: current model
await client.chat.completions.create(
  {
    model: "gpt-4o", // Current model
    messages: [...],
  },
  {
    headers: {
      "Helicone-Property-Environment": "production",
      "Helicone-Property-ModelTest": "gpt-4o",
    },
  }
);

// Compare:
// - Quality (user feedback, scores)
// - Cost (avg per request)
// - Latency (p50, p95, p99)
// - Error rate
```
## Best Practices

- **Use consistent naming**: stick to "development", "staging", "production" across all services.
- **Set the environment at startup**: configure it once when the app starts, not per request.
- **Use different models per environment**: cheaper models in dev/staging reduce costs.
- **Consider separate Helicone projects**: for strict separation, create one project per environment.

Always validate that the `Environment` property is set correctly; requests with a missing environment are hard to filter later.
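A fail-fast check at startup catches a missing or misspelled environment before any untagged requests are logged. A sketch; the helper name is illustrative:

```typescript
const ALLOWED_ENVS = new Set(["development", "staging", "production"]);

// assertValidEnv: throw at startup rather than silently logging
// requests under a wrong or missing Environment property.
function assertValidEnv(env: string | undefined): string {
  if (!env || !ALLOWED_ENVS.has(env)) {
    throw new Error(`Invalid environment: ${env ?? "(unset)"}`);
  }
  return env;
}
```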
## Advanced: Multi-Region Environments
Track both environment and region:
```typescript
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://oai.helicone.ai/v1",
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
    "Helicone-Property-Environment": ENV,
    "Helicone-Property-Region": process.env.AWS_REGION || "us-east-1",
    "Helicone-Property-Datacenter": process.env.DATACENTER || "aws-us-east-1a",
  },
});

// Filter by: Environment = production AND Region = us-west-2
// Compare performance across regions
```
## Example: Complete Setup
```typescript
// config.ts
export interface AppConfig {
  environment: string;
  version: string;
  llm: {
    model: string;
    caching: boolean;
    maxTokens: number;
  };
  monitoring: {
    errorThreshold: number;
    costLimit: number;
  };
}

const configs: Record<string, AppConfig> = {
  development: {
    environment: "development",
    version: process.env.GIT_COMMIT || "dev",
    llm: {
      model: "gpt-4o-mini",
      caching: true,
      maxTokens: 500,
    },
    monitoring: {
      errorThreshold: 0, // No alerts
      costLimit: 50, // $50/day
    },
  },
  staging: {
    environment: "staging",
    version: process.env.GIT_COMMIT || "staging",
    llm: {
      model: "gpt-4o",
      caching: true,
      maxTokens: 1000,
    },
    monitoring: {
      errorThreshold: 10, // 10% errors
      costLimit: 100, // $100/day
    },
  },
  production: {
    environment: "production",
    version: process.env.GIT_COMMIT || "unknown",
    llm: {
      model: "gpt-4o",
      caching: false,
      maxTokens: 2000,
    },
    monitoring: {
      errorThreshold: 2, // 2% errors
      costLimit: 500, // $500/day
    },
  },
};

export const config = configs[process.env.NODE_ENV || "development"];
```

```typescript
// client.ts
import { OpenAI } from "openai";
import { config } from "./config";

export const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://oai.helicone.ai/v1",
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
    "Helicone-Property-Environment": config.environment,
    "Helicone-Property-Version": config.version,
  },
});

export async function chat(messages: any[]) {
  return client.chat.completions.create(
    {
      model: config.llm.model,
      messages,
      max_tokens: config.llm.maxTokens,
    },
    {
      headers: {
        ...(config.llm.caching && {
          "Helicone-Cache-Enabled": "true",
          "Cache-Control": "max-age=3600",
        }),
      },
    }
  );
}
```
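The conditional-spread pattern inside `chat()` can be extracted so the caching behavior is testable on its own. A sketch; the helper name is illustrative:

```typescript
// cacheHeaders: returns the Helicone cache headers when caching is enabled,
// and an empty object otherwise (the same conditional-spread used in chat()).
function cacheHeaders(caching: boolean, maxAgeSeconds: number): Record<string, string> {
  return {
    ...(caching && {
      "Helicone-Cache-Enabled": "true",
      "Cache-Control": `max-age=${maxAgeSeconds}`,
    }),
  };
}
```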
## Next Steps

- **Custom Properties**: learn more about custom properties
- **Production Monitoring**: set up comprehensive production monitoring
- **Cost Tracking**: track costs by environment
- **Alerts**: configure environment-specific alerts