What is Distributed Tracing?
Distributed tracing lets you track complex LLM workflows where one request triggers multiple child requests. Visualize the entire execution tree, understand dependencies, and debug multi-step AI operations.
When to Use Traces
Agent Workflows Track agents that make multiple LLM calls to plan, execute, and reflect
RAG Pipelines Trace embedding generation, retrieval, and final generation steps
Parallel Processing Monitor concurrent LLM calls and their relationships
Complex Chains Debug LangChain, LlamaIndex, or custom chains
Traces vs Sessions
Sessions group related requests chronologically (like chat messages). Traces show parent-child relationships between requests (like function calls).
| Feature | Sessions | Traces |
| --- | --- | --- |
| Relationship | Sequential | Parent-child |
| Use case | Conversations | Workflows |
| Visualization | Timeline | Tree |
| Example | Multi-turn chat | Agent with sub-tasks |
How Tracing Works
Helicone supports OpenTelemetry-style tracing with parent-child relationships:
Parent Request (Main task)
├── Child 1 (Subtask A)
│   ├── Grandchild 1 (Step A.1)
│   └── Grandchild 2 (Step A.2)
└── Child 2 (Subtask B)
    └── Grandchild 3 (Step B.1)
Using Node IDs
Create traces by setting parent-child relationships with the Helicone-Node-Id header:
A unique identifier for this request node.
For root requests: {unique-id}
For child requests: {parent-id}:{child-id} (deeper nodes chain their ancestors' IDs, e.g. {parent-id}:{child-id}:{grandchild-id})
Basic Trace Example
from openai import OpenAI
import uuid

client = OpenAI(
    api_key="YOUR_OPENAI_KEY",
    base_url="https://oai.helicone.ai/v1",
    default_headers={
        "Helicone-Auth": "Bearer YOUR_HELICONE_KEY"
    },
)

# Parent request
parent_id = str(uuid.uuid4())
parent_response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Create a travel plan for Tokyo"}],
    extra_headers={
        "Helicone-Node-Id": parent_id
    },
)

# Child request 1: Research attractions
child1_id = str(uuid.uuid4())
child1_response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What are top attractions in Tokyo?"}],
    extra_headers={
        "Helicone-Node-Id": f"{parent_id}:{child1_id}"
    },
)

# Child request 2: Research restaurants
child2_id = str(uuid.uuid4())
child2_response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What are best restaurants in Tokyo?"}],
    extra_headers={
        "Helicone-Node-Id": f"{parent_id}:{child2_id}"
    },
)

# Grandchild request: Get specific restaurant details
grandchild_id = str(uuid.uuid4())
grandchild_response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me about Sukiyabashi Jiro"}],
    extra_headers={
        "Helicone-Node-Id": f"{parent_id}:{child2_id}:{grandchild_id}"
    },
)
Agent Trace Example
Track a ReAct-style agent:
import uuid
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_OPENAI_KEY",
    base_url="https://oai.helicone.ai/v1",
    default_headers={"Helicone-Auth": "Bearer YOUR_HELICONE_KEY"},
)

def run_agent(task: str):
    agent_id = str(uuid.uuid4())

    # Step 1: Plan
    plan_id = str(uuid.uuid4())
    plan = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": f"Plan how to accomplish: {task}"
        }],
        extra_headers={
            "Helicone-Node-Id": f"{agent_id}:{plan_id}",
            "Helicone-Property-Step": "planning",
        },
    )

    # Step 2: Execute actions (parse_plan is your own helper that
    # splits the plan text into a list of action strings)
    actions = parse_plan(plan.choices[0].message.content)
    for i, action in enumerate(actions):
        action_id = str(uuid.uuid4())
        result = client.chat.completions.create(
            model="gpt-4",
            messages=[{
                "role": "user",
                "content": f"Execute action: {action}"
            }],
            extra_headers={
                "Helicone-Node-Id": f"{agent_id}:{action_id}",
                "Helicone-Property-Step": f"action_{i}",
            },
        )

    # Step 3: Reflect
    reflect_id = str(uuid.uuid4())
    reflection = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": "Review the results and provide final answer"
        }],
        extra_headers={
            "Helicone-Node-Id": f"{agent_id}:{reflect_id}",
            "Helicone-Property-Step": "reflection",
        },
    )
    return reflection

# Run agent
result = run_agent("Research AI safety concerns")
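The agent example calls `parse_plan`, which isn't defined above; a minimal sketch that treats each non-empty line of the model's plan as one action (real parsers might strip numbering or ask the model for structured output instead):

```python
def parse_plan(plan_text: str) -> list[str]:
    """Naive plan parser: each non-empty line becomes one action string."""
    return [line.strip() for line in plan_text.splitlines() if line.strip()]
```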
RAG Pipeline Trace
Trace retrieval-augmented generation:
import uuid
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_OPENAI_KEY",
    base_url="https://oai.helicone.ai/v1",
    default_headers={"Helicone-Auth": "Bearer YOUR_HELICONE_KEY"},
)

def rag_query(query: str):
    trace_id = str(uuid.uuid4())

    # Step 1: Generate an embedding for the query
    embed_id = str(uuid.uuid4())
    embedding = client.embeddings.create(
        model="text-embedding-3-small",
        input=query,
        extra_headers={
            "Helicone-Node-Id": f"{trace_id}:{embed_id}",
            "Helicone-Property-Stage": "embedding",
        },
    )

    # Step 2: Retrieve relevant documents (vector_search is your own
    # retrieval layer, not part of the OpenAI or Helicone SDKs)
    docs = vector_search(embedding.data[0].embedding)

    # Step 3: Generate an answer with the retrieved context
    gen_id = str(uuid.uuid4())
    answer = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Answer using the provided context."},
            {"role": "user", "content": f"Context: {docs}\n\nQuestion: {query}"},
        ],
        extra_headers={
            "Helicone-Node-Id": f"{trace_id}:{gen_id}",
            "Helicone-Property-Stage": "generation",
        },
    )
    return answer

result = rag_query("What is quantum computing?")
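The `vector_search` call in the RAG example stands in for your retrieval layer. A minimal in-memory version using cosine similarity over `(text, embedding)` pairs (the corpus format is an assumption for illustration; production systems would use a vector database):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def vector_search(query_vec: list[float], corpus, top_k: int = 2) -> list[str]:
    """corpus: list of (text, embedding) pairs; returns the top-k texts."""
    ranked = sorted(corpus, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]
```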
Parallel Request Tracing
Trace concurrent requests:
import asyncio
import uuid
from openai import AsyncOpenAI

client = AsyncOpenAI(
    api_key="YOUR_OPENAI_KEY",
    base_url="https://oai.helicone.ai/v1",
    default_headers={"Helicone-Auth": "Bearer YOUR_HELICONE_KEY"},
)

async def parallel_analysis(topic: str):
    parent_id = str(uuid.uuid4())

    # Create multiple parallel analysis tasks sharing one parent node
    tasks = [
        analyze_aspect(parent_id, topic, "technical", "Technical analysis"),
        analyze_aspect(parent_id, topic, "business", "Business analysis"),
        analyze_aspect(parent_id, topic, "ethical", "Ethical analysis"),
    ]
    results = await asyncio.gather(*tasks)
    return results

async def analyze_aspect(parent_id: str, topic: str, aspect: str, prompt: str):
    child_id = str(uuid.uuid4())
    response = await client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": f"{prompt} of {topic}"}],
        extra_headers={
            "Helicone-Node-Id": f"{parent_id}:{child_id}",
            "Helicone-Property-Aspect": aspect,
        },
    )
    return response

# Run parallel analysis
results = asyncio.run(parallel_analysis("AI regulation"))
Custom Trace Logging
For non-OpenAI requests or custom tracing:
// Log custom traces via API
await fetch('https://api.helicone.ai/v1/trace/custom/log', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_HELICONE_KEY',
    'Content-Type': 'application/json',
    'Helicone-Node-Id': `${parentId}:${childId}`
  },
  body: JSON.stringify({
    providerRequest: {
      url: "https://api.anthropic.com/v1/messages",
      json: {
        model: "claude-3-opus-20240229",
        messages: [{ role: "user", content: "Hello" }]
      },
      meta: { 'Helicone-Auth': 'Bearer YOUR_HELICONE_KEY' }
    },
    providerResponse: {
      json: responseData,
      status: 200,
      headers: {}
    },
    timing: {
      startTime: { seconds: startTime, nanos: 0 },
      endTime: { seconds: endTime, nanos: 0 }
    },
    provider: "anthropic"
  })
});
Viewing Traces
Visualize traces in the Helicone dashboard:
Navigate to Requests
Go to the Requests page and find a traced request
View Trace Tree
Click the trace icon to see the full parent-child hierarchy
Analyze Each Node
Click any node to see its request details, cost, and latency
Identify Bottlenecks
Find slow or expensive operations in the trace tree
Trace Metrics
Helicone calculates metrics across traces:
Total Cost : Sum of all nodes in the trace
Total Duration : Time from root start to last leaf completion
Node Count : Number of requests in the trace
Max Depth : Deepest level in the trace tree
Success Rate : Percentage of successful nodes
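Given a flat export of trace nodes, most of these metrics are simple to recompute yourself. A sketch (the `Node` record fields are illustrative assumptions, not Helicone's actual export schema; total duration is omitted since it needs per-node timestamps):

```python
from dataclasses import dataclass

@dataclass
class Node:
    node_id: str   # e.g. "parent:child:grandchild"
    cost: float    # USD cost of this request
    success: bool  # whether the request succeeded

def trace_metrics(nodes: list[Node]) -> dict:
    """Aggregate per-trace metrics from a flat list of nodes.
    Depth falls out of the ID format: one segment per tree level."""
    return {
        "total_cost": sum(n.cost for n in nodes),
        "node_count": len(nodes),
        "max_depth": max(len(n.node_id.split(":")) for n in nodes),
        "success_rate": sum(n.success for n in nodes) / len(nodes),
    }
```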
Best Practices
Generate unique IDs but keep them traceable:

import uuid

parent_id = str(uuid.uuid4())
child_id = str(uuid.uuid4())
node_id = f"{parent_id}:{child_id}"
Add Context with Properties
Use custom properties to annotate trace nodes:

extra_headers = {
    "Helicone-Node-Id": node_id,
    "Helicone-Property-Step": "planning",
    "Helicone-Property-Iteration": "1",
}
Keep traces manageable. Very deep traces (>10 levels) can be hard to visualize and debug.
Use both tracing (for workflow structure) and sessions (for conversation context):

extra_headers = {
    "Helicone-Node-Id": f"{parent_id}:{child_id}",
    "Helicone-Session-Id": session_id,
}
Continue tracing even if some nodes fail. This helps debug failures:

try:
    result = make_llm_call(node_id)
except Exception as e:
    # Log the error but continue the trace
    log_error(node_id, e)
Tracing Integrations
OpenTelemetry Helicone supports OTEL trace format for compatibility with existing instrumentation
LangChain Automatic tracing for LangChain chains and agents
LlamaIndex Trace RAG pipelines and query engines
Custom Frameworks Use custom trace logging API for any framework
Next Steps
Session Tracking Learn about grouping related requests
Custom Properties Add metadata to trace nodes
Request Logging Understand individual request tracking
User Metrics Track traces per user