Tracing
A short tutorial — attach a passive Tracer to a chat loop, capture the raw request wire, and turn any captured call into a runnable curl you can replay.
A Tracer records the trajectory of a run for debugging. It's a passive
observer built on the seams the loop already exposes — it rides your onEvent
sink and the provider's wire taps, so the loop never knows it's there and capture
can never change a run. Every captured item becomes one timestamped entry in a
single ordered timeline, which you can fold into a trajectory, render as a
timeline, dump as JSONL — or, the payoff here, turn back into the exact HTTP call
that produced it.
This tutorial starts from the multi-turn chat loop and grows one program across three steps:
- Attach a Tracer as a passive observer on the event seam.
- Capture the request wire — the exact body sent to the model each turn.
- Reconstruct a curl at the end of every turn, ready to replay.
Each step below shows the whole program so far — the lines it adds are highlighted.
Step 1 — Attach a Tracer
tracer.sink is an EventSink: the same shape runAgent's onEvent already
takes. So tracing is just fanning each event to two places — your renderer for
the human, the sink for the record. Nothing about the loop changes. At the end of
each turn, formatTrajectory() folds what it captured into one (action →
observation) pair per model turn:
import { AgentEventType, defineTool, runAgent, SessionMemoryStore, Tracer } from "@open-agent-loops/core";
import type { AgentEvent } from "@open-agent-loops/core";
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { createInterface } from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";
import { z } from "zod";
const apiKey = process.env.LLM_API_KEY;
const modelId = process.env.LLM_MODEL;
if (!apiKey || !modelId) {
console.error("Set LLM_API_KEY and LLM_MODEL (see .env.example).");
process.exit(1);
}
const baseURL = process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1";
const weather = defineTool({
name: "weather",
description: "Get the current weather for a city.",
parameters: z.object({ city: z.string().describe("City to look up.") }),
execute: async ({ city }) => ({ content: `Sunny in ${city}` }),
});
const model = new OpenAICompatibleModel({ apiKey, model: modelId, baseURL, thinking: "on" });
// Stream the run to the human, the same as any chat loop.
function render(e: AgentEvent) {
switch (e.type) {
case AgentEventType.ReasoningDelta:
process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
break;
case AgentEventType.TextDelta:
process.stdout.write(e.text);
break;
case AgentEventType.ToolStart:
console.log(`\n→ ${e.toolName}(${JSON.stringify(e.args)})`);
break;
case AgentEventType.ToolEnd:
console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
break;
}
}
// A Tracer is a passive observer: `tracer.sink` is an EventSink — the exact
// shape `onEvent` already takes. Fan each event to both (render for the human,
// sink for the record) and the loop is none the wiser.
const tracer = new Tracer();
const observe = (e: AgentEvent) => { render(e); tracer.sink(e); };
const memory = new SessionMemoryStore(); // one store + one id → one conversation
const sessionId = "tracing";
const rl = createInterface({ input, output });
while (true) {
const prompt = (await rl.question("\nyou › ")).trim();
if (prompt === "" || prompt === "exit") break;
process.stdout.write("bot › ");
await runAgent({ model, memory, sessionId, prompt, tools: [weather], onEvent: observe });
// End of turn: the same trace, folded into (action → observation) pairs.
console.log(`\n${tracer.formatTrajectory()}`);
}
rl.close();bun run examples/tracing-tutorial/step1.ts
you › what's the weather in Paris?You get the model's reasoning and answer streamed live (as always), then a folded
summary of the turn: the assistant's decision and the tool calls it produced, with
timings. Because memory and sessionId are reused, the trajectory grows turn by
turn — the next step scopes the output to just the latest turn.
Other views of the same trace
The trajectory is one projection. format() renders the full timeline one entry
per line; toJSONL() emits one JSON object per line for storage or tooling;
toJSON() bundles the timeline with the run's metadata. disclosure() shows how
the tool surface and context window changed turn to turn. They all read the same
entries — pick the lens you need.
Step 2 — Capture the Request Wire
The trajectory is reconstructed from the loop's events — enough to see what
happened, but not the bytes actually sent. To replay a call you need the wire
body. onRawRequest is the request-side twin of onRawSSE: it hands the Tracer
the fully assembled request — messages (system folded in, every tool_calls
block and tool result), tools, and sampling params — once per model turn.
onRequest is a lightweight summary that seeds the model id and baseURL into
tracer.meta. The highlighted lines wire both taps and report what each turn
captured:
import { AgentEventType, defineTool, runAgent, SessionMemoryStore, Tracer } from "@open-agent-loops/core";
import type { AgentEvent } from "@open-agent-loops/core";
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { createInterface } from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";
import { z } from "zod";
const apiKey = process.env.LLM_API_KEY;
const modelId = process.env.LLM_MODEL;
if (!apiKey || !modelId) {
console.error("Set LLM_API_KEY and LLM_MODEL (see .env.example).");
process.exit(1);
}
const baseURL = process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1";
const weather = defineTool({
name: "weather",
description: "Get the current weather for a city.",
parameters: z.object({ city: z.string().describe("City to look up.") }),
execute: async ({ city }) => ({ content: `Sunny in ${city}` }),
});
const tracer = new Tracer();
// The request-side taps. `onRawRequest` is the twin of `onRawSSE`: it hands over
// the exact JSON the SDK POSTs, once per model turn. `onRequest` is a lightweight
// summary — here it seeds the model id and baseURL into `tracer.meta`.
const model = new OpenAICompatibleModel({
apiKey,
model: modelId,
baseURL,
thinking: "on",
onRawRequest: tracer.onRawRequest,
onRequest: tracer.onRequest, // model + baseURL → tracer.meta
});
function render(e: AgentEvent) {
switch (e.type) {
case AgentEventType.ReasoningDelta:
process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
break;
case AgentEventType.TextDelta:
process.stdout.write(e.text);
break;
case AgentEventType.ToolStart:
console.log(`\n→ ${e.toolName}(${JSON.stringify(e.args)})`);
break;
case AgentEventType.ToolEnd:
console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
break;
}
}
const observe = (e: AgentEvent) => { render(e); tracer.sink(e); };
const memory = new SessionMemoryStore();
const sessionId = "tracing";
const rl = createInterface({ input, output });
while (true) {
const prompt = (await rl.question("\nyou › ")).trim();
if (prompt === "" || prompt === "exit") break;
const before = tracer.requests().length; // remember where this turn starts
process.stdout.write("bot › ");
await runAgent({ model, memory, sessionId, prompt, tools: [weather], onEvent: observe });
// End of turn: `requests()` is the structured read of the request wire — one
// body per model turn, with the full tool-call history (it owns the
// `request_body` filter and unwraps each body, so we don't touch `entries`).
// `slice(before)` keeps just this turn's; one user message can be several turns.
const requests = tracer.requests().slice(before);
const last = requests.at(-1) as { messages?: unknown[] } | undefined;
console.log(`\n# ${requests.length} request(s) captured · ${last?.messages?.length ?? 0} messages`);
}
rl.close();bun run examples/tracing-tutorial/step2.ts
you › what's the weather in Paris and Tokyo?One user message can be several model turns — a turn that calls weather
twice, then a turn that answers — so you'll often see two request bodies captured,
the second carrying the first's tool calls and their results. tracer.requests()
is the structured read of those bodies, in turn order — it owns the request_body
filter and unwraps each body, so you never touch tracer.entries. Slicing from
the count we saved before the turn gives just this turn's.
Both directions of the wire
onRawRequest captures the request; onRawSSE captures the raw response lines
the server streams back. Together they're the complete wire. Neither tap sees
HTTP headers, so the API key is never captured — only the request and
response bodies. Add tracer.observe(model) for a third, finer grain: the
parsed StreamEvents and a per-turn disclosure snapshot.
Step 3 — A Curl at the End of Each Turn
The body is already the whole -d payload; the only things missing from a
runnable command are the URL (in tracer.meta.baseURL) and the auth header (an
env placeholder). tracer.curls() stitches them together — one runnable command
per captured request. The highlighted lines print a curl for every request the
turn made:
import { AgentEventType, defineTool, runAgent, SessionMemoryStore, Tracer } from "@open-agent-loops/core";
import type { AgentEvent } from "@open-agent-loops/core";
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { createInterface } from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";
import { z } from "zod";
const apiKey = process.env.LLM_API_KEY;
const modelId = process.env.LLM_MODEL;
if (!apiKey || !modelId) {
console.error("Set LLM_API_KEY and LLM_MODEL (see .env.example).");
process.exit(1);
}
const baseURL = process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1";
const weather = defineTool({
name: "weather",
description: "Get the current weather for a city.",
parameters: z.object({ city: z.string().describe("City to look up.") }),
execute: async ({ city }) => ({ content: `Sunny in ${city}` }),
});
const tracer = new Tracer();
const model = new OpenAICompatibleModel({
apiKey,
model: modelId,
baseURL,
thinking: "on",
onRawRequest: tracer.onRawRequest,
onRequest: tracer.onRequest,
});
function render(e: AgentEvent) {
switch (e.type) {
case AgentEventType.ReasoningDelta:
process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
break;
case AgentEventType.TextDelta:
process.stdout.write(e.text);
break;
case AgentEventType.ToolStart:
console.log(`\n→ ${e.toolName}(${JSON.stringify(e.args)})`);
break;
case AgentEventType.ToolEnd:
console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
break;
}
}
const observe = (e: AgentEvent) => { render(e); tracer.sink(e); };
const memory = new SessionMemoryStore();
const sessionId = "tracing";
const rl = createInterface({ input, output });
while (true) {
const prompt = (await rl.question("\nyou › ")).trim();
if (prompt === "" || prompt === "exit") break;
const before = tracer.requests().length;
process.stdout.write("bot › ");
await runAgent({ model, memory, sessionId, prompt, tools: [weather], onEvent: observe });
// End of turn: `curls()` renders each captured body as a runnable curl. It owns
// the wire plumbing — filtering the request bodies and stitching in
// `meta.baseURL` — and keeps the key a `$LLM_API_KEY` placeholder (never
// captured). `stream: false` gives a single readable JSON response on replay;
// `slice(before)` keeps just this turn's. Paste any to reproduce that exact call.
const curls = tracer.curls({ apiKeyEnv: "LLM_API_KEY", stream: false }).slice(before);
console.log(`\n# ${curls.length} request(s) this turn — replay any of them:\n`);
for (const curl of curls) {
console.log(curl);
console.log();
}
}
rl.close();bun run examples/tracing-tutorial/step3.ts
you › what's the weather in Paris and Tokyo?Each printed command reproduces one model turn exactly — paste it into a terminal
(with LLM_API_KEY set) and you'll get the same call the agent made, tool-call
history and all. The key stays a $LLM_API_KEY placeholder, so the output is safe
to copy into a bug report. stream: false flips the body to a single JSON
response so a hand-run is easy to read; drop it to stream SSE back.
Readable by default, quoting handled
curls() is sugar over the toCurl building block, forwarding its options. It
pretty-prints the JSON body so the command is easy to read — single quotes
preserve the newlines, so it stays runnable (pass pretty: false for a compact
one-liner). It also escapes any embedded apostrophe as '\'', so message
content like "what's the weather" survives the shell intact. Reach for
toCurl directly for a custom request path; for very large multi-turn bodies,
prefer writing the body to a file and using -d @body.json over an inline payload.
Other timelines: choosing a grain
The tutorial captured the run at two grains: the loop's events
(tracer.sink) and the raw request wire (onRawRequest). There's a third —
the raw response wire (onRawSSE) — and one format() renders any of them.
Every entry carries a source, and format({ sources }) filters the timeline to
just the grains you ask for:
source | wired by | the timeline shows |
|---|---|---|
agent | tracer.sink | turns, messages, tool calls, and the streamed reasoning/text deltas |
model | onRawRequest | the exact request body POSTed each turn — the bytes out |
sse | onRawSSE | the raw data: {…} lines the server streams back — the bytes in |
One subtlety worth calling out, because it surprises people: the reasoning and
text deltas in the agent timeline are the loop's parse of the response
stream. They arrive token by token, so they look SSE-ish — but they are not the
wire. The wire is the sse grain: the literal data: {…} lines onRawSSE taps.
So to see the over-the-wire timeline — each request interleaved with the raw SSE that answered it — wire both raw taps and filter to the two wire grains:
const model = new OpenAICompatibleModel({
// ...
onRawRequest: tracer.onRawRequest, // request wire (out)
onRawSSE: tracer.onRawSSE, // response wire (in)
onRequest: tracer.onRequest, // run config → meta (model, baseURL)
});
// ...after the run, the request + raw SSE, in arrival order:
console.log(tracer.format({ sources: ["model", "sse"] }));tracer.observe(model) also records under the model source — the parsed
StreamEvents and a per-turn disclosure snapshot — a grain that sits between the
agent events and the raw wire. Two runnable scripts sit beside the tutorial:
examples/trace-timeline— the full agent-grain timeline viaformat(), plustoJSONL()written to disk.examples/trace-wire— the raw over-the-wire view: both raw taps, the interleaved request + SSE timeline, and the exact request bytes per turn.
format() truncates; the JSON doesn't
format() clips every value to maxValueLength (default 80), so request_body
rows show a compact body msgs=N tools=M marker and long SSE lines are cut.
Raise maxValueLength, narrow with sources, or read the bytes verbatim from
toJSON() / toJSONL() when you need them in full.
Recap
Starting from a plain chat loop, a few lines at a time you attached a passive
Tracer on the existing event seam, captured the raw request wire with
onRawRequest, and turned each captured body into a runnable curl with
tracer.curls(). The loop never changed — the Tracer only ever observes.
From here:
- Bring Your Own Model Client — where the
onRawRequest/onRawSSE/onRequesttaps live, and how to point the provider at any OpenAI-compatible endpoint. - Bring Your Own Front End goes deeper on the
onEventseam the Tracer rides. - The API reference has exact signatures for
Tracer,toCurl, and every trace view (format,trajectory,disclosure,toJSON/toJSONL).