Tracing

A short tutorial — attach a passive Tracer to a chat loop, capture the raw request wire, and turn any captured call into a runnable curl you can replay.

A Tracer records the trajectory of a run for debugging. It's a passive observer built on the seams the loop already exposes — it rides your onEvent sink and the provider's wire taps, so the loop never knows it's there and capture can never change a run. Every captured item becomes one timestamped entry in a single ordered timeline, which you can fold into a trajectory, render as a timeline, dump as JSONL — or, the payoff here, turn back into the exact HTTP call that produced it.

This tutorial starts from the multi-turn chat loop and grows one program across three steps:

Attach a Tracer as a passive observer on the event seam.
Capture the request wire — the exact body sent to the model each turn.
Reconstruct a curl at the end of every turn, ready to replay.

Each step below shows the whole program so far — the lines it adds are highlighted.

Step 1 — Attach a Tracer

tracer.sink is an EventSink: the same shape runAgent's onEvent already takes. So tracing is just fanning each event to two places — your renderer for the human, the sink for the record. Nothing about the loop changes. At the end of each turn, formatTrajectory() folds what it captured into one (action → observation) pair per model turn:

examples/tracing-tutorial/step1.ts

import { AgentEventType, defineTool, runAgent, SessionMemoryStore, Tracer } from "@open-agent-loops/core"; 
import type { AgentEvent } from "@open-agent-loops/core";
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { createInterface } from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";
import { z } from "zod";

const apiKey = process.env.LLM_API_KEY;
const modelId = process.env.LLM_MODEL;
if (!apiKey || !modelId) {
  console.error("Set LLM_API_KEY and LLM_MODEL (see .env.example).");
  process.exit(1);
}

const baseURL = process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1";

const weather = defineTool({
  name: "weather",
  description: "Get the current weather for a city.",
  parameters: z.object({ city: z.string().describe("City to look up.") }),
  execute: async ({ city }) => ({ content: `Sunny in ${city}` }),
});

const model = new OpenAICompatibleModel({ apiKey, model: modelId, baseURL, thinking: "on" });

// Stream the run to the human, the same as any chat loop.
function render(e: AgentEvent) {
  switch (e.type) {
    case AgentEventType.ReasoningDelta:
      process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
      break;
    case AgentEventType.TextDelta:
      process.stdout.write(e.text);
      break;
    case AgentEventType.ToolStart:
      console.log(`\n→ ${e.toolName}(${JSON.stringify(e.args)})`);
      break;
    case AgentEventType.ToolEnd:
      console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
      break;
  }
}

// A Tracer is a passive observer: `tracer.sink` is an EventSink — the exact
// shape `onEvent` already takes. Fan each event to both (render for the human,
// sink for the record) and the loop is none the wiser.
const tracer = new Tracer(); 
const observe = (e: AgentEvent) => { render(e); tracer.sink(e); };

const memory = new SessionMemoryStore(); // one store + one id → one conversation
const sessionId = "tracing";
const rl = createInterface({ input, output });

while (true) {
  const prompt = (await rl.question("\nyou › ")).trim();
  if (prompt === "" || prompt === "exit") break;

  process.stdout.write("bot › ");
  await runAgent({ model, memory, sessionId, prompt, tools: [weather], onEvent: observe }); 

  // End of turn: the same trace, folded into (action → observation) pairs.
  console.log(`\n${tracer.formatTrajectory()}`);
}
rl.close();

bun run examples/tracing-tutorial/step1.ts
you › what's the weather in Paris?

You get the model's reasoning and answer streamed live (as always), then a folded summary of the turn: the assistant's decision and the tool calls it produced, with timings. Because memory and sessionId are reused, the trajectory grows turn by turn — the next step scopes the output to just the latest turn.

Other views of the same trace

The trajectory is one projection. format() renders the full timeline one entry per line; toJSONL() emits one JSON object per line for storage or tooling; toJSON() bundles the timeline with the run's metadata. disclosure() shows how the tool surface and context window changed turn to turn. They all read the same entries — pick the lens you need.

Step 2 — Capture the Request Wire

The trajectory is reconstructed from the loop's events — enough to see what happened, but not the bytes actually sent. To replay a call you need the wire body. onRawRequest is the request-side twin of onRawSSE: it hands the Tracer the fully assembled request — messages (system folded in, every tool_calls block and tool result), tools, and sampling params — once per model turn. onRequest is a lightweight summary that seeds the model id and baseURL into tracer.meta. The highlighted lines wire both taps and report what each turn captured:

examples/tracing-tutorial/step2.ts

import { AgentEventType, defineTool, runAgent, SessionMemoryStore, Tracer } from "@open-agent-loops/core";
import type { AgentEvent } from "@open-agent-loops/core";
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { createInterface } from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";
import { z } from "zod";

const apiKey = process.env.LLM_API_KEY;
const modelId = process.env.LLM_MODEL;
if (!apiKey || !modelId) {
  console.error("Set LLM_API_KEY and LLM_MODEL (see .env.example).");
  process.exit(1);
}

const baseURL = process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1";

const weather = defineTool({
  name: "weather",
  description: "Get the current weather for a city.",
  parameters: z.object({ city: z.string().describe("City to look up.") }),
  execute: async ({ city }) => ({ content: `Sunny in ${city}` }),
});

const tracer = new Tracer();

// The request-side taps. `onRawRequest` is the twin of `onRawSSE`: it hands over
// the exact JSON the SDK POSTs, once per model turn. `onRequest` is a lightweight
// summary — here it seeds the model id and baseURL into `tracer.meta`.
const model = new OpenAICompatibleModel({
  apiKey,
  model: modelId,
  baseURL,
  thinking: "on",
  onRawRequest: tracer.onRawRequest, 
  onRequest: tracer.onRequest, //       model + baseURL → tracer.meta
});

function render(e: AgentEvent) {
  switch (e.type) {
    case AgentEventType.ReasoningDelta:
      process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
      break;
    case AgentEventType.TextDelta:
      process.stdout.write(e.text);
      break;
    case AgentEventType.ToolStart:
      console.log(`\n→ ${e.toolName}(${JSON.stringify(e.args)})`);
      break;
    case AgentEventType.ToolEnd:
      console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
      break;
  }
}

const observe = (e: AgentEvent) => { render(e); tracer.sink(e); };

const memory = new SessionMemoryStore();
const sessionId = "tracing";
const rl = createInterface({ input, output });

while (true) {
  const prompt = (await rl.question("\nyou › ")).trim();
  if (prompt === "" || prompt === "exit") break;

  const before = tracer.requests().length; // remember where this turn starts
  process.stdout.write("bot › ");
  await runAgent({ model, memory, sessionId, prompt, tools: [weather], onEvent: observe });

  // End of turn: `requests()` is the structured read of the request wire — one
  // body per model turn, with the full tool-call history (it owns the
  // `request_body` filter and unwraps each body, so we don't touch `entries`).
  // `slice(before)` keeps just this turn's; one user message can be several turns.
  const requests = tracer.requests().slice(before);
  const last = requests.at(-1) as { messages?: unknown[] } | undefined;
  console.log(`\n# ${requests.length} request(s) captured · ${last?.messages?.length ?? 0} messages`);
}
rl.close();

bun run examples/tracing-tutorial/step2.ts
you › what's the weather in Paris and Tokyo?

One user message can be several model turns — a turn that calls weather twice, then a turn that answers — so you'll often see two request bodies captured, the second carrying the first's tool calls and their results. tracer.requests() is the structured read of those bodies, in turn order — it owns the request_body filter and unwraps each body, so you never touch tracer.entries. Slicing from the count we saved before the turn gives just this turn's.

Both directions of the wire

onRawRequest captures the request; onRawSSE captures the raw response lines the server streams back. Together they're the complete wire. Neither tap sees HTTP headers, so the API key is never captured — only the request and response bodies. Add tracer.observe(model) for a third, finer grain: the parsed StreamEvents and a per-turn disclosure snapshot.

Step 3 — A Curl at the End of Each Turn

The body is already the whole -d payload; the only things missing from a runnable command are the URL (in tracer.meta.baseURL) and the auth header (an env placeholder). tracer.curls() stitches them together — one runnable command per captured request. The highlighted lines print a curl for every request the turn made:

examples/tracing-tutorial/step3.ts

import { AgentEventType, defineTool, runAgent, SessionMemoryStore, Tracer } from "@open-agent-loops/core"; 
import type { AgentEvent } from "@open-agent-loops/core";
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { createInterface } from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";
import { z } from "zod";

const apiKey = process.env.LLM_API_KEY;
const modelId = process.env.LLM_MODEL;
if (!apiKey || !modelId) {
  console.error("Set LLM_API_KEY and LLM_MODEL (see .env.example).");
  process.exit(1);
}

const baseURL = process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1";

const weather = defineTool({
  name: "weather",
  description: "Get the current weather for a city.",
  parameters: z.object({ city: z.string().describe("City to look up.") }),
  execute: async ({ city }) => ({ content: `Sunny in ${city}` }),
});

const tracer = new Tracer();

const model = new OpenAICompatibleModel({
  apiKey,
  model: modelId,
  baseURL,
  thinking: "on",
  onRawRequest: tracer.onRawRequest,
  onRequest: tracer.onRequest,
});

function render(e: AgentEvent) {
  switch (e.type) {
    case AgentEventType.ReasoningDelta:
      process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
      break;
    case AgentEventType.TextDelta:
      process.stdout.write(e.text);
      break;
    case AgentEventType.ToolStart:
      console.log(`\n→ ${e.toolName}(${JSON.stringify(e.args)})`);
      break;
    case AgentEventType.ToolEnd:
      console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
      break;
  }
}

const observe = (e: AgentEvent) => { render(e); tracer.sink(e); };

const memory = new SessionMemoryStore();
const sessionId = "tracing";
const rl = createInterface({ input, output });

while (true) {
  const prompt = (await rl.question("\nyou › ")).trim();
  if (prompt === "" || prompt === "exit") break;

  const before = tracer.requests().length;
  process.stdout.write("bot › ");
  await runAgent({ model, memory, sessionId, prompt, tools: [weather], onEvent: observe });

  // End of turn: `curls()` renders each captured body as a runnable curl. It owns
  // the wire plumbing — filtering the request bodies and stitching in
  // `meta.baseURL` — and keeps the key a `$LLM_API_KEY` placeholder (never
  // captured). `stream: false` gives a single readable JSON response on replay;
  // `slice(before)` keeps just this turn's. Paste any to reproduce that exact call.
  const curls = tracer.curls({ apiKeyEnv: "LLM_API_KEY", stream: false }).slice(before);
  console.log(`\n# ${curls.length} request(s) this turn — replay any of them:\n`);
  for (const curl of curls) {
    console.log(curl);
    console.log();
  }
}
rl.close();

bun run examples/tracing-tutorial/step3.ts
you › what's the weather in Paris and Tokyo?

Each printed command reproduces one model turn exactly — paste it into a terminal (with LLM_API_KEY set) and you'll get the same call the agent made, tool-call history and all. The key stays a $LLM_API_KEY placeholder, so the output is safe to copy into a bug report. stream: false flips the body to a single JSON response so a hand-run is easy to read; drop it to stream SSE back.

Readable by default, quoting handled

curls() is sugar over the toCurl building block, forwarding its options. It pretty-prints the JSON body so the command is easy to read — single quotes preserve the newlines, so it stays runnable (pass pretty: false for a compact one-liner). It also escapes any embedded apostrophe as '\'', so message content like "what's the weather" survives the shell intact. Reach for toCurl directly for a custom request path; for very large multi-turn bodies, prefer writing the body to a file and using -d @body.json over an inline payload.

Other timelines: choosing a grain

The tutorial captured the run at two grains: the loop's events (tracer.sink) and the raw request wire (onRawRequest). There's a third — the raw response wire (onRawSSE) — and one format() renders any of them. Every entry carries a source, and format({ sources }) filters the timeline to just the grains you ask for:

`source`	wired by	the timeline shows
`agent`	`tracer.sink`	turns, messages, tool calls, and the streamed reasoning/text deltas
`model`	`onRawRequest`	the exact request body POSTed each turn — the bytes out
`sse`	`onRawSSE`	the raw `data: {…}` lines the server streams back — the bytes in

One subtlety worth calling out, because it surprises people: the reasoning and text deltas in the agent timeline are the loop's parse of the response stream. They arrive token by token, so they look SSE-ish — but they are not the wire. The wire is the sse grain: the literal data: {…} lines onRawSSE taps.

So to see the over-the-wire timeline — each request interleaved with the raw SSE that answered it — wire both raw taps and filter to the two wire grains:

const model = new OpenAICompatibleModel({
  // ...
  onRawRequest: tracer.onRawRequest, // request wire  (out)
  onRawSSE: tracer.onRawSSE, //         response wire (in)
  onRequest: tracer.onRequest, //       run config → meta (model, baseURL)
});

// ...after the run, the request + raw SSE, in arrival order:
console.log(tracer.format({ sources: ["model", "sse"] }));

tracer.observe(model) also records under the model source — the parsed StreamEvents and a per-turn disclosure snapshot — a grain that sits between the agent events and the raw wire. Two runnable scripts sit beside the tutorial:

examples/trace-timeline — the full agent-grain timeline via format(), plus toJSONL() written to disk.
examples/trace-wire — the raw over-the-wire view: both raw taps, the interleaved request + SSE timeline, and the exact request bytes per turn.

format() truncates; the JSON doesn't

format() clips every value to maxValueLength (default 80), so request_body rows show a compact body msgs=N tools=M marker and long SSE lines are cut. Raise maxValueLength, narrow with sources, or read the bytes verbatim from toJSON() / toJSONL() when you need them in full.

Recap

Starting from a plain chat loop, a few lines at a time you attached a passive Tracer on the existing event seam, captured the raw request wire with onRawRequest, and turned each captured body into a runnable curl with tracer.curls(). The loop never changed — the Tracer only ever observes.

From here:

Bring Your Own Model Client — where the onRawRequest / onRawSSE / onRequest taps live, and how to point the provider at any OpenAI-compatible endpoint.
Bring Your Own Front End goes deeper on the onEvent seam the Tracer rides.
The API reference has exact signatures for Tracer, toCurl, and every trace view (format, trajectory, disclosure, toJSON/toJSONL).