Tools
A short tutorial: give the model a tool with defineTool, let it recover from errors, organize many tools in a registry, reach for the built-in tools over your own backend, and gate a risky call behind approval.
A tool is a capability you hand the model: a name, a one-line
description, a Zod schema for its arguments, and an execute handler. The loop
validates each call against the schema before execute runs, so your handler
can trust its input — and the description plus every field's .describe() are all
the model knows about a tool, so documenting one well directly improves how it's
used.
This tutorial starts from the multi-turn chat loop and grows one program across five steps:
- Define a tool the model can call.
- Let it fail safely — throw an error the model recovers from.
- Organize many tools in a ToolRegistry, and run a stateful one safely.
- Reach for a built-in tool over a backend you supply.
- Gate a risky call behind your approval before it runs.
Each step below shows the whole program so far — the lines it adds are highlighted.
Step 1 — Define a Tool
Author every tool through defineTool. It's an identity function at runtime; its
job is to infer the args type of execute from your schema, so the handler is
fully typed for free. The handler returns a ToolResult whose content string is
folded back into the conversation as the tool's result. Here a weather tool
joins the chat loop's tools array:
import {
  AgentEventType,
  defineTool,
  runAgent,
  SessionMemoryStore,
} from "@open-agent-loops/core";
import type { AgentEvent } from "@open-agent-loops/core";
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { createInterface } from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";
import { z } from "zod";

const apiKey = process.env.LLM_API_KEY;
if (!apiKey) {
  console.error("Set LLM_API_KEY (see .env.example).");
  process.exit(1);
}

// DeepSeek V4 tool-calls cleanly; GLM emits broken empty-key tool args.
const model = new OpenAICompatibleModel({
  apiKey,
  baseURL: process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1",
  model: process.env.LLM_MODEL ?? "deepseek-ai/DeepSeek-V4-Flash",
  thinking: "on",
});

// A tool is a name, a description, a Zod schema, and an execute handler. `args`
// is inferred from the schema, so `{ city }` below is typed as a string, and
// the description plus each field's `.describe()` are all the model sees.
const weather = defineTool({
  name: "weather",
  description: "Get the current weather for a city.",
  parameters: z.object({
    city: z.string().describe('City to look up, e.g. "Paris".'),
  }),
  execute: async ({ city }) => {
    // Replace with a real API call. The returned `content` is folded back into
    // the conversation as this tool call's result.
    return { content: `Sunny, 21°C in ${city}.` };
  },
});

// Render every event the loop emits, including the dimmed reasoning channel.
function render(e: AgentEvent) {
  switch (e.type) {
    case AgentEventType.ReasoningDelta:
      process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
      break;
    case AgentEventType.TextDelta:
      process.stdout.write(e.text);
      break;
    case AgentEventType.ToolStart:
      console.log(`→ ${e.toolName}(${JSON.stringify(e.args)})`);
      break;
    case AgentEventType.ToolEnd:
      console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
      break;
  }
}

// Multi-turn: one memory + one sessionId, reused every turn.
const memory = new SessionMemoryStore();
const sessionId = "tool-tutorial";
const rl = createInterface({ input, output });
while (true) {
  const prompt = (await rl.question("\nyou › ")).trim();
  if (prompt === "" || prompt === "exit") break;
  process.stdout.write("bot › ");
  await runAgent({
    model,
    memory,
    sessionId,
    prompt,
    tools: [weather],
    onEvent: render,
  });
}
rl.close();
bun run examples/tool-tutorial/step1.ts
you › what's the weather in Paris?
The model sees weather and its schema every turn; when the conversation needs
the weather it calls weather({ city: "Paris" }), and the content you return
comes back as that call's result. Because memory and sessionId are reused, the
next message continues the same conversation.
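The inference behind defineTool can be illustrated without the SDK. The following is a minimal sketch, assuming a hand-rolled Schema interface in place of Zod and synchronous handlers for brevity; Schema, defineToolSketch, callTool, and weatherSketch are hypothetical names for illustration, not the library's API. It shows an identity function whose generic parameter carries the schema's output type into execute, plus a validate-then-execute call path.

```typescript
// Hypothetical stand-ins for illustration; not the SDK's real types.
interface Schema<T> {
  parse(input: unknown): T; // throws on invalid input
}

interface ToolResult {
  content: string;
}

interface Tool<T> {
  name: string;
  description: string;
  parameters: Schema<T>;
  execute(args: T): ToolResult;
}

// Identity at runtime: returns its argument unchanged. The generic parameter
// is what lets TypeScript infer the type of `args` from `parameters`.
function defineToolSketch<T>(tool: Tool<T>): Tool<T> {
  return tool;
}

// A hand-rolled schema standing in for a Zod object with one string field.
const citySchema: Schema<{ city: string }> = {
  parse(input) {
    if (typeof input !== "object" || input === null) {
      throw new Error("args must be an object");
    }
    const city = (input as Record<string, unknown>)["city"];
    if (typeof city !== "string") throw new Error("city must be a string");
    return { city };
  },
};

const weatherSketch = defineToolSketch({
  name: "weather",
  description: "Get the current weather for a city.",
  parameters: citySchema,
  // `city` is inferred as string; no annotation needed.
  execute: ({ city }) => ({ content: `Sunny, 21°C in ${city}.` }),
});

// Validate first, then execute, so the handler only ever sees parsed args.
function callTool<T>(tool: Tool<T>, rawArgs: unknown): ToolResult {
  return tool.execute(tool.parameters.parse(rawArgs));
}
```

In this sketch, passing `{ city: 42 }` to callTool throws inside parse, so execute never receives an invalid argument.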
Step 2 — Let a Tool Fail Safely
Tools touch the real world, and real things fail. There is deliberately no
error field on a result: a tool signals a hard error by throwing. The loop
catches the throw and turns it into a tool result marked isError, with the
error's message as the content the model sees — so the model can read it and
recover, instead of the whole run rejecting. The highlighted lines give weather
a tiny set of known cities and throw for anything else:
import {
  AgentEventType,
  defineTool,
  runAgent,
  SessionMemoryStore,
} from "@open-agent-loops/core";
import type { AgentEvent } from "@open-agent-loops/core";
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { createInterface } from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";
import { z } from "zod";

const apiKey = process.env.LLM_API_KEY;
if (!apiKey) {
  console.error("Set LLM_API_KEY (see .env.example).");
  process.exit(1);
}

// DeepSeek V4 tool-calls cleanly; GLM emits broken empty-key tool args.
const model = new OpenAICompatibleModel({
  apiKey,
  baseURL: process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1",
  model: process.env.LLM_MODEL ?? "deepseek-ai/DeepSeek-V4-Flash",
  thinking: "on",
});

// A tiny "backend" so the tool has something real to fail on.
const KNOWN: Record<string, string> = {
  Paris: "Sunny, 21°C",
  Tokyo: "Rainy, 18°C",
};

const weather = defineTool({
  name: "weather",
  description: "Get the current weather for a city.",
  parameters: z.object({
    city: z.string().describe('City to look up, e.g. "Paris".'),
  }),
  execute: async ({ city }) => {
    const report = KNOWN[city];
    if (!report) {
      // Throw for a hard error: only the message reaches the model, so make it
      // useful. The run does NOT crash; the model can pick another city.
      throw new Error(
        `No weather for "${city}". Known cities: ${Object.keys(KNOWN).join(", ")}.`,
      );
    }
    return { content: `${report} in ${city}.` };
  },
});

// Render every event the loop emits, including the dimmed reasoning channel.
function render(e: AgentEvent) {
  switch (e.type) {
    case AgentEventType.ReasoningDelta:
      process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
      break;
    case AgentEventType.TextDelta:
      process.stdout.write(e.text);
      break;
    case AgentEventType.ToolStart:
      console.log(`→ ${e.toolName}(${JSON.stringify(e.args)})`);
      break;
    case AgentEventType.ToolEnd:
      console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
      break;
  }
}

// Multi-turn: one memory + one sessionId, reused every turn.
const memory = new SessionMemoryStore();
const sessionId = "tool-tutorial";
const rl = createInterface({ input, output });
while (true) {
  const prompt = (await rl.question("\nyou › ")).trim();
  if (prompt === "" || prompt === "exit") break;
  process.stdout.write("bot › ");
  await runAgent({
    model,
    memory,
    sessionId,
    prompt,
    tools: [weather],
    onEvent: render,
  });
}
rl.close();
bun run examples/tool-tutorial/step2.ts
you › what's the weather on the Moon?
Only the error's message reaches the model, so make it useful: here it lists the known cities, and the model simply asks you to pick one. A normal return is for success and for soft outcomes the model should read but not treat as failure (a non-zero shell exit, an "already up to date" note).
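To make the throw-to-result conversion concrete, here is a minimal sketch of what a loop might do around a handler. It assumes synchronous handlers for brevity, and RecordedResult, runHandler, and lookup are hypothetical names; the SDK's internals may well differ.

```typescript
// Hypothetical shapes for illustration; the SDK's real types may differ.
interface ToolResult {
  content: string; // note: no error field on what a tool returns
}

interface RecordedResult {
  content: string;
  isError: boolean; // set by the loop, not by the tool
}

// Run a handler; a throw becomes an error-marked result instead of propagating.
function runHandler(execute: () => ToolResult): RecordedResult {
  try {
    return { content: execute().content, isError: false };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { content: message, isError: true };
  }
}

const known: Record<string, string> = {
  Paris: "Sunny, 21°C",
  Tokyo: "Rainy, 18°C",
};

function lookup(city: string): ToolResult {
  const report = known[city];
  if (!report) {
    throw new Error(
      `No weather for "${city}". Known cities: ${Object.keys(known).join(", ")}.`,
    );
  }
  return { content: `${report} in ${city}.` };
}

const succeeded = runHandler(() => lookup("Tokyo"));
const failed = runHandler(() => lookup("Moon"));
```

Here `failed` carries the error's message as its content and is marked isError, while the surrounding program keeps running.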
Two more fields on a result
A ToolResult can also carry details — a structured payload for hooks that's
never sent to the model — and terminate: true, which stops the loop right
after that result. A "submit final answer" tool is just a tool that returns
terminate: true.
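The two extra fields can be pictured with a small stand-alone sketch. It assumes a simplified, synchronous loop over scripted results; runUntilTerminate and the ToolResult shape below are hypothetical illustrations, not the SDK's implementation.

```typescript
// Hypothetical result shape mirroring the fields described above.
interface ToolResult {
  content: string;
  details?: unknown; // structured payload for hooks; never sent to the model
  terminate?: boolean; // stop the loop right after this result
}

// Process scripted tool results in order and collect what the model would see.
function runUntilTerminate(steps: Array<() => ToolResult>): string[] {
  const seenByModel: string[] = [];
  for (const step of steps) {
    const result = step();
    seenByModel.push(result.content); // `details` is deliberately not forwarded
    if (result.terminate) break;
  }
  return seenByModel;
}

const transcript = runUntilTerminate([
  () => ({ content: "Sunny, 21°C in Paris.", details: { tempC: 21 } }),
  () => ({ content: "Final answer: pack sunglasses.", terminate: true }),
  () => ({ content: "never reached" }),
]);
```

The third step never runs because the second result asked the loop to stop.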
Step 3 — Organize Many Tools in a Registry
Past a handful of tools, a ToolRegistry keeps a named catalog you build once and
resolve from. It's a composition helper, not a loop dependency — runAgent
still takes a plain Tool[]; the registry just produces that array, so you can
adopt it incrementally. The highlighted lines add a second, stateful tool and
collect both in a registry:
import {
  AgentEventType,
  defineTool,
  ExecutionMode,
  runAgent,
  SessionMemoryStore,
  ToolRegistry,
} from "@open-agent-loops/core";
import type { AgentEvent } from "@open-agent-loops/core";
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { createInterface } from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";
import { z } from "zod";

const apiKey = process.env.LLM_API_KEY;
if (!apiKey) {
  console.error("Set LLM_API_KEY (see .env.example).");
  process.exit(1);
}

// DeepSeek V4 tool-calls cleanly; GLM emits broken empty-key tool args.
const model = new OpenAICompatibleModel({
  apiKey,
  baseURL: process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1",
  model: process.env.LLM_MODEL ?? "deepseek-ai/DeepSeek-V4-Flash",
  thinking: "on",
});

const KNOWN: Record<string, string> = {
  Paris: "Sunny, 21°C",
  Tokyo: "Rainy, 18°C",
};

const weather = defineTool({
  name: "weather",
  description: "Get the current weather for a city.",
  parameters: z.object({
    city: z.string().describe('City to look up, e.g. "Paris".'),
  }),
  execute: async ({ city }) => {
    const report = KNOWN[city];
    if (!report) {
      throw new Error(
        `No weather for "${city}". Known cities: ${Object.keys(KNOWN).join(", ")}.`,
      );
    }
    return { content: `${report} in ${city}.` };
  },
});

// A stateful tool: it appends to a shared list. Mark it Sequential so several
// calls in one turn run one-at-a-time, never racing on `notes`, for the same
// reason the planning tools run sequentially.
const notes: string[] = [];
const remember = defineTool({
  name: "remember",
  description: "Save a short note for later in the conversation.",
  parameters: z.object({ note: z.string().describe("The note to save.") }),
  execute: async ({ note }) => {
    notes.push(note);
    return { content: `Saved. ${notes.length} note(s) so far.` };
  },
  executionMode: ExecutionMode.Sequential,
});

// Build the catalog once; hand the loop the array the registry resolves.
const registry = new ToolRegistry([weather, remember]);

// Render every event the loop emits, including the dimmed reasoning channel.
function render(e: AgentEvent) {
  switch (e.type) {
    case AgentEventType.ReasoningDelta:
      process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
      break;
    case AgentEventType.TextDelta:
      process.stdout.write(e.text);
      break;
    case AgentEventType.ToolStart:
      console.log(`→ ${e.toolName}(${JSON.stringify(e.args)})`);
      break;
    case AgentEventType.ToolEnd:
      console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
      break;
  }
}

// Multi-turn: one memory + one sessionId, reused every turn.
const memory = new SessionMemoryStore();
const sessionId = "tool-tutorial";
const rl = createInterface({ input, output });
while (true) {
  const prompt = (await rl.question("\nyou › ")).trim();
  if (prompt === "" || prompt === "exit") break;
  process.stdout.write("bot › ");
  await runAgent({
    model,
    memory,
    sessionId,
    prompt,
    tools: registry.list(),
    onEvent: render,
  });
}
rl.close();
bun run examples/tool-tutorial/step3.ts
you › remember to pack an umbrella, then check Tokyo's weather
Two things land here. First, registry.list() feeds the loop every registered
tool (use registry.resolve(["weather"]) for just a named subset); registering a
duplicate name or resolving an unknown one throws and lists what is there, so a
wiring typo fails loudly. Second, remember mutates a shared notes array, so it
sets executionMode: ExecutionMode.Sequential.
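The registry behaviour described here (list everything, resolve a subset, fail loudly on duplicates and unknown names) can be approximated in a few lines. This is a rough sketch with a hypothetical RegistrySketch class; the error wording and internals are assumptions, not the real ToolRegistry.

```typescript
// Anything with a name can be registered in this sketch.
interface NamedTool {
  name: string;
}

class RegistrySketch<T extends NamedTool> {
  private readonly byName = new Map<string, T>();

  constructor(tools: T[] = []) {
    for (const tool of tools) this.register(tool);
  }

  // Registering a duplicate name throws and lists what is already there.
  register(tool: T): void {
    if (this.byName.has(tool.name)) {
      throw new Error(`Duplicate tool "${tool.name}". Registered: ${this.names()}.`);
    }
    this.byName.set(tool.name, tool);
  }

  // Every registered tool, in registration order.
  list(): T[] {
    return Array.from(this.byName.values());
  }

  // A named subset; an unknown name throws and lists what is there.
  resolve(names: string[]): T[] {
    return names.map((name) => {
      const tool = this.byName.get(name);
      if (!tool) {
        throw new Error(`Unknown tool "${name}". Registered: ${this.names()}.`);
      }
      return tool;
    });
  }

  private names(): string {
    return Array.from(this.byName.keys()).join(", ");
  }
}

const registrySketch = new RegistrySketch([{ name: "weather" }, { name: "remember" }]);
```

With this sketch, a typo such as `registrySketch.resolve(["wether"])` throws an error that names both registered tools, which is the "fails loudly" property in miniature.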
Parallel by default
When one turn makes several tool calls, the loop runs them in parallel —
fine for read-only tools like weather. A tool that mutates shared state sets
ExecutionMode.Sequential to force that turn's whole batch to run one-at-a-time,
so the calls can't race. It's exactly why the planning
tools run sequentially.
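One way to picture the batch rule is the sketch below. It assumes, from the description above, that a single sequential tool in a turn makes the whole batch run one at a time; Mode, Call, runBatch, tracked, and peakConcurrency are hypothetical names, and the real scheduler may differ.

```typescript
type Mode = "parallel" | "sequential";

interface Call {
  mode: Mode;
  run(): Promise<string>;
}

// If any call in the batch is sequential, run the whole batch one at a time;
// otherwise start every call at once and wait for all of them.
async function runBatch(calls: Call[]): Promise<string[]> {
  if (calls.some((call) => call.mode === "sequential")) {
    const results: string[] = [];
    for (const call of calls) results.push(await call.run());
    return results;
  }
  return Promise.all(calls.map((call) => call.run()));
}

// Instrumentation: count how many calls are in flight at the same moment.
let active = 0;
let peak = 0;

function tracked(mode: Mode, label: string): Call {
  return {
    mode,
    async run() {
      active += 1;
      peak = Math.max(peak, active);
      await Promise.resolve(); // yield once, standing in for real I/O
      active -= 1;
      return label;
    },
  };
}

// Run a batch and report the highest number of overlapping calls.
async function peakConcurrency(calls: Call[]): Promise<number> {
  active = 0;
  peak = 0;
  await runBatch(calls);
  return peak;
}
```

Two parallel calls overlap, so peakConcurrency resolves to 2 for them; swapping either one for a sequential call brings it down to 1.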
Step 4 — Reach for a Built-in Tool
You rarely start from nothing. The SDK ships the common agent tools — shell,
code_execution, search, file read/write, web fetch/search, a browser — but with a deliberate
split: the SDK owns the model-facing contract (the name, the schema, the
result formatting), and you supply the backend that does the work. The
highlighted lines add the built-in shell over a Bun-based backend:
import {
  AgentEventType,
  defineTool,
  ExecutionMode,
  runAgent,
  SessionMemoryStore,
  shellTool,
  ToolRegistry,
} from "@open-agent-loops/core";
import type { AgentEvent } from "@open-agent-loops/core";
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { bunShellBackend } from "./bun-backends";
import { createInterface } from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";
import { z } from "zod";

const apiKey = process.env.LLM_API_KEY;
if (!apiKey) {
  console.error("Set LLM_API_KEY (see .env.example).");
  process.exit(1);
}

// DeepSeek V4 tool-calls cleanly; GLM emits broken empty-key tool args.
const model = new OpenAICompatibleModel({
  apiKey,
  baseURL: process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1",
  model: process.env.LLM_MODEL ?? "deepseek-ai/DeepSeek-V4-Flash",
  thinking: "on",
});

const KNOWN: Record<string, string> = {
  Paris: "Sunny, 21°C",
  Tokyo: "Rainy, 18°C",
};

const weather = defineTool({
  name: "weather",
  description: "Get the current weather for a city.",
  parameters: z.object({
    city: z.string().describe('City to look up, e.g. "Paris".'),
  }),
  execute: async ({ city }) => {
    const report = KNOWN[city];
    if (!report) {
      throw new Error(
        `No weather for "${city}". Known cities: ${Object.keys(KNOWN).join(", ")}.`,
      );
    }
    return { content: `${report} in ${city}.` };
  },
});

const notes: string[] = [];
const remember = defineTool({
  name: "remember",
  description: "Save a short note for later in the conversation.",
  parameters: z.object({ note: z.string().describe("The note to save.") }),
  execute: async ({ note }) => {
    notes.push(note);
    return { content: `Saved. ${notes.length} note(s) so far.` };
  },
  executionMode: ExecutionMode.Sequential,
});

// A built-in tool: the SDK ships the `shell` contract; you bring the backend
// that runs the command (here Bun's, host glue you provide).
const shell = shellTool(bunShellBackend());
const registry = new ToolRegistry([weather, remember, shell]);

// Render every event the loop emits, including the dimmed reasoning channel.
function render(e: AgentEvent) {
  switch (e.type) {
    case AgentEventType.ReasoningDelta:
      process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
      break;
    case AgentEventType.TextDelta:
      process.stdout.write(e.text);
      break;
    case AgentEventType.ToolStart:
      console.log(`→ ${e.toolName}(${JSON.stringify(e.args)})`);
      break;
    case AgentEventType.ToolEnd:
      console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
      break;
  }
}

// Multi-turn: one memory + one sessionId, reused every turn.
const memory = new SessionMemoryStore();
const sessionId = "tool-tutorial";
const rl = createInterface({ input, output });
while (true) {
  const prompt = (await rl.question("\nyou › ")).trim();
  if (prompt === "" || prompt === "exit") break;
  process.stdout.write("bot › ");
  await runAgent({
    model,
    memory,
    sessionId,
    prompt,
    tools: registry.list(),
    onEvent: render,
  });
}
rl.close();
bun run examples/tool-tutorial/step4.ts
you › list the files in the current directory
shellTool gives the model a stable shell tool; bunShellBackend() is the host
glue you bring. That boundary is the point: anything that runs a command, reads
a file, or hits the network should be your code, not the SDK's — so each built-in
(codeExecutionTool, searchTool, readTool / globTool, editTool / writeTool,
webFetchTool / webSearchTool, browserTools) takes a backend interface you implement. The
exception is the planning tools, which are pure
in-memory and ship ready to use.
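The contract/backend split can be sketched as plain dependency injection. The interface and function names below (ShellBackendSketch, shellToolSketch, fakeBackend) are hypothetical and the real backend interface probably carries more than a single method; the point is only that the tool's model-facing side stays fixed while the backend is swappable.

```typescript
// Hypothetical backend interface; the SDK's real one may have a different shape.
interface ShellBackendSketch {
  run(command: string): string;
}

interface ShellToolSketch {
  name: "shell";
  description: string;
  execute(args: { command: string }): { content: string };
}

// The contract side: a fixed name and description, with the backend's output
// handed back as the result content.
function shellToolSketch(backend: ShellBackendSketch): ShellToolSketch {
  return {
    name: "shell",
    description: "Run one shell command and return its output.",
    execute: ({ command }) => ({ content: backend.run(command) }),
  };
}

// The backend side: here a fake that never touches the host, handy in tests.
const fakeBackend: ShellBackendSketch = {
  run: (command) =>
    command === "ls" ? "README.md\nsrc" : `unknown command: ${command}`,
};

const shellSketch = shellToolSketch(fakeBackend);
```

Because the tool only sees the interface, the same shellSketch could be pointed at a sandboxed or remote backend without changing what the model is shown.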
Two ways to run code
Two built-ins run model-written instructions, in different shapes. shell runs one
command string; code_execution runs a snippet in a named language
({ language, code }). They differ on isolation: bunShellBackend runs unsandboxed
on the host, while the shipped denoCodeExecutionBackend executes JavaScript/TypeScript
through Deno's deny-by-default permission sandbox — real isolation with no extra
infrastructure (grant file, network, or env access explicitly via allow). Either way
the result ends with an explicit [exit 0: ok] / [exit N: error] verdict, so a run is
never a contentless result.
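The closing verdict is easy to picture as a small formatting helper. In this sketch only the two verdict strings come from the description above; the withVerdict name and the exact layout (output, newline, verdict) are assumptions.

```typescript
// Append an explicit verdict so even an empty output yields a non-empty result.
function withVerdict(output: string, exitCode: number): string {
  const verdict = exitCode === 0 ? "[exit 0: ok]" : `[exit ${exitCode}: error]`;
  const body = output.replace(/\s+$/, ""); // drop trailing whitespace/newlines
  return body === "" ? verdict : `${body}\n${verdict}`;
}

const silentSuccess = withVerdict("", 0);
const noisyFailure = withVerdict("boom\n", 2);
```

A command that prints nothing still produces a readable result, which is the property the verdict is there to guarantee.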
Step 5 — Gate a Tool Behind Approval
Some calls are risky enough that a human should sign off first — running a shell
command, deploying, deleting. Admission is a seam separate from the tool itself:
the gateToolCalls hook sees the whole turn's calls before any of them run and
decides allow / deny / ask. It's batteries-included — pair the shipped
permissionGate with an InMemoryPermissionStore (the policy) and an
ApprovalPrompter (how you ask). The highlighted lines let weather and
remember run freely and prompt before shell:
import {
  AgentEventType,
  ApprovalChoice,
  defineTool,
  ExecutionMode,
  InMemoryPermissionStore,
  permissionGate,
  PermissionPolicy,
  runAgent,
  SessionMemoryStore,
  shellTool,
  ToolRegistry,
} from "@open-agent-loops/core";
import type { AgentEvent, ApprovalPrompter } from "@open-agent-loops/core";
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { bunShellBackend } from "./bun-backends";
import { createInterface } from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";
import { z } from "zod";

const apiKey = process.env.LLM_API_KEY;
if (!apiKey) {
  console.error("Set LLM_API_KEY (see .env.example).");
  process.exit(1);
}

// DeepSeek V4 tool-calls cleanly; GLM emits broken empty-key tool args.
const model = new OpenAICompatibleModel({
  apiKey,
  baseURL: process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1",
  model: process.env.LLM_MODEL ?? "deepseek-ai/DeepSeek-V4-Flash",
  thinking: "on",
});

const KNOWN: Record<string, string> = {
  Paris: "Sunny, 21°C",
  Tokyo: "Rainy, 18°C",
};

const weather = defineTool({
  name: "weather",
  description: "Get the current weather for a city.",
  parameters: z.object({
    city: z.string().describe('City to look up, e.g. "Paris".'),
  }),
  execute: async ({ city }) => {
    const report = KNOWN[city];
    if (!report) {
      throw new Error(
        `No weather for "${city}". Known cities: ${Object.keys(KNOWN).join(", ")}.`,
      );
    }
    return { content: `${report} in ${city}.` };
  },
});

const notes: string[] = [];
const remember = defineTool({
  name: "remember",
  description: "Save a short note for later in the conversation.",
  parameters: z.object({ note: z.string().describe("The note to save.") }),
  execute: async ({ note }) => {
    notes.push(note);
    return { content: `Saved. ${notes.length} note(s) so far.` };
  },
  executionMode: ExecutionMode.Sequential,
});

// The SDK ships the `shell` contract; you bring the backend that runs the command.
const shell = shellTool(bunShellBackend());
const registry = new ToolRegistry([weather, remember, shell]);

// Render every event the loop emits, including the dimmed reasoning channel.
function render(e: AgentEvent) {
  switch (e.type) {
    case AgentEventType.ReasoningDelta:
      process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
      break;
    case AgentEventType.TextDelta:
      process.stdout.write(e.text);
      break;
    case AgentEventType.ToolStart:
      console.log(`→ ${e.toolName}(${JSON.stringify(e.args)})`);
      break;
    case AgentEventType.ToolEnd:
      console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
      break;
  }
}

// Multi-turn: one memory + one sessionId, reused every turn.
const memory = new SessionMemoryStore();
const sessionId = "tool-tutorial";
const rl = createInterface({ input, output });

// Pre-approve the low-risk tools (weather, remember); ask before the shell
// runs anything.
const permissions = new InMemoryPermissionStore({
  fallback: PermissionPolicy.Ask,
  rules: { weather: PermissionPolicy.Allow, remember: PermissionPolicy.Allow },
});

// A terminal prompter: show the call (name + args) and ask y/N.
const prompter: ApprovalPrompter = {
  async ask(batch) {
    const choices: ApprovalChoice[] = [];
    for (const { toolCall, args } of batch) {
      const question = `\n🔐 allow ${toolCall.function.name}(${JSON.stringify(args)})? [y/N] `;
      const ok = (await rl.question(question)).trim().toLowerCase() === "y";
      choices.push(ok ? ApprovalChoice.AllowOnce : ApprovalChoice.DenyOnce);
    }
    return choices;
  },
};
const gate = permissionGate(permissions, prompter);

while (true) {
  const prompt = (await rl.question("\nyou › ")).trim();
  if (prompt === "" || prompt === "exit") break;
  process.stdout.write("bot › ");
  await runAgent({
    model,
    memory,
    sessionId,
    prompt,
    tools: registry.list(),
    hooks: { gateToolCalls: gate },
    onEvent: render,
  });
}
rl.close();
bun run examples/tool-tutorial/step5.ts
you › list the files in the current directory
Ask about the weather and it just runs; ask it to touch the filesystem and you
get a [y/N] before shell executes. The whole turn's calls arrive together, so
the gate runs once per turn — ahead of the parallel execution phase, so the
prompt never races a running tool. A denied call never runs; it comes back as an
error tool-result the model can react to. Return ApprovalChoice.AllowAlways /
DenyAlways instead of the "once" variants and permissionGate persists that
policy to the store, so the next turn won't ask again.
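The allow / deny / ask flow, including how an "always" answer is remembered, can be sketched independently of the SDK. Everything below is a simplified, synchronous illustration: Policy, Choice, PolicyStoreSketch, and gateSketch are hypothetical names, and the real permissionGate, store, and prompter are asynchronous and receive richer call objects.

```typescript
type Policy = "allow" | "deny" | "ask";
type Choice = "allow-once" | "deny-once" | "allow-always" | "deny-always";

// A per-tool policy table with a fallback for tools that have no rule.
class PolicyStoreSketch {
  private readonly rules: Record<string, Policy>;
  private readonly fallback: Policy;

  constructor(fallback: Policy, rules: Record<string, Policy> = {}) {
    this.fallback = fallback;
    this.rules = { ...rules };
  }

  get(toolName: string): Policy {
    return this.rules[toolName] ?? this.fallback;
  }

  set(toolName: string, policy: Policy): void {
    this.rules[toolName] = policy;
  }
}

// Decide the whole turn's calls before any of them run. `ask` is consulted only
// for calls whose policy is "ask"; "always" answers are written back to the store.
function gateSketch(
  store: PolicyStoreSketch,
  ask: (toolName: string) => Choice,
  toolNames: string[],
): boolean[] {
  return toolNames.map((name) => {
    const policy = store.get(name);
    if (policy !== "ask") return policy === "allow";
    const choice = ask(name);
    if (choice === "allow-always") store.set(name, "allow");
    if (choice === "deny-always") store.set(name, "deny");
    return choice === "allow-once" || choice === "allow-always";
  });
}

const store = new PolicyStoreSketch("ask", { weather: "allow", remember: "allow" });
let prompts = 0;
const alwaysAllow = (_toolName: string): Choice => {
  prompts += 1;
  return "allow-always";
};

const firstTurn = gateSketch(store, alwaysAllow, ["weather", "shell"]);
const secondTurn = gateSketch(store, alwaysAllow, ["shell"]);
```

In this sketch the prompter is consulted once, for shell on the first turn; the second turn is admitted from the stored policy without asking.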
Recap
Starting from a plain chat loop, a few lines at a time you gave the model a tool, let it recover from a tool's error instead of crashing, organized many tools in a registry and ran a stateful one safely, handed it a built-in tool over a backend you own, then gated its riskiest call behind your approval. The loop never changed — tools are just values you pass it.
From here:
- Skills bundle tools with instructions the model loads on demand — the next level of progressive disclosure.
- Planning Tools are batteries-included tools that give the model durable working memory.
- Permissions & Credentials goes deeper on the gate you just used (persisting "always" choices, writing the ApprovalPrompter) and adds credentials: feeding a tool a secret the model never sees.
- The API reference has exact signatures for defineTool, ToolRegistry, and every built-in tool and backend.