Tools

A short tutorial: give the model a tool with defineTool, let it recover from errors, organize many tools in a registry, back a built-in tool with a backend you supply, and gate a risky call behind your approval.

A tool is a capability you hand the model: a name, a one-line description, a Zod schema for its arguments, and an execute handler. The loop validates each call against the schema before execute runs, so your handler can trust its input — and the description plus every field's .describe() are all the model knows about a tool, so documenting one well directly improves how it's used.
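
The validate-then-execute order can be sketched without the SDK. This is a minimal illustration, not the loop's actual code: `ArgSchema`, `SketchTool`, `citySchema`, and `callTool` are hypothetical names, a hand-rolled check stands in for Zod, and the handler is synchronous for brevity. How the real loop reports a failed validation is not stated here, so returning an error result is an assumption.

```typescript
// Hypothetical stand-in for a Zod schema: parse() returns typed args or throws.
interface ArgSchema<T> {
  parse(raw: unknown): T;
}

interface SketchTool<T> {
  name: string;
  description: string;
  parameters: ArgSchema<T>;
  execute(args: T): string;
}

// A one-field schema that accepts only `{ city: string }`.
function citySchema(): ArgSchema<{ city: string }> {
  return {
    parse(raw) {
      if (typeof raw !== "object" || raw === null) {
        throw new Error("args must be an object");
      }
      const city = (raw as Record<string, unknown>).city;
      if (typeof city !== "string") throw new Error("city: expected string");
      return { city };
    },
  };
}

// Validation runs first, so execute only ever sees args that passed the schema.
// Assumption: a failed validation becomes an error result instead of a crash.
function callTool<T>(
  tool: SketchTool<T>,
  rawArgs: unknown,
): { content: string; isError: boolean } {
  let args: T;
  try {
    args = tool.parameters.parse(rawArgs);
  } catch (err) {
    return { content: `Invalid arguments: ${(err as Error).message}`, isError: true };
  }
  return { content: tool.execute(args), isError: false };
}

const weatherSketch: SketchTool<{ city: string }> = {
  name: "weather",
  description: "Get the current weather for a city.",
  parameters: citySchema(),
  execute: ({ city }) => `Sunny, 21°C in ${city}.`,
};

const validCall = callTool(weatherSketch, { city: "Paris" });
const invalidCall = callTool(weatherSketch, { city: 42 });
```

In this sketch `validCall` carries the handler's output, while `invalidCall` never reaches `execute` at all.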

This tutorial starts from the multi-turn chat loop and grows one program across five steps:

Define a tool the model can call.
Let it fail safely — throw an error the model recovers from.
Organize many tools in a ToolRegistry, and run a stateful one safely.
Use a built-in tool backed by a backend you supply.
Gate a risky call behind your approval before it runs.

Each step below shows the whole program so far — the lines it adds are highlighted.

Step 1 — Define a Tool

Author every tool through defineTool. It's an identity function at runtime; its job is to infer the args type of execute from your schema, so the handler is fully typed for free. The handler returns a ToolResult whose content string is folded back into the conversation as the tool's result. Here a weather tool joins the chat loop's tools array:

examples/tool-tutorial/step1.ts

import {
  AgentEventType,
  defineTool,
  runAgent,
  SessionMemoryStore,
} from "@open-agent-loops/core";
import type { AgentEvent } from "@open-agent-loops/core";
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { createInterface } from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";
import { z } from "zod";

const apiKey = process.env.LLM_API_KEY;
if (!apiKey) {
  console.error("Set LLM_API_KEY (see .env.example).");
  process.exit(1);
}

// DeepSeek V4 tool-calls cleanly; GLM emits broken empty-key tool args.
const model = new OpenAICompatibleModel({
  apiKey,
  baseURL: process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1",
  model: process.env.LLM_MODEL ?? "deepseek-ai/DeepSeek-V4-Flash",
  thinking: "on",
});

// A tool is a name, a description, a Zod schema, and an execute handler. `args`
// is inferred from the schema, so `{ city }` below is typed as a string — and
// the description plus each field's `.describe()` are all the model sees.
const weather = defineTool({
  name: "weather",
  description: "Get the current weather for a city.",
  parameters: z.object({
    city: z.string().describe('City to look up, e.g. "Paris".'),
  }),
  execute: async ({ city }) => {
    // Replace with a real API call. The returned `content` is folded back into
    // the conversation as this tool call's result.
    return { content: `Sunny, 21°C in ${city}.` };
  },
});

// Render every event the loop emits, including the dimmed reasoning channel.
function render(e: AgentEvent) {
  switch (e.type) {
    case AgentEventType.ReasoningDelta:
      process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
      break;
    case AgentEventType.TextDelta:
      process.stdout.write(e.text);
      break;
    case AgentEventType.ToolStart:
      console.log(`→ ${e.toolName}(${JSON.stringify(e.args)})`);
      break;
    case AgentEventType.ToolEnd:
      console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
      break;
  }
}

// Multi-turn: one memory + one sessionId, reused every turn.
const memory = new SessionMemoryStore();
const sessionId = "tool-tutorial";
const rl = createInterface({ input, output });

while (true) {
  const prompt = (await rl.question("\nyou › ")).trim();
  if (prompt === "" || prompt === "exit") break;

  process.stdout.write("bot › ");
  await runAgent({
    model,
    memory,
    sessionId,
    prompt,
    tools: [weather],
    onEvent: render,
  });
}
rl.close();

bun run examples/tool-tutorial/step1.ts
you › what's the weather in Paris?

The model sees weather and its schema every turn; when the conversation needs the weather it calls weather({ city: "Paris" }), and the content you return comes back as that call's result. Because memory and sessionId are reused, the next message continues the same conversation.
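
Step 1 calls defineTool an identity function whose only job is type inference. The idea can be sketched roughly as follows, using hypothetical names (`Parser`, `ToolSpec`, `defineSketchTool`) rather than the SDK's real types, with a hand-written parser in place of a Zod schema:

```typescript
// Hypothetical schema type: anything whose parse() produces a T.
interface Parser<T> {
  parse(raw: unknown): T;
}

interface ToolSpec<T> {
  name: string;
  parameters: Parser<T>;
  execute: (args: T) => { content: string };
}

// Identity at runtime: the spec comes back untouched. The generic parameter is
// what lets TypeScript infer T from `parameters` and apply it to `execute`.
function defineSketchTool<T>(spec: ToolSpec<T>): ToolSpec<T> {
  return spec;
}

const numberPair: Parser<{ a: number; b: number }> = {
  parse(raw) {
    const r = (raw ?? {}) as { a?: unknown; b?: unknown };
    if (typeof r.a !== "number" || typeof r.b !== "number") {
      throw new Error("expected { a: number, b: number }");
    }
    return { a: r.a, b: r.b };
  },
};

const add = defineSketchTool({
  name: "add",
  parameters: numberPair,
  // No annotations on `a` or `b`: both are inferred as number from the parser.
  execute: ({ a, b }) => ({ content: String(a + b) }),
});

// Passing a spec through again hands back the very same object.
const sameObject = defineSketchTool(add) === add;
const sum = add.execute({ a: 2, b: 3 }).content;
```

Removing the wrapper would leave the program's behavior unchanged; only the inferred type of the handler's arguments would be lost.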

Step 2 — Let a Tool Fail Safely

Tools touch the real world, and real things fail. There is deliberately no error field on a result: a tool signals a hard error by throwing. The loop catches the throw and turns it into a tool result marked isError, with the error's message as the content the model sees — so the model can read it and recover, instead of the whole run rejecting. The highlighted lines give weather a tiny set of known cities and throw for anything else:

examples/tool-tutorial/step2.ts

import {
  AgentEventType,
  defineTool,
  runAgent,
  SessionMemoryStore,
} from "@open-agent-loops/core";
import type { AgentEvent } from "@open-agent-loops/core";
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { createInterface } from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";
import { z } from "zod";

const apiKey = process.env.LLM_API_KEY;
if (!apiKey) {
  console.error("Set LLM_API_KEY (see .env.example).");
  process.exit(1);
}

// DeepSeek V4 tool-calls cleanly; GLM emits broken empty-key tool args.
const model = new OpenAICompatibleModel({
  apiKey,
  baseURL: process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1",
  model: process.env.LLM_MODEL ?? "deepseek-ai/DeepSeek-V4-Flash",
  thinking: "on",
});

// A tiny "backend" so the tool has something real to fail on.
const KNOWN: Record<string, string> = {
  Paris: "Sunny, 21°C",
  Tokyo: "Rainy, 18°C",
};

const weather = defineTool({
  name: "weather",
  description: "Get the current weather for a city.",
  parameters: z.object({
    city: z.string().describe('City to look up, e.g. "Paris".'),
  }),
  execute: async ({ city }) => {
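    // Own-property check, so a name like "constructor" can't match Object.prototype.
    const report = Object.hasOwn(KNOWN, city) ? KNOWN[city] : undefined;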
    if (!report) {
      // Throw for a hard error: only the message reaches the model, so make it
      // useful. The run does NOT crash — the model can pick another city.
      throw new Error(`No weather for "${city}". Known cities: ${Object.keys(KNOWN).join(", ")}.`);
    }
    return { content: `${report} in ${city}.` };
  },
});

// Render every event the loop emits, including the dimmed reasoning channel.
function render(e: AgentEvent) {
  switch (e.type) {
    case AgentEventType.ReasoningDelta:
      process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
      break;
    case AgentEventType.TextDelta:
      process.stdout.write(e.text);
      break;
    case AgentEventType.ToolStart:
      console.log(`→ ${e.toolName}(${JSON.stringify(e.args)})`);
      break;
    case AgentEventType.ToolEnd:
      console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
      break;
  }
}

// Multi-turn: one memory + one sessionId, reused every turn.
const memory = new SessionMemoryStore();
const sessionId = "tool-tutorial";
const rl = createInterface({ input, output });

while (true) {
  const prompt = (await rl.question("\nyou › ")).trim();
  if (prompt === "" || prompt === "exit") break;

  process.stdout.write("bot › ");
  await runAgent({
    model,
    memory,
    sessionId,
    prompt,
    tools: [weather],
    onEvent: render,
  });
}
rl.close();

bun run examples/tool-tutorial/step2.ts
you › what's the weather on the Moon?

Only the error's message reaches the model, so make it useful — here it lists the known cities, and the model simply asks you to pick one. A normal return is for success and for soft outcomes the model should read but not treat as failure (a non-zero shell exit, an "already up to date" note).
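
The catch-and-convert behavior can be pictured with a small sketch. It assumes nothing about the SDK's internals: `runHandler`, `ToolOutcome`, and `lookupWeather` are hypothetical names, and the handler is synchronous to keep the example short.

```typescript
interface ToolOutcome {
  content: string;
  isError: boolean;
}

// Hypothetical loop step: run a handler and turn a throw into an error result.
function runHandler(handler: () => string): ToolOutcome {
  try {
    return { content: handler(), isError: false };
  } catch (err) {
    // Only the message survives, which is why it should say how to recover.
    const message = err instanceof Error ? err.message : String(err);
    return { content: message, isError: true };
  }
}

const knownCities = new Map([
  ["Paris", "Sunny, 21°C"],
  ["Tokyo", "Rainy, 18°C"],
]);

function lookupWeather(city: string): string {
  const report = knownCities.get(city);
  if (report === undefined) {
    const known = Array.from(knownCities.keys()).join(", ");
    throw new Error(`No weather for "${city}". Known cities: ${known}.`);
  }
  return `${report} in ${city}.`;
}

const okOutcome = runHandler(() => lookupWeather("Tokyo"));
const failedOutcome = runHandler(() => lookupWeather("Moon"));
```

Here `failedOutcome.content` is the thrown message, including the list of known cities, and `isError` is true; nothing propagates to the caller.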

Two more fields on a result

A ToolResult can also carry details — a structured payload for hooks that's never sent to the model — and terminate: true, which stops the loop right after that result. A "submit final answer" tool is just a tool that returns terminate: true.
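
As a rough sketch of how those two fields might behave, the toy loop below stops after the first result that sets terminate and forwards only content. `SketchResult`, `SketchStep`, and `runSteps` are hypothetical; the real loop's control flow is not shown in this tutorial.

```typescript
interface SketchResult {
  content: string; // what the model sees
  details?: unknown; // structured payload for hooks, never forwarded
  terminate?: boolean; // stop the loop right after this result
}

type SketchStep = () => SketchResult;

// Hypothetical loop: run steps in order and stop after a terminating result.
function runSteps(steps: SketchStep[]): { seenByModel: string[]; stepsRun: number } {
  const seenByModel: string[] = [];
  let stepsRun = 0;
  for (const step of steps) {
    const result = step();
    stepsRun += 1;
    seenByModel.push(result.content); // `details` is deliberately left out
    if (result.terminate) break;
  }
  return { seenByModel, stepsRun };
}

// A "submit final answer" tool is an ordinary tool that returns terminate: true.
const submitAnswer: SketchStep = () => ({
  content: "Final answer recorded.",
  details: { answer: 42 },
  terminate: true,
});

const finished = runSteps([
  () => ({ content: "Sunny, 21°C in Paris." }),
  submitAnswer,
  () => ({ content: "never reached" }),
]);
```

The third step never runs, and the `details` payload does not appear in what the model sees.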

Step 3 — Organize Many Tools in a Registry

Past a handful of tools, a ToolRegistry keeps a named catalog you build once and resolve from. It's a composition helper, not a loop dependency — runAgent still takes a plain Tool[]; the registry just produces that array, so you can adopt it incrementally. The highlighted lines add a second, stateful tool and collect both in a registry:

examples/tool-tutorial/step3.ts

import {
  AgentEventType,
  defineTool,
  ExecutionMode, 
  runAgent,
  SessionMemoryStore,
  ToolRegistry, 
} from "@open-agent-loops/core";
import type { AgentEvent } from "@open-agent-loops/core";
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { createInterface } from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";
import { z } from "zod";

const apiKey = process.env.LLM_API_KEY;
if (!apiKey) {
  console.error("Set LLM_API_KEY (see .env.example).");
  process.exit(1);
}

// DeepSeek V4 tool-calls cleanly; GLM emits broken empty-key tool args.
const model = new OpenAICompatibleModel({
  apiKey,
  baseURL: process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1",
  model: process.env.LLM_MODEL ?? "deepseek-ai/DeepSeek-V4-Flash",
  thinking: "on",
});

const KNOWN: Record<string, string> = {
  Paris: "Sunny, 21°C",
  Tokyo: "Rainy, 18°C",
};

const weather = defineTool({
  name: "weather",
  description: "Get the current weather for a city.",
  parameters: z.object({
    city: z.string().describe('City to look up, e.g. "Paris".'),
  }),
  execute: async ({ city }) => {
    const report = Object.hasOwn(KNOWN, city) ? KNOWN[city] : undefined;
    if (!report) {
      throw new Error(`No weather for "${city}". Known cities: ${Object.keys(KNOWN).join(", ")}.`);
    }
    return { content: `${report} in ${city}.` };
  },
});

// A stateful tool: it appends to a shared list. Mark it Sequential so several
// calls in one turn run one-at-a-time, never racing on `notes` — the same
// reason the planning tools run sequentially.
const notes: string[] = [];
const remember = defineTool({
  name: "remember",
  description: "Save a short note for later in the conversation.",
  parameters: z.object({ note: z.string().describe("The note to save.") }),
  execute: async ({ note }) => {
    notes.push(note);
    return { content: `Saved. ${notes.length} note(s) so far.` };
  },
  executionMode: ExecutionMode.Sequential,
});

// Build the catalog once; hand the loop the array the registry resolves.
const registry = new ToolRegistry([weather, remember]);

// Render every event the loop emits, including the dimmed reasoning channel.
function render(e: AgentEvent) {
  switch (e.type) {
    case AgentEventType.ReasoningDelta:
      process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
      break;
    case AgentEventType.TextDelta:
      process.stdout.write(e.text);
      break;
    case AgentEventType.ToolStart:
      console.log(`→ ${e.toolName}(${JSON.stringify(e.args)})`);
      break;
    case AgentEventType.ToolEnd:
      console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
      break;
  }
}

// Multi-turn: one memory + one sessionId, reused every turn.
const memory = new SessionMemoryStore();
const sessionId = "tool-tutorial";
const rl = createInterface({ input, output });

while (true) {
  const prompt = (await rl.question("\nyou › ")).trim();
  if (prompt === "" || prompt === "exit") break;

  process.stdout.write("bot › ");
  await runAgent({
    model,
    memory,
    sessionId,
    prompt,
    tools: registry.list(), 
    onEvent: render,
  });
}
rl.close();

bun run examples/tool-tutorial/step3.ts
you › remember to pack an umbrella, then check Tokyo's weather

Two things land here. First, registry.list() feeds the loop every registered tool (use registry.resolve(["weather"]) for just a named subset); registering a duplicate name or resolving an unknown one throws and lists what is there, so a wiring typo fails loudly. Second, remember mutates a shared notes array, so it sets executionMode: ExecutionMode.Sequential.
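
To make the fail-loudly behavior concrete, here is a minimal registry sketch. It mirrors only what the text describes (a constructor taking an array, `list()`, `resolve()`, and throwing on duplicates or unknown names); the class name, the `register` method, and the error wording are assumptions, not the SDK's implementation.

```typescript
interface NamedTool {
  name: string;
}

// Hypothetical minimal registry: a named catalog that produces plain arrays.
class SketchRegistry<T extends NamedTool> {
  private readonly byName = new Map<string, T>();

  constructor(tools: T[] = []) {
    for (const tool of tools) this.register(tool);
  }

  register(tool: T): void {
    if (this.byName.has(tool.name)) {
      throw new Error(`Duplicate tool "${tool.name}". Registered: ${this.names()}.`);
    }
    this.byName.set(tool.name, tool);
  }

  // Every registered tool, in registration order.
  list(): T[] {
    return Array.from(this.byName.values());
  }

  // A named subset; an unknown name throws and lists what is registered.
  resolve(names: string[]): T[] {
    return names.map((name) => {
      const tool = this.byName.get(name);
      if (!tool) {
        throw new Error(`Unknown tool "${name}". Registered: ${this.names()}.`);
      }
      return tool;
    });
  }

  private names(): string {
    return Array.from(this.byName.keys()).join(", ");
  }
}

const catalog = new SketchRegistry([{ name: "weather" }, { name: "remember" }]);
const subsetNames = catalog.resolve(["weather"]).map((t) => t.name);

let typoMessage = "";
try {
  catalog.resolve(["wether"]); // wiring typo
} catch (err) {
  typoMessage = (err as Error).message;
}
```

The typo surfaces immediately with the registered names in the message, rather than as a tool that silently never reaches the model.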

Parallel by default

When one turn makes several tool calls, the loop runs them in parallel — fine for read-only tools like weather. A tool that mutates shared state sets ExecutionMode.Sequential to force that turn's whole batch to run one-at-a-time, so the calls can't race. It's exactly why the planning tools run sequentially.
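
The batch rule above can be sketched as a small scheduler. This is an illustration under assumptions: `SketchMode`, `BatchCall`, `batchMode`, and `runBatch` are hypothetical names, and string literals stand in for the SDK's ExecutionMode values.

```typescript
type SketchMode = "parallel" | "sequential";

interface BatchCall {
  toolName: string;
  mode: SketchMode;
  run: () => Promise<string>;
}

// One Sequential tool in the batch makes the whole batch run one-at-a-time.
function batchMode(batch: BatchCall[]): SketchMode {
  return batch.some((call) => call.mode === "sequential") ? "sequential" : "parallel";
}

async function runBatch(batch: BatchCall[]): Promise<string[]> {
  if (batchMode(batch) === "parallel") {
    // Read-only calls can all start at once.
    return Promise.all(batch.map((call) => call.run()));
  }
  // Each call finishes before the next starts, so shared state can't race.
  const results: string[] = [];
  for (const call of batch) {
    results.push(await call.run());
  }
  return results;
}

const weatherCall: BatchCall = {
  toolName: "weather",
  mode: "parallel",
  run: async () => "Sunny, 21°C in Paris.",
};
const rememberCall: BatchCall = {
  toolName: "remember",
  mode: "sequential",
  run: async () => "Saved. 1 note(s) so far.",
};

const readOnlyPlan = batchMode([weatherCall, weatherCall]);
const mixedPlan = batchMode([weatherCall, rememberCall]);
```

In this sketch a batch of two weather calls is planned as parallel, while adding a single remember call switches the whole batch to sequential.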

Step 4 — Reach for a Built-in Tool

You rarely start from nothing. The SDK ships the common agent tools — shell, code_execution, search, file read/write, web fetch/search, a browser — but with a deliberate split: the SDK owns the model-facing contract (the name, the schema, the result formatting), and you supply the backend that does the work. The highlighted lines add the built-in shell over a Bun-based backend:

examples/tool-tutorial/step4.ts

import {
  AgentEventType,
  defineTool,
  ExecutionMode,
  runAgent,
  SessionMemoryStore,
  shellTool, 
  ToolRegistry,
} from "@open-agent-loops/core";
import type { AgentEvent } from "@open-agent-loops/core";
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { bunShellBackend } from "./bun-backends"; 
import { createInterface } from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";
import { z } from "zod";

const apiKey = process.env.LLM_API_KEY;
if (!apiKey) {
  console.error("Set LLM_API_KEY (see .env.example).");
  process.exit(1);
}

// DeepSeek V4 tool-calls cleanly; GLM emits broken empty-key tool args.
const model = new OpenAICompatibleModel({
  apiKey,
  baseURL: process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1",
  model: process.env.LLM_MODEL ?? "deepseek-ai/DeepSeek-V4-Flash",
  thinking: "on",
});

const KNOWN: Record<string, string> = {
  Paris: "Sunny, 21°C",
  Tokyo: "Rainy, 18°C",
};

const weather = defineTool({
  name: "weather",
  description: "Get the current weather for a city.",
  parameters: z.object({
    city: z.string().describe('City to look up, e.g. "Paris".'),
  }),
  execute: async ({ city }) => {
    const report = Object.hasOwn(KNOWN, city) ? KNOWN[city] : undefined;
    if (!report) {
      throw new Error(`No weather for "${city}". Known cities: ${Object.keys(KNOWN).join(", ")}.`);
    }
    return { content: `${report} in ${city}.` };
  },
});

const notes: string[] = [];
const remember = defineTool({
  name: "remember",
  description: "Save a short note for later in the conversation.",
  parameters: z.object({ note: z.string().describe("The note to save.") }),
  execute: async ({ note }) => {
    notes.push(note);
    return { content: `Saved. ${notes.length} note(s) so far.` };
  },
  executionMode: ExecutionMode.Sequential,
});

// A built-in tool: the SDK ships the `shell` contract; you bring the backend
// that runs the command (here Bun's, host glue you provide).
const shell = shellTool(bunShellBackend());

const registry = new ToolRegistry([weather, remember, shell]); 

// Render every event the loop emits, including the dimmed reasoning channel.
function render(e: AgentEvent) {
  switch (e.type) {
    case AgentEventType.ReasoningDelta:
      process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
      break;
    case AgentEventType.TextDelta:
      process.stdout.write(e.text);
      break;
    case AgentEventType.ToolStart:
      console.log(`→ ${e.toolName}(${JSON.stringify(e.args)})`);
      break;
    case AgentEventType.ToolEnd:
      console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
      break;
  }
}

// Multi-turn: one memory + one sessionId, reused every turn.
const memory = new SessionMemoryStore();
const sessionId = "tool-tutorial";
const rl = createInterface({ input, output });

while (true) {
  const prompt = (await rl.question("\nyou › ")).trim();
  if (prompt === "" || prompt === "exit") break;

  process.stdout.write("bot › ");
  await runAgent({
    model,
    memory,
    sessionId,
    prompt,
    tools: registry.list(),
    onEvent: render,
  });
}
rl.close();

bun run examples/tool-tutorial/step4.ts
you › list the files in the current directory

shellTool gives the model a stable shell tool; bunShellBackend() is the host glue you bring. That boundary is the point: anything that runs a command, reads a file, or hits the network should be your code, not the SDK's — so each built-in (codeExecutionTool, searchTool, readTool / globTool, editTool / writeTool, webFetchTool / webSearchTool, browserTools) takes a backend interface you implement. The exception is the planning tools, which are pure in-memory and ship ready to use.
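
One way to picture that boundary is a tool factory that fixes the model-facing parts and delegates the work to whatever backend it is given. The interface and names below (`SketchShellBackend`, `sketchShellTool`, the two sample backends) are hypothetical and much simpler than the SDK's real backend interfaces.

```typescript
// Hypothetical backend interface: the host decides how a command really runs.
interface SketchShellBackend {
  run(command: string): string;
}

interface SketchShellTool {
  name: string;
  execute(args: { command: string }): { content: string };
}

// "SDK side": a fixed name and result shape. It never runs anything itself.
function sketchShellTool(backend: SketchShellBackend): SketchShellTool {
  return {
    name: "shell",
    execute: ({ command }) => ({ content: backend.run(command) }),
  };
}

// "Host side", option 1: canned output, handy in tests.
const cannedBackend: SketchShellBackend = {
  run: (command) => (command === "ls" ? "README.md\nsrc" : ""),
};

// "Host side", option 2: record the command instead of running it.
const recordedCommands: string[] = [];
const dryRunBackend: SketchShellBackend = {
  run(command) {
    recordedCommands.push(command);
    return `(dry run) ${command}`;
  },
};

const cannedTool = sketchShellTool(cannedBackend);
const dryRunTool = sketchShellTool(dryRunBackend);
const viaCanned = cannedTool.execute({ command: "ls" }).content;
const viaDryRun = dryRunTool.execute({ command: "ls" }).content;
```

Both tools present the same name and argument shape, so swapping the backend changes what happens on the host without changing anything the model sees.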

Two ways to run code

Two built-ins run model-written instructions, in different shapes. shell runs one command string; code_execution runs a snippet in a named language ({ language, code }). They differ on isolation: bunShellBackend runs unsandboxed on the host, while the shipped denoCodeExecutionBackend executes JavaScript/TypeScript through Deno's deny-by-default permission sandbox, which gives real isolation with no extra infrastructure (grant file, network, or env access explicitly via allow). Either way the result ends with an explicit [exit 0: ok] / [exit N: error] verdict, so even a run that prints nothing still returns a non-empty result the model can read.
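
The trailing verdict is easy to sketch. Only the two verdict strings come from the text above; the `formatRun` name and the layout around the verdict are assumptions.

```typescript
// Hypothetical formatter: append an explicit verdict to whatever the run printed.
function formatRun(output: string, exitCode: number): string {
  const verdict = exitCode === 0 ? "[exit 0: ok]" : `[exit ${exitCode}: error]`;
  const body = output.replace(/\s+$/, ""); // drop trailing whitespace/newlines
  return body === "" ? verdict : `${body}\n${verdict}`;
}

// A command that prints nothing still yields a readable result.
const silentSuccess = formatRun("", 0);
const failedRun = formatRun("ls: nope: No such file or directory\n", 2);
```

In this sketch `silentSuccess` is just the verdict line, and `failedRun` ends with `[exit 2: error]` after the captured output.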

Step 5 — Gate a Tool Behind Approval

Some calls are risky enough that a human should sign off first — running a shell command, deploying, deleting. Admission is a seam separate from the tool itself: the gateToolCalls hook sees the whole turn's calls before any of them run and decides allow / deny / ask. It's batteries-included — pair the shipped permissionGate with an InMemoryPermissionStore (the policy) and an ApprovalPrompter (how you ask). The highlighted lines let weather and remember run freely and prompt before shell:

examples/tool-tutorial/step5.ts

import {
  AgentEventType,
  ApprovalChoice, 
  defineTool,
  ExecutionMode,
  InMemoryPermissionStore, 
  permissionGate, 
  PermissionPolicy, 
  runAgent,
  SessionMemoryStore,
  shellTool,
  ToolRegistry,
} from "@open-agent-loops/core";
import type { AgentEvent, ApprovalPrompter } from "@open-agent-loops/core"; 
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { bunShellBackend } from "./bun-backends";
import { createInterface } from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";
import { z } from "zod";

const apiKey = process.env.LLM_API_KEY;
if (!apiKey) {
  console.error("Set LLM_API_KEY (see .env.example).");
  process.exit(1);
}

// DeepSeek V4 tool-calls cleanly; GLM emits broken empty-key tool args.
const model = new OpenAICompatibleModel({
  apiKey,
  baseURL: process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1",
  model: process.env.LLM_MODEL ?? "deepseek-ai/DeepSeek-V4-Flash",
  thinking: "on",
});

const KNOWN: Record<string, string> = {
  Paris: "Sunny, 21°C",
  Tokyo: "Rainy, 18°C",
};

const weather = defineTool({
  name: "weather",
  description: "Get the current weather for a city.",
  parameters: z.object({
    city: z.string().describe('City to look up, e.g. "Paris".'),
  }),
  execute: async ({ city }) => {
    const report = Object.hasOwn(KNOWN, city) ? KNOWN[city] : undefined;
    if (!report) {
      throw new Error(`No weather for "${city}". Known cities: ${Object.keys(KNOWN).join(", ")}.`);
    }
    return { content: `${report} in ${city}.` };
  },
});

const notes: string[] = [];
const remember = defineTool({
  name: "remember",
  description: "Save a short note for later in the conversation.",
  parameters: z.object({ note: z.string().describe("The note to save.") }),
  execute: async ({ note }) => {
    notes.push(note);
    return { content: `Saved. ${notes.length} note(s) so far.` };
  },
  executionMode: ExecutionMode.Sequential,
});

// The SDK ships the `shell` contract; you bring the backend that runs the command.
const shell = shellTool(bunShellBackend());

const registry = new ToolRegistry([weather, remember, shell]);

// Render every event the loop emits, including the dimmed reasoning channel.
function render(e: AgentEvent) {
  switch (e.type) {
    case AgentEventType.ReasoningDelta:
      process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
      break;
    case AgentEventType.TextDelta:
      process.stdout.write(e.text);
      break;
    case AgentEventType.ToolStart:
      console.log(`→ ${e.toolName}(${JSON.stringify(e.args)})`);
      break;
    case AgentEventType.ToolEnd:
      console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
      break;
  }
}

// Multi-turn: one memory + one sessionId, reused every turn.
const memory = new SessionMemoryStore();
const sessionId = "tool-tutorial";
const rl = createInterface({ input, output });

// Pre-approve the low-risk tools (weather, remember); ask before the shell runs anything.
const permissions = new InMemoryPermissionStore({
  fallback: PermissionPolicy.Ask,
  rules: { weather: PermissionPolicy.Allow, remember: PermissionPolicy.Allow },
});
// A terminal prompter: show the call (name + args) and ask y/N.
const prompter: ApprovalPrompter = {
  async ask(batch) {
    const choices: ApprovalChoice[] = [];
    for (const { toolCall, args } of batch) {
      const question = `\n🔐 allow ${toolCall.function.name}(${JSON.stringify(args)})? [y/N] `;
      const answer = (await rl.question(question)).trim().toLowerCase();
      const ok = answer === "y";
      choices.push(ok ? ApprovalChoice.AllowOnce : ApprovalChoice.DenyOnce);
    }
    return choices;
  },
};
const gate = permissionGate(permissions, prompter); 

while (true) {
  const prompt = (await rl.question("\nyou › ")).trim();
  if (prompt === "" || prompt === "exit") break;

  process.stdout.write("bot › ");
  await runAgent({
    model,
    memory,
    sessionId,
    prompt,
    tools: registry.list(),
    hooks: { gateToolCalls: gate }, 
    onEvent: render,
  });
}
rl.close();

bun run examples/tool-tutorial/step5.ts
you › list the files in the current directory

Ask about the weather and it just runs; ask it to touch the filesystem and you get a [y/N] before shell executes. The whole turn's calls arrive together, so the gate runs once per turn — ahead of the parallel execution phase, so the prompt never races a running tool. A denied call never runs; it comes back as an error tool-result the model can react to. Return ApprovalChoice.AllowAlways / DenyAlways instead of the "once" variants and permissionGate persists that policy to the store, so the next turn won't ask again.
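
The gate's decision flow can be approximated in a few lines. Everything named here is a stand-in: `SketchPolicyStore`, `sketchGate`, and the string literals for policies and choices are hypothetical, the prompter is synchronous, and the real gateToolCalls hook signature is not reproduced.

```typescript
type GatePolicy = "allow" | "deny" | "ask";
type GateChoice = "allow-once" | "deny-once" | "allow-always" | "deny-always";

interface PendingCall {
  toolName: string;
  args: unknown;
}

// Hypothetical policy store: per-tool rules plus a fallback for everything else.
class SketchPolicyStore {
  private readonly rules: Map<string, GatePolicy>;
  private readonly fallback: GatePolicy;

  constructor(fallback: GatePolicy, rules: Array<[string, GatePolicy]>) {
    this.fallback = fallback;
    this.rules = new Map(rules);
  }

  get(toolName: string): GatePolicy {
    return this.rules.get(toolName) ?? this.fallback;
  }

  set(toolName: string, policy: GatePolicy): void {
    this.rules.set(toolName, policy);
  }
}

type Asker = (calls: PendingCall[]) => GateChoice[];

// Runs once per turn, before any call executes. Returns true where a call may run.
function sketchGate(store: SketchPolicyStore, ask: Asker, batch: PendingCall[]): boolean[] {
  const pending = batch.filter((call) => store.get(call.toolName) === "ask");
  const choices = pending.length > 0 ? ask(pending) : [];
  const decided = new Map<PendingCall, boolean>();
  pending.forEach((call, i) => {
    const choice = choices[i] ?? "deny-once"; // a missing answer counts as a denial
    if (choice === "allow-always") store.set(call.toolName, "allow");
    if (choice === "deny-always") store.set(call.toolName, "deny");
    decided.set(call, choice === "allow-once" || choice === "allow-always");
  });
  return batch.map((call) => decided.get(call) ?? (store.get(call.toolName) === "allow"));
}

const policyStore = new SketchPolicyStore("ask", [
  ["weather", "allow"],
  ["remember", "allow"],
]);
let timesAsked = 0;
const alwaysAllow: Asker = (calls) => {
  timesAsked += 1;
  return calls.map((): GateChoice => "allow-always");
};

const turnCalls: PendingCall[] = [
  { toolName: "weather", args: { city: "Paris" } },
  { toolName: "shell", args: { command: "ls" } },
];
const firstTurn = sketchGate(policyStore, alwaysAllow, turnCalls);
const secondTurn = sketchGate(policyStore, alwaysAllow, turnCalls);
```

In this sketch the first turn prompts once for shell and stores the "always" answer, so the second turn admits both calls without prompting.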

Recap

Starting from a plain chat loop, a few lines at a time you gave the model a tool, let it recover from a tool's error instead of crashing, organized many tools in a registry and ran a stateful one safely, handed it a built-in tool over a backend you own, then gated its riskiest call behind your approval. The loop never changed — tools are just values you pass it.

From here:

Skills bundle tools with instructions the model loads on demand — the next level of progressive disclosure.
Planning Tools are batteries-included tools that give the model durable working memory.
Permissions & Credentials goes deeper on the gate you just used — persisting "always" choices, writing the ApprovalPrompter — and adds credentials: feeding a tool a secret the model never sees.
The API reference has exact signatures for defineTool, ToolRegistry, and every built-in tool and backend.