Planning Tools

A short tutorial — give the model durable working memory with a to-do list and a scratchpad, then freeze its plan into a replayable workflow.

A long, multi-step task needs the model to hold a plan it can revise over many turns. The model's own thinking can carry some of this: interleaved/extended thinking keeps reasoning across turns, and this SDK resends it on tool-call turns. But thinking is a fragile place to store a plan. Whether it survives depends on the provider and config, it can come back summarized or encrypted rather than as re-readable text, and it is freeform prose, not queryable state you can persist. A tool result is durable, structured, and yours to keep, so the robust way to give the model working memory is to have it write the plan through a tool. Two batteries-included tool sets do exactly that:

Tool set	Tools it returns	What it's for
`todoListTools()`	`todo_append`, `todo_list`, `todo_update`	A task tracker with a `pending → in_progress → done` (plus `cancelled` / `failed`) lifecycle and retry counting.
`scratchpadTools()`	`read_scratchpad`, `write_scratchpad`	One free-form text slot for a plan or running notes.
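
To make the lifecycle in the table concrete, here is a minimal sketch of a to-do item and a status transition with retry counting. The `TodoItem` shape, the `retries` field, the `allowed` table, and the `transition` helper are all assumptions made for illustration. They are not the SDK's types, and the moves the real `todo_update` permits (and when it counts a retry) may differ.

```typescript
// Hypothetical model of the lifecycle named in the table; not the SDK's types.
type TodoStatus = "pending" | "in_progress" | "done" | "cancelled" | "failed";

interface TodoItem {
  id: string;
  content: string;
  status: TodoStatus;
  retries: number; // assumption: bumped each time a failed item is restarted
}

// Assumed transition table: which statuses each status may move to.
const allowed: Record<TodoStatus, TodoStatus[]> = {
  pending: ["in_progress", "cancelled"],
  in_progress: ["done", "failed", "cancelled"],
  failed: ["in_progress", "cancelled"], // restarting a failed item is a retry
  done: [],
  cancelled: [],
};

function transition(item: TodoItem, next: TodoStatus): TodoItem {
  if (!allowed[item.status].includes(next)) {
    throw new Error(`Cannot move "${item.id}" from ${item.status} to ${next}`);
  }
  const isRetry = item.status === "failed" && next === "in_progress";
  return { ...item, status: next, retries: isRetry ? item.retries + 1 : item.retries };
}

// Walk one item through a failure and a successful retry.
let item: TodoItem = { id: "slug", content: "Make a URL slug", status: "pending", retries: 0 };
for (const next of ["in_progress", "failed", "in_progress", "done"] as const) {
  item = transition(item, next);
}
console.log(item.status, item.retries); // done 1
```

Returning a new object from `transition` rather than mutating in place is a choice made for this sketch; it keeps each state easy to log or snapshot.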

This tutorial grows one program across four steps:

1. Track work with a to-do list the model maintains itself.
2. Add a scratchpad for free-form notes.
3. Author a workflow: hand the agent a prose brief, let it decompose and run, then freeze the plan.
4. Replay that frozen workflow deterministically.

Each listing is the complete program for its step, so steps 2–4 repeat the shared setup and differ only in the lines that step adds.

Step 1 — Track work with a to-do list

Construct an InMemoryTodoStore once (like memory, reusing it is what makes the list durable) and spread todoListTools(store) into the tool list. Give the model a multi-step job and it appends a step per task, flips each to in_progress, does it, and marks it done:

examples/planning-tutorial/step1.ts

import {
  AgentEventType,
  defineTool,
  formatTodoList,
  InMemoryTodoStore,
  runAgent,
  SessionMemoryStore,
  todoListTools,
} from "@open-agent-loops/core";
import type { AgentEvent } from "@open-agent-loops/core";
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { z } from "zod";

const apiKey = process.env.LLM_API_KEY;
if (!apiKey) {
  console.error("Set LLM_API_KEY (see .env.example).");
  process.exit(1);
}

// DeepSeek V4 tool-calls cleanly; GLM emits broken empty-key tool args.
const model = new OpenAICompatibleModel({
  apiKey,
  baseURL: process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1",
  model: process.env.LLM_MODEL ?? "deepseek-ai/DeepSeek-V4-Flash",
  thinking: "on",
});

// Three small tools the task needs. Pure functions — the focus is the planning,
// not these.
const wordCount = defineTool({
  name: "word_count",
  description: "Count the words in a block of text.",
  parameters: z.object({ text: z.string().describe("The text to measure.") }),
  execute: ({ text }) => ({ content: `${text.trim().split(/\s+/).filter(Boolean).length} words.` }),
});
const slugify = defineTool({
  name: "slugify",
  description: "Turn a title into a lowercase, hyphenated URL slug.",
  parameters: z.object({ title: z.string().describe("The title to slugify.") }),
  execute: ({ title }) => ({
    content: title.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, ""),
  }),
});
const saveDraft = defineTool({
  name: "save_draft",
  description: "Save a post draft under its slug. Returns the file path.",
  parameters: z.object({
    slug: z.string().describe("URL slug; used as the filename."),
    body: z.string().describe("The post body to save."),
  }),
  // Stub: returns the would-be path without touching disk, so the example stays side-effect free.
  execute: ({ slug }) => ({ content: `Saved draft to drafts/${slug}.md` }),
});

// Construct the to-do store ONCE. Like `memory`, reusing it is what makes the
// list durable working memory the model can revise across turns.
const todos = new InMemoryTodoStore();

const system = [
  "You are a task agent. For multi-step work, keep a to-do list: append a step per",
  "task, set it in_progress before you start, and done when you finish. Use the",
  "tools provided instead of doing the work in your head.",
].join("\n");

const post = [
  "Title: Hello, Agents",
  "",
  "Body: Agents are small programs that decide what to do next by calling tools",
  "in a loop. Give one a goal and a few capabilities and it will plan, act, and",
  "check its own work until the goal is met.",
].join("\n");

// Render every event the loop emits, including the dimmed reasoning channel.
function render(e: AgentEvent) {
  switch (e.type) {
    case AgentEventType.ReasoningDelta:
      process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
      break;
    case AgentEventType.TextDelta:
      process.stdout.write(e.text);
      break;
    case AgentEventType.ToolStart:
      console.log(`→ ${e.toolName}(${JSON.stringify(e.args)})`);
      break;
    case AgentEventType.ToolEnd:
      console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
      break;
  }
}

await runAgent({
  model,
  memory: new SessionMemoryStore(),
  sessionId: "planning-step1",
  system,
  // The to-do tools sit alongside your own tools — all plain Tool[].
  tools: [wordCount, slugify, saveDraft, ...todoListTools(todos)],
  prompt: `Get this blog post ready to publish — measure its length, make a URL slug from the title, and save the draft:\n\n${post}`,
  maxSteps: 15,
  onEvent: render,
});

// The list is canonical state, queried fresh from the store — not reconstructed.
console.log(`\n${formatTodoList(todos.read(true))}`);

bun run examples/planning-tutorial/step1.ts

The list survives across the loop's turns even when the reasoning between them does not, which is the point. At the end the program reads it straight from the store with todos.read(...): it is canonical state, not something reconstructed from the transcript.
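
Because the list is plain state rather than transcript text, the host program can act on it once the run finishes. The sketch below assumes each item carries `id`, `content`, and a `status` string. The `id` and `content` fields appear in step 3; the `status` field and the `unfinished` helper are assumptions, and the sample array stands in for whatever `todos.read(...)` actually returns.

```typescript
// Assumed item shape; the real store's items may carry more fields.
interface Item {
  id: string;
  content: string;
  status: string;
}

// Return every item that did not reach "done".
function unfinished(items: Item[]): Item[] {
  return items.filter((i) => i.status !== "done");
}

// Stand-in for the items read from the store after runAgent resolves.
const afterRun: Item[] = [
  { id: "count", content: "Measure the post length", status: "done" },
  { id: "slug", content: "Make a URL slug", status: "done" },
  { id: "save", content: "Save the draft", status: "failed" },
];

const left = unfinished(afterRun);
if (left.length > 0) {
  console.log(`Unfinished: ${left.map((i) => `${i.id} (${i.status})`).join(", ")}`);
}
```

A check like this could gate a follow-up run or a CI step, which would be much harder to do reliably by parsing the model's prose.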

Step 2 — Add a scratchpad

A to-do list is structured (one row per task); a scratchpad is unstructured (one block of prose). Construct an InMemoryScratchpad and spread scratchpadTools(scratchpad) in too — the highlighted lines are all it takes:

examples/planning-tutorial/step2.ts

import {
  AgentEventType,
  defineTool,
  formatTodoList,
  InMemoryScratchpad, // added in step 2
  InMemoryTodoStore,
  runAgent,
  scratchpadTools, // added in step 2
  SessionMemoryStore,
  todoListTools,
} from "@open-agent-loops/core";
import type { AgentEvent } from "@open-agent-loops/core";
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { z } from "zod";

const apiKey = process.env.LLM_API_KEY;
if (!apiKey) {
  console.error("Set LLM_API_KEY (see .env.example).");
  process.exit(1);
}

// DeepSeek V4 tool-calls cleanly; GLM emits broken empty-key tool args.
const model = new OpenAICompatibleModel({
  apiKey,
  baseURL: process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1",
  model: process.env.LLM_MODEL ?? "deepseek-ai/DeepSeek-V4-Flash",
  thinking: "on",
});

// Three small tools the task needs. Pure functions — the focus is the planning,
// not these.
const wordCount = defineTool({
  name: "word_count",
  description: "Count the words in a block of text.",
  parameters: z.object({ text: z.string().describe("The text to measure.") }),
  execute: ({ text }) => ({ content: `${text.trim().split(/\s+/).filter(Boolean).length} words.` }),
});
const slugify = defineTool({
  name: "slugify",
  description: "Turn a title into a lowercase, hyphenated URL slug.",
  parameters: z.object({ title: z.string().describe("The title to slugify.") }),
  execute: ({ title }) => ({
    content: title.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, ""),
  }),
});
const saveDraft = defineTool({
  name: "save_draft",
  description: "Save a post draft under its slug. Returns the file path.",
  parameters: z.object({
    slug: z.string().describe("URL slug; used as the filename."),
    body: z.string().describe("The post body to save."),
  }),
  execute: ({ slug }) => ({ content: `Saved draft to drafts/${slug}.md` }),
});

// Construct each planning store ONCE. Like `memory`, reusing them is what makes
// the list and the notes durable working memory across turns.
const todos = new InMemoryTodoStore();
const scratchpad = new InMemoryScratchpad(); // added in step 2

const system = [
  "You are a task agent. For multi-step work, keep a to-do list: append a step per",
  "task, set it in_progress before you start, and done when you finish. Use the",
  "tools provided instead of doing the work in your head.",
  // Added in step 2: tell the model to use the scratchpad.
  "Jot intermediate findings (the word count, the slug) to your scratchpad as you go.",
].join("\n");

const post = [
  "Title: Hello, Agents",
  "",
  "Body: Agents are small programs that decide what to do next by calling tools",
  "in a loop. Give one a goal and a few capabilities and it will plan, act, and",
  "check its own work until the goal is met.",
].join("\n");

// Render every event the loop emits, including the dimmed reasoning channel.
function render(e: AgentEvent) {
  switch (e.type) {
    case AgentEventType.ReasoningDelta:
      process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
      break;
    case AgentEventType.TextDelta:
      process.stdout.write(e.text);
      break;
    case AgentEventType.ToolStart:
      console.log(`→ ${e.toolName}(${JSON.stringify(e.args)})`);
      break;
    case AgentEventType.ToolEnd:
      console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
      break;
  }
}

await runAgent({
  model,
  memory: new SessionMemoryStore(),
  sessionId: "planning-step2",
  system,
  // Spread each set into the tool list alongside your own tools.
  tools: [
    wordCount,
    slugify,
    saveDraft,
    ...todoListTools(todos),
    ...scratchpadTools(scratchpad), // added in step 2
  ],
  prompt: `Get this blog post ready to publish — measure its length, make a URL slug from the title, and save the draft:\n\n${post}`,
  maxSteps: 15,
  onEvent: render,
});

console.log(`\n${formatTodoList(todos.read(true))}`);
console.log(`\nScratchpad:\n${scratchpad.read()}`);

bun run examples/planning-tutorial/step2.ts

Now the model jots intermediate findings (the word count, the slug) as it works and can re-read them later.

Two layers of memory

These stores and memory persist different things. memory records the transcript — every message, including each tool call and its result. The to-do and scratchpad stores hold the current canonical state, queried fresh with todo_list / read_scratchpad instead of reconstructed from history. Both sets run their tools sequentially, since the tools in a set share one mutable store. And like SessionMemoryStore, the in-memory stores are a swappable seam — back TodoStore / Scratchpad with a file or DB to share a plan across sessions or survive a restart.
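
As an illustration of that seam, here is a minimal file-backed scratchpad. The `ScratchpadLike` interface with `read()` and `write(text)` is an assumption (the tutorial only shows `scratchpad.read()`), so check the SDK's actual `Scratchpad` type, including whether its methods are synchronous, before adapting it.

```typescript
import { existsSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Assumed shape of the scratchpad seam; the SDK's real interface may differ.
interface ScratchpadLike {
  read(): string;
  write(text: string): void;
}

// Keeps the single text slot in a file so it survives a process restart.
class FileScratchpad implements ScratchpadLike {
  private readonly path: string;

  constructor(path: string) {
    this.path = path;
  }

  // An unwritten scratchpad reads as empty rather than throwing.
  read(): string {
    return existsSync(this.path) ? readFileSync(this.path, "utf8") : "";
  }

  // Overwrites the whole slot, matching the "one free-form text slot" model.
  write(text: string): void {
    writeFileSync(this.path, text, "utf8");
  }
}

// Two instances on the same path stand in for two separate sessions.
const padPath = join(mkdtempSync(join(tmpdir(), "scratchpad-")), "notes.txt");
new FileScratchpad(padPath).write("slug: hello-agents\nnext: save the draft");
const restored = new FileScratchpad(padPath).read();
console.log(restored);
```

Synchronous file calls keep the sketch short. A store shared between concurrent sessions would also need locking or atomic writes, which this example leaves out.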

Step 3 — Author a workflow from a brief

So far the model wrote its own list. Lean into that: hand it a brief — a prose spec — and it decomposes the brief into steps with todo_append, then works them. Capture the list it built and you have a workflow: a serializable goal, instructions, ordered steps, and the names of the tools those steps need. The highlighted lines read the brief and serialize the plan:

examples/planning-tutorial/step3.ts

import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { fileURLToPath } from "node:url";
import {
  AgentEventType,
  defineTool,
  InMemoryTodoStore,
  runAgent,
  SessionMemoryStore,
  todoListTools,
} from "@open-agent-loops/core";
import type { AgentEvent } from "@open-agent-loops/core";
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { z } from "zod";

const apiKey = process.env.LLM_API_KEY;
if (!apiKey) {
  console.error("Set LLM_API_KEY (see .env.example).");
  process.exit(1);
}

const model = new OpenAICompatibleModel({
  apiKey,
  baseURL: process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1",
  model: process.env.LLM_MODEL ?? "deepseek-ai/DeepSeek-V4-Flash",
  thinking: "on",
});

const wordCount = defineTool({
  name: "word_count",
  description: "Count the words in a block of text.",
  parameters: z.object({ text: z.string().describe("The text to measure.") }),
  execute: ({ text }) => ({ content: `${text.trim().split(/\s+/).filter(Boolean).length} words.` }),
});
const slugify = defineTool({
  name: "slugify",
  description: "Turn a title into a lowercase, hyphenated URL slug.",
  parameters: z.object({ title: z.string().describe("The title to slugify.") }),
  execute: ({ title }) => ({
    content: title.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, ""),
  }),
});
const saveDraft = defineTool({
  name: "save_draft",
  description: "Save a post draft under its slug. Returns the file path.",
  parameters: z.object({
    slug: z.string().describe("URL slug; used as the filename."),
    body: z.string().describe("The post body to save."),
  }),
  execute: ({ slug }) => ({ content: `Saved draft to drafts/${slug}.md` }),
});

// Read the prose brief — the workflow's source form, authored by a human.
const here = fileURLToPath(new URL(".", import.meta.url));
const brief = readFileSync(join(here, "blog-post.brief.md"), "utf8");
const title = brief.match(/^#\s+(.+)$/m)?.[1] ?? "workflow";
const goal = brief.split("\n").find((l) => l.trim() && !l.startsWith("#"))?.trim() ?? title;

// The agent plans into THIS store (empty — it fills it), then works the plan.
const todos = new InMemoryTodoStore();

// A planner-executor prompt: decompose first with todo_append, then execute.
const system = [
  "You are a workflow planner-executor. Work in two phases:",
  "1. PLAN: break the brief into an ordered list of concrete, single-action steps,",
  "   recording each with todo_append (status pending) and a short id.",
  "2. EXECUTE: work them in order — todo_update to in_progress, do it with the",
  "   right tool, todo_update to done. Keep step text generic so the plan reuses.",
].join("\n");

const post = [
  "Title: Hello, Agents",
  "",
  "Body: Agents are small programs that decide what to do next by calling tools",
  "in a loop. Give one a goal and a few capabilities and it will plan, act, and",
  "check its own work until the goal is met.",
].join("\n");

function render(e: AgentEvent) {
  switch (e.type) {
    case AgentEventType.ReasoningDelta:
      process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
      break;
    case AgentEventType.TextDelta:
      process.stdout.write(e.text);
      break;
    case AgentEventType.ToolStart:
      console.log(`→ ${e.toolName}(${JSON.stringify(e.args)})`);
      break;
    case AgentEventType.ToolEnd:
      console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
      break;
  }
}

await runAgent({
  model,
  memory: new SessionMemoryStore(),
  sessionId: "planning-step3",
  system,
  tools: [wordCount, slugify, saveDraft, ...todoListTools(todos)],
  prompt: `Brief:\n${brief}\n\nPost to process:\n${post}`,
  maxSteps: 15,
  onEvent: render,
});

// Compile: the to-do list the AGENT built becomes a frozen, replayable workflow.
const workflow = { 
  name: title.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, ""),
  goal,
  instructions:
    "Work the steps in order: set each in_progress, do it with the named tool, then done.",
  toolNames: ["word_count", "slugify", "save_draft"],
  steps: todos.read(true).map(({ id, content }) => ({ id, content })),
};
const out = join(mkdtempSync(join(tmpdir(), "workflow-")), "blog-post.workflow.json"); 
writeFileSync(out, JSON.stringify(workflow, null, 2)); 
console.log(`\nThe agent decomposed the brief into ${workflow.steps.length} steps.`);
console.log(`Compiled workflow written to ${out} — replay it in step 4.`);

bun run examples/planning-tutorial/step3.ts

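The agent does the decomposition; we just freeze the result: prose → plan → artifact. JSON of that shape is the input to the next step. Note that this script writes to a fresh temp directory, while step 4 reads blog-post.workflow.json from its own directory, so copy the file over if you want to replay your own run.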
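
The `title` and `goal` that step 3 pulls from the brief come from two small string operations: the first Markdown heading, and the first non-empty line that is not a heading. The sketch below applies the same expressions to a made-up brief. The brief text is hypothetical, since the tutorial does not show the contents of `blog-post.brief.md`.

```typescript
// Hypothetical brief; the real blog-post.brief.md is not shown in this tutorial.
const sampleBrief = [
  "# Blog Post Prep",
  "",
  "Get a blog post ready to publish.",
  "",
  "Measure its length, make a slug from the title, and save the draft.",
].join("\n");

// Same expressions as step 3: the first "# heading" line becomes the title...
const briefTitle = sampleBrief.match(/^#\s+(.+)$/m)?.[1] ?? "workflow";

// ...and the first non-blank, non-heading line becomes the goal.
const briefGoal =
  sampleBrief.split("\n").find((l) => l.trim() && !l.startsWith("#"))?.trim() ?? briefTitle;

// The workflow name is the title, slugified the same way the slugify tool does it.
const workflowName = briefTitle
  .toLowerCase()
  .replace(/[^a-z0-9]+/g, "-")
  .replace(/^-+|-+$/g, "");

console.log(briefTitle); // Blog Post Prep
console.log(briefGoal); // Get a blog post ready to publish.
console.log(workflowName); // blog-post-prep
```

If the brief has no heading, the title falls back to `"workflow"`, and if it has no body line, the goal falls back to the title.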

Step 4 — Replay the frozen workflow

A workflow isn't a new primitive — there's nothing to add to the loop. It hydrates entirely onto seams you already have:

Serialized field	Hydrates into	Via
`steps`	a pre-seeded to-do list	`store.append(...)` per step
`toolNames`	the tools for this run	`ToolRegistry.resolve(names)`
`goal` + `instructions`	the system prompt	`runAgent({ system })`

So replaying is: deserialize, seed the store, resolve the tool names, run. No planning phase — the model executes a fixed, reviewable plan. The highlighted lines are that hydration:

examples/planning-tutorial/step4.ts

import { readFileSync } from "node:fs";
import { join } from "node:path";
import { fileURLToPath } from "node:url";
import {
  AgentEventType,
  defineTool,
  formatTodoList,
  InMemoryTodoStore,
  runAgent,
  SessionMemoryStore,
  todoListTools,
  ToolRegistry,
} from "@open-agent-loops/core";
import type { AgentEvent } from "@open-agent-loops/core";
import { OpenAICompatibleModel } from "@open-agent-loops/core/providers/openai";
import { z } from "zod";

const apiKey = process.env.LLM_API_KEY;
if (!apiKey) {
  console.error("Set LLM_API_KEY (see .env.example).");
  process.exit(1);
}

const model = new OpenAICompatibleModel({
  apiKey,
  baseURL: process.env.LLM_BASE_URL ?? "https://api.featherless.ai/v1",
  model: process.env.LLM_MODEL ?? "deepseek-ai/DeepSeek-V4-Flash",
  thinking: "on",
});

const wordCount = defineTool({
  name: "word_count",
  description: "Count the words in a block of text.",
  parameters: z.object({ text: z.string().describe("The text to measure.") }),
  execute: ({ text }) => ({ content: `${text.trim().split(/\s+/).filter(Boolean).length} words.` }),
});
const slugify = defineTool({
  name: "slugify",
  description: "Turn a title into a lowercase, hyphenated URL slug.",
  parameters: z.object({ title: z.string().describe("The title to slugify.") }),
  execute: ({ title }) => ({
    content: title.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, ""),
  }),
});
const saveDraft = defineTool({
  name: "save_draft",
  description: "Save a post draft under its slug. Returns the file path.",
  parameters: z.object({
    slug: z.string().describe("URL slug; used as the filename."),
    body: z.string().describe("The post body to save."),
  }),
  execute: ({ slug }) => ({ content: `Saved draft to drafts/${slug}.md` }),
});

// A name → tool resolver. resolve(names) throws on an unknown name.
const registry = new ToolRegistry([wordCount, slugify, saveDraft]);

// Deserialize the frozen workflow (the kind step 3 produced).
const here = fileURLToPath(new URL(".", import.meta.url));
const workflow = JSON.parse(readFileSync(join(here, "blog-post.workflow.json"), "utf8"));

// Hydrate onto existing seams: names → tools, steps → a pre-seeded list.
const stepTools = registry.resolve(workflow.toolNames); 
const todos = new InMemoryTodoStore(); 
for (const step of workflow.steps) todos.append(step.id, step.content, "pending"); 

const system = [
  `You are executing the "${workflow.name}" workflow.`,
  `Goal: ${workflow.goal}`,
  workflow.instructions,
  "Your to-do list is pre-seeded with the steps — call todo_list to see them.",
].join("\n");

const post = [
  "Title: Hello, Agents",
  "",
  "Body: Agents are small programs that decide what to do next by calling tools",
  "in a loop. Give one a goal and a few capabilities and it will plan, act, and",
  "check its own work until the goal is met.",
].join("\n");

function render(e: AgentEvent) {
  switch (e.type) {
    case AgentEventType.ReasoningDelta:
      process.stdout.write(`\x1b[2m${e.text}\x1b[22m`);
      break;
    case AgentEventType.TextDelta:
      process.stdout.write(e.text);
      break;
    case AgentEventType.ToolStart:
      console.log(`→ ${e.toolName}(${JSON.stringify(e.args)})`);
      break;
    case AgentEventType.ToolEnd:
      console.log(`← ${e.toolName} [${e.isError ? "error" : "ok"}]: ${e.result}`);
      break;
  }
}

await runAgent({
  model,
  memory: new SessionMemoryStore(),
  sessionId: "planning-step4",
  system,
  tools: [...todoListTools(todos), ...stepTools],
  prompt: post,
  maxSteps: 15,
  onEvent: render,
});

console.log(`\n${formatTodoList(todos.read(true))}`);

bun run examples/planning-tutorial/step4.ts
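
Step 4 trusts whatever `JSON.parse` returns. Since a frozen workflow is meant to be reviewed and possibly hand-edited, a shape check at load time may be worthwhile. The following is a sketch only: the `Workflow` type mirrors the object step 3 serializes, and `parseWorkflow` and `isRecord` are hypothetical helpers, not part of the SDK.

```typescript
// Mirrors the object step 3 writes to disk; not an SDK type.
interface Workflow {
  name: string;
  goal: string;
  instructions: string;
  toolNames: string[];
  steps: { id: string; content: string }[];
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Hypothetical helper: parse JSON text and fail early on a malformed workflow.
function parseWorkflow(json: string): Workflow {
  const data: unknown = JSON.parse(json);
  if (!isRecord(data)) throw new Error("Workflow must be a JSON object");

  for (const key of ["name", "goal", "instructions"]) {
    if (typeof data[key] !== "string") throw new Error(`Workflow.${key} must be a string`);
  }

  const { toolNames, steps } = data;
  if (!Array.isArray(toolNames) || !toolNames.every((n) => typeof n === "string")) {
    throw new Error("Workflow.toolNames must be an array of strings");
  }
  if (
    !Array.isArray(steps) ||
    !steps.every((s) => isRecord(s) && typeof s.id === "string" && typeof s.content === "string")
  ) {
    throw new Error("Workflow.steps must be an array of { id, content }");
  }
  return data as unknown as Workflow;
}

// A well-formed workflow round-trips; the values here are illustrative.
const sampleWorkflow = parseWorkflow(
  JSON.stringify({
    name: "blog-post-prep",
    goal: "Get a blog post ready to publish.",
    instructions: "Work the steps in order.",
    toolNames: ["word_count", "slugify", "save_draft"],
    steps: [{ id: "count", content: "Measure the post length" }],
  }),
);
console.log(sampleWorkflow.steps.length); // 1
```

Since the tutorial already depends on zod, a `z.object(...)` schema would be a natural alternative to the hand-written checks.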

Data serializes; behavior doesn't

A tool's execute is a function, so a workflow can't carry the tools themselves — it carries their names. ToolRegistry.resolve re-binds those names to live tools at load time and throws (listing what is registered) if the host never registered one, so a workflow that references a missing tool fails loudly at load instead of the model silently running without it.
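
The fail-loudly behaviour can be pictured with a tiny stand-in registry. `MiniRegistry` and `NamedTool` below are hypothetical and only mimic what the paragraph describes. The SDK's `ToolRegistry` may differ in its constructor, its error type, and its message format.

```typescript
// Hypothetical stand-ins; the SDK's Tool and ToolRegistry are richer than this.
interface NamedTool {
  name: string;
  execute: (args: Record<string, string>) => { content: string };
}

class MiniRegistry {
  private readonly byName = new Map<string, NamedTool>();

  constructor(tools: NamedTool[]) {
    for (const tool of tools) this.byName.set(tool.name, tool);
  }

  // Re-bind serialized names to live tools, reporting every missing name at once.
  resolve(names: string[]): NamedTool[] {
    const missing = names.filter((n) => !this.byName.has(n));
    if (missing.length > 0) {
      const known = Array.from(this.byName.keys()).join(", ");
      throw new Error(`Unknown tool(s): ${missing.join(", ")}. Registered: ${known}`);
    }
    return names.map((n) => this.byName.get(n)!);
  }
}

const mini = new MiniRegistry([
  { name: "slugify", execute: (args) => ({ content: (args.title ?? "").toLowerCase() }) },
]);

// A known name resolves to the live tool.
const resolved = mini.resolve(["slugify"]);
console.log(resolved.map((t) => t.name).join(", ")); // slugify

// An unknown name fails at load time, before any model call is made.
let loadError = "";
try {
  mini.resolve(["slugify", "save_draft"]);
} catch (err) {
  loadError = (err as Error).message;
}
console.log(loadError);
```

Listing the registered names in the error makes a typo in a hand-edited workflow file easy to spot.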

Recap

Starting from a plain run and adding a few lines at a time, you gave the model a to-do list it maintains across turns and a scratchpad for notes, then turned that emergent list into a workflow: authored from a brief, frozen as data, replayed deterministically. The loop never changed: every step is the same runAgent taking a plain Tool[]. The finished programs are in examples/planning-tutorial/.