In this tutorial you will package a PDF-extraction agent as a Glovebox — a sandboxed, network-addressable Glove runtime that ships with pdftk, pandoc, and pdftotext baked in. The host process never touches a PDF; it hands a file to the box, the agent does the work in isolation, and the host gets back extracted text plus a structured outline.
This is the most compelling use of Glovebox: factor out an environment that would be painful to install on every web server, run it once behind a stable WebSocket endpoint, and let your host app talk to it through the regular client SDK. The agent inside the box is an ordinary Glove agent — same builder, same tools, same subscribers.
Prerequisites: read Glovebox for the surface area, and Server-Side Agents for the kind of agent you wrap. The example sources live at examples/glovebox-pdf-extractor/.
You will build a box that takes a single PDF on /input and returns two artefacts: extracted.txt (the body text) and outline.json (page-numbered headings). The agent decides which CLI to invoke based on the document: pure-text PDFs go through pdftotext, and scans get a fallback path through pdftk + pandoc. All of these binaries ship in glovebox/docs:1.2, so no extra packages are needed.
1. The host sends the PDF as a FileRef (inline below 1MB, otherwise wrapped through client storage)
2. The kit writes it to /input/document.pdf before invoking the agent
3. The agent calls extract_text, which shells out to pdftotext and writes /output/extracted.txt
4. The agent calls extract_outline, which uses pdftk to dump bookmarks and writes /output/outline.json
5. The kit reads /output, applies the outputs policy, and ships back a complete message with the resolved FileRefs

The agent is a plain Glove runnable. Two tools, an Anthropic adapter, an in-memory store, and the standard Displaymanager. Nothing here knows about Glovebox yet.
import { Glove, Displaymanager, createAdapter } from "glove-core";
import { execFile } from "node:child_process";
import { promisify } from "node:util";
import { writeFile } from "node:fs/promises";
import path from "node:path";
import z from "zod";

// execFile takes arguments as an array, so filenames never pass through a
// shell and a quote in a filename cannot break the command.
const run = promisify(execFile);

// Minimal in-memory store: messages plus token and turn counters.
class MemoryStore {
  identifier = "pdf";
  private msgs: any[] = [];
  private tokens = 0;
  private turns = 0;
  async getMessages() { return this.msgs; }
  async appendMessages(m: any[]) { this.msgs.push(...m); }
  async getTokenCount() { return this.tokens; }
  async addTokens(n: number) { this.tokens += n; }
  async getTurnCount() { return this.turns; }
  async incrementTurn() { this.turns++; }
  async resetCounters() { this.tokens = 0; this.turns = 0; }
}

export const agent = new Glove({
  store: new MemoryStore(),
  model: createAdapter({ provider: "anthropic", model: "claude-sonnet-4.5", stream: true }),
  displayManager: new Displaymanager(),
  serverMode: true,
  systemPrompt:
    "You extract structured data from PDFs. The user uploads one PDF " +
    "to /input. Use extract_text for the body and extract_outline for " +
    "the table of contents. Always write results into /output and " +
    "summarise what you produced in one paragraph.",
  compaction_config: { compaction_instructions: "Summarise extraction findings." },
})
  .fold({
    name: "extract_text",
    description: "Run pdftotext on a PDF in /input. Writes plain text to /output/<name>.txt.",
    inputSchema: z.object({
      file: z.string().describe("Filename inside /input, e.g. 'document.pdf'."),
      outputName: z.string().describe("Output filename, e.g. 'extracted.txt'."),
    }),
    async do(input) {
      const src = path.join("/input", input.file);
      const dest = path.join("/output", input.outputName);
      await run("pdftotext", ["-layout", src, dest]);
      return { status: "success", data: `Wrote ${dest}` };
    },
  })
  .fold({
    name: "extract_outline",
    description: "Dump the PDF's bookmark tree as JSON via pdftk and write it to /output.",
    inputSchema: z.object({
      file: z.string(),
      outputName: z.string(),
    }),
    async do(input) {
      const src = path.join("/input", input.file);
      const dest = path.join("/output", input.outputName);
      const { stdout } = await run("pdftk", [src, "dump_data_utf8"]);
      // Keep only the title and page-number lines; they alternate per bookmark.
      const headings = stdout
        .split("\n")
        .filter((l) => l.startsWith("BookmarkTitle:") || l.startsWith("BookmarkPageNumber:"));
      const outline: { title: string; page: number }[] = [];
      for (let i = 0; i < headings.length; i += 2) {
        const title = headings[i]?.replace("BookmarkTitle: ", "") ?? "";
        const page = Number(headings[i + 1]?.replace("BookmarkPageNumber: ", "") ?? "0");
        outline.push({ title, page });
      }
      await writeFile(dest, JSON.stringify(outline, null, 2));
      return { status: "success", data: `Wrote ${dest} (${outline.length} entries).` };
    },
  })
  .build();

Notice the agent uses serverMode: true and never touches the display manager. This is the headless shape: no permission gating, no UI checkpoints, just tools that read files and write files. The Displaymanager is still required by GloveConfig but stays empty.
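The pairing loop in extract_outline relies on every bookmark printing a title line followed by a page-number line. A slightly more defensive parser might start a new record at each BookmarkBegin line and keep the nesting level as well. The sketch below is not part of the example sources: the names parseBookmarks and Bookmark are made up for illustration, and it assumes the usual record layout of pdftk's dump_data output (BookmarkBegin, BookmarkTitle, BookmarkLevel, BookmarkPageNumber).

```typescript
// Hypothetical helper, not from the example sources.
interface Bookmark {
  title: string;
  level: number;
  page: number;
}

// Parse the bookmark records out of `pdftk <file> dump_data_utf8` output.
function parseBookmarks(dump: string): Bookmark[] {
  const out: Bookmark[] = [];
  let current: Bookmark | null = null;
  for (const raw of dump.split("\n")) {
    const line = raw.trim();
    if (line === "BookmarkBegin") {
      // Each record starts with a bare BookmarkBegin line.
      current = { title: "", level: 1, page: 0 };
      out.push(current);
      continue;
    }
    if (current === null) continue; // skip fields before the first bookmark
    const sep = line.indexOf(": ");
    if (sep === -1) continue;
    const key = line.slice(0, sep);
    const value = line.slice(sep + 2);
    // Unknown keys (page media, info fields) fall through and are ignored.
    if (key === "BookmarkTitle") current.title = value;
    else if (key === "BookmarkLevel") current.level = Number(value) || 1;
    else if (key === "BookmarkPageNumber") current.page = Number(value) || 0;
  }
  return out;
}
```

Keeping the level makes it possible to rebuild a nested table of contents later, and a bookmark with a missing page number yields page 0 instead of shifting every following pair.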
The tools deliberately reach paths through /input and /output. Those mounts come from the default fs map the wrap config inherits — read-only inputs, writable outputs, and a writable /work if the agent ever wants scratch space.
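Because the model supplies the file and outputName arguments, a tool may want to confirm that a name really stays inside its mount before joining it to /input or /output. One possible guard, sketched here with hypothetical helpers (safeName and insideMount are not part of glove-core or the example), accepts only plain filenames:

```typescript
// Hypothetical guard: accept only plain filenames, so a tool argument cannot
// leave its mount through "../" segments or an absolute path.
function safeName(name: string): string {
  const bad =
    name.length === 0 ||
    name === "." ||
    name === ".." ||
    name.includes("/") ||
    name.includes("\\") ||
    name.includes("\0");
  if (bad) throw new Error(`Refusing unsafe filename: ${JSON.stringify(name)}`);
  return name;
}

// Build a path inside one of the mounts from a validated filename.
function insideMount(mount: "/input" | "/output" | "/work", name: string): string {
  return `${mount}/${safeName(name)}`;
}
```

A tool body could then call insideMount("/input", input.file) in place of path.join, turning a bad argument into a tool error rather than a read outside the mount.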
glovebox.wrap turns the runnable into a deployable app. The base image carries every binary the tools call out to, so the packages map stays empty.
import { glovebox, rule, composite } from "glovebox-core";
import { agent } from "./agent";

export default glovebox.wrap(agent, {
  name: "pdf-extractor",
  version: "0.1.0",
  base: "glovebox/docs",
  env: {
    ANTHROPIC_API_KEY: { required: true, secret: true },
  },
  storage: {
    // Inputs default to url-then-inline; explicit here for clarity.
    inputs: composite([rule.url(), rule.inline()]),
    // Small extracts inline, anything larger stays on the box for an hour.
    outputs: composite([
      rule.inline({ below: "256KB" }),
      rule.localServer({ ttl: "1h" }),
    ]),
  },
  limits: { cpu: "1", memory: "1Gi", timeout: "2m" },
});

This is everything the build CLI needs. The default fs layout is fine; the kit's injected environment and workspace skills will appear automatically, and the /output hook gives the agent an escape hatch if a tool ever writes outside /output and still wants the file shipped back.
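To make the outputs policy concrete, here is a rough model of how an ordered rule list could pick an adapter for a file of a given size. This is only a sketch under stated assumptions, not the glovebox-core implementation: it assumes composite tries rules in order and takes the first match, that "256KB" means 256 × 1024 bytes, and that the threshold is exclusive. The Rule type, parseSize, and pickRule are hypothetical names.

```typescript
// Hypothetical model of an ordered storage policy (not the glovebox-core API).
type Rule =
  | { kind: "inline"; below?: string }
  | { kind: "localServer"; ttl: string };

// Assumption: binary units, so "256KB" is 256 * 1024 bytes.
function parseSize(size: string): number {
  const m = /^(\d+)(B|KB|MB|GB)$/.exec(size);
  if (m === null) throw new Error(`Unrecognised size: ${size}`);
  const [, digits = "0", unit = "B"] = m;
  const units: Record<string, number> = { B: 1, KB: 1024, MB: 1024 ** 2, GB: 1024 ** 3 };
  return Number(digits) * (units[unit] ?? 1);
}

// Assumption: the first rule that accepts the file wins.
function pickRule(rules: Rule[], sizeBytes: number): Rule {
  for (const r of rules) {
    if (r.kind === "inline") {
      // An inline rule without a threshold accepts everything.
      if (r.below === undefined || sizeBytes < parseSize(r.below)) return r;
    } else {
      return r; // localServer accepts any size
    }
  }
  throw new Error("No storage rule accepted the file");
}

// Mirrors the outputs policy in the wrap config above.
const outputsPolicy: Rule[] = [
  { kind: "inline", below: "256KB" },
  { kind: "localServer", ttl: "1h" },
];
```

Under these assumptions a 10 KB outline.json travels inline in the complete message, while a multi-megabyte extracted.txt stays on the box and is fetched on demand until its TTL expires.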
pnpm exec glovebox build ./glovebox.ts
# ✓ Resolved base image: ghcr.io/porkytheblack/glovebox/docs:1.2
# ✓ Resolved packages (0 apt, 0 pip, 0 npm)
# ✓ Generated Dockerfile
# ✓ Generated nixpacks.toml
# ✓ Generated server bundle
# ✓ Generated auth key (fingerprint: 9f3a…b1c2)
# ✓ Wrote dist/

The dist/ directory is now self-contained: a Dockerfile that FROMs ghcr.io/porkytheblack/glovebox/docs:1.2, an esbuild bundle of the agent + the kit, the manifest, and a single-use auth key. Running it is a docker invocation away.
docker build -t pdf-extractor dist/
GLOVEBOX_KEY=$(cat dist/glovebox.key) docker run \
  -p 8080:8080 \
  -e GLOVEBOX_KEY \
  -e ANTHROPIC_API_KEY \
  pdf-extractor

The host script is a thin GloveboxClient wrapper. It reads a PDF off disk, hands it to the box as a named input, streams deltas as the agent works, and writes the extracted artefacts to the local filesystem when the prompt completes.
import { GloveboxClient } from "glovebox-client";
import { readFile, writeFile } from "node:fs/promises";

const client = GloveboxClient.make({
  endpoints: {
    pdf: {
      url: process.env.PDF_BOX_URL ?? "ws://localhost:8080",
      key: process.env.PDF_BOX_KEY!,
    },
  },
});

async function extract(localPath: string) {
  const box = client.box("pdf");
  const bytes = await readFile(localPath);
  const result = box.prompt(
    "Extract the body text and the table of contents from /input/document.pdf. " +
      "Write extracted.txt and outline.json into /output.",
    {
      files: {
        "document.pdf": { mime: "application/pdf", bytes },
      },
    },
  );

  // Stream subscriber events as the agent works.
  for await (const ev of result.events) {
    if (ev.event_type === "tool_use") {
      const e = ev.data as { name: string; input: unknown };
      console.log(`[tool] ${e.name}`);
    } else if (ev.event_type === "text_delta") {
      process.stdout.write((ev.data as { text: string }).text);
    }
  }

  const summary = await result.message;
  console.log(`\n--\n${summary}`);

  // Pull each output through the configured ClientStorage.
  await writeFile("./extracted.txt", await result.read("extracted.txt"));
  await writeFile("./outline.json", await result.read("outline.json"));
}

await extract(process.argv[2]!);
await client.close();

box.prompt(...) returns immediately. The async iterables (events, display) drain as messages arrive on the WebSocket; the promises (message, outputs) settle when the kit sends complete. result.read(name) dispatches through ClientStorage: inline refs decode in place, and server refs hit GET /files/:id with the bearer token. The host code never has to know which adapter the kit picked; the policy decides on the box side.
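The dispatch that result.read performs can be pictured roughly as follows. This is a sketch under assumptions, not the glovebox-client source: the FileRef shape (kind, base64, id), the helper names, and the HTTP base URL are hypothetical, and the only details taken from the text above are that inline refs decode locally and server refs are fetched from GET /files/:id with a bearer token.

```typescript
// Hypothetical FileRef shape; the real wire format may differ.
type FileRef =
  | { kind: "inline"; base64: string }
  | { kind: "server"; id: string };

// Minimal fetch-like signature so the transport can be injected.
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<{ ok: boolean; status: number; arrayBuffer(): Promise<ArrayBuffer> }>;

// Inline refs decode in place, with no network round trip.
function decodeInline(base64: string): Uint8Array {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes;
}

// Server refs resolve to GET /files/:id with the key sent as a bearer token.
function fileRequest(baseUrl: string, id: string, token: string) {
  return {
    url: `${baseUrl}/files/${encodeURIComponent(id)}`,
    headers: { Authorization: `Bearer ${token}` },
  };
}

async function readRef(
  ref: FileRef,
  baseUrl: string,
  token: string,
  fetchFn: FetchLike,
): Promise<Uint8Array> {
  if (ref.kind === "inline") return decodeInline(ref.base64);
  const req = fileRequest(baseUrl, ref.id, token);
  const res = await fetchFn(req.url, { headers: req.headers });
  if (!res.ok) throw new Error(`GET ${req.url} failed with status ${res.status}`);
  return new Uint8Array(await res.arrayBuffer());
}
```

Injecting the fetch function keeps the sketch easy to exercise without a running box; a real host would pass the platform's fetch.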
Everything ran on top of four extensions the kit folded onto the agent at boot — without touching the agent source.
- The environment skill let the model ask "what's installed?" mid-turn (it returns the manifest spec: base image, fs layout, packages, limits).
- The workspace skill listed /input dynamically so the model could verify the upload landed before shelling out.
- The /output hook would have caught any path the agent wanted shipped from outside /output; both tools here write inside that mount, so it stays unused.
- The /clear-workspace hook is available if you turn this into a long-lived box that processes many PDFs in sequence; sending /clear-workspace between turns empties /work.

On boot the kit also prepended an environment block to the existing system prompt, so the agent knows it is running in a glovebox, what version, what fs mounts exist, and what the limits are before any user prompt arrives.
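As a rough picture of the prepended environment block, the sketch below builds one from a small description of the box and places it ahead of the agent's own prompt. The block layout, the BoxEnvironment type, and withEnvironmentBlock are all hypothetical; the kit's real wording and fields are not shown in this tutorial.

```typescript
// Hypothetical description of what the kit knows about the box at boot.
interface BoxEnvironment {
  name: string;
  version: string;
  base: string;
  mounts: Record<string, "ro" | "rw">;
  limits: Record<string, string>;
}

// Prepend an environment block, leaving the original system prompt intact.
function withEnvironmentBlock(systemPrompt: string, env: BoxEnvironment): string {
  const mounts = Object.entries(env.mounts)
    .map(([mountPath, mode]) => `  ${mountPath} (${mode})`)
    .join("\n");
  const limits = Object.entries(env.limits)
    .map(([key, value]) => `  ${key}: ${value}`)
    .join("\n");
  const block = [
    `[environment] Running inside Glovebox "${env.name}" v${env.version} on ${env.base}.`,
    "Mounts:",
    mounts,
    "Limits:",
    limits,
  ].join("\n");
  return `${block}\n\n${systemPrompt}`;
}

// Values taken from the wrap config and the default fs layout described above.
const exampleEnv: BoxEnvironment = {
  name: "pdf-extractor",
  version: "0.1.0",
  base: "glovebox/docs",
  mounts: { "/input": "ro", "/output": "rw", "/work": "rw" },
  limits: { cpu: "1", memory: "1Gi", timeout: "2m" },
};
```

Because the block is prepended rather than merged, the agent source keeps its own systemPrompt unchanged and the same agent can still run outside a box.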
| Piece | Where it runs | Why |
|---|---|---|
| agent.ts + tools | Inside the container | Calls pdftotext / pdftk; needs the docs base image. |
| glovebox.ts (wrap) | Build step only | Resolved at glovebox build; the runtime reads its config from the bundle. |
| startGlovebox (kit) | Inside the container | HTTP + WS endpoint, storage adapters, file routes, injections. |
| extract.ts (client) | Host machine / worker / CI | Holds the PDF, drives the prompt, writes the extracted artefacts to disk. |