Glovebox is in beta. The wire protocol and authoring API are stable for v1, but several pieces — JWT auth, multiplex prompt execution, hot reload, GCS/Azure storage adapters — are deferred to v2. Expect the surface to grow, not break.
Glovebox packages a Glove agent as an isolated, network-addressable service. You write a Glove agent the way you always do — fold tools, wire a model adapter, build the runnable — then wrap it with glovebox.wrap(runnable, config) and run glovebox build. The output is a Dockerfile, a nixpacks.toml, a server bundle, a manifest, and an auth key. The deployed container exposes a single authenticated WebSocket endpoint per session; a matching client SDK speaks to it over the wire.
The point is to factor environment-heavy work out of your host process. Some agents need ffmpeg, or pdftk, or a headless Chromium, or a custom Python toolchain. Bundling all that into the same Node process that serves your web app is impractical: cold starts balloon, the container needs elevated capabilities, and a single misbehaving agent can drag down every other request. Glovebox keeps that machinery in a dedicated sandbox and gives the host a thin client to talk to it.
Glovebox is shipped as three coordinated packages. Each one solves a single piece of the lifecycle:
| Package | Where it runs | What it does |
|---|---|---|
| glovebox | Author's machine / build step | Authoring kit and glovebox build CLI. Exposes glovebox.wrap, the storage policy DSL (rule.* / composite), and the wire types every side shares. |
| glovebox-kit | Inside the container | The runtime. Reads the wrapped app, validates env + storage policy, injects the standard skills/hooks, mounts the WS endpoint, and serves the file routes. |
| glovebox-client | Caller (host app, CLI, worker, …) | GloveboxClient + Box SDK. Manages the WebSocket, multiplexes prompts, marshals input/output FileRefs through pluggable client storage. |
The protocol — defined in glovebox/protocol — is the only contract between them. Anything else is private to the side that ships it.
A Glovebox app is an ordinary Glove runnable handed to glovebox.wrap. The wrap call is opaque from your perspective — the kit reads it at boot to discover the runnable and the resolved config, then layers its own injections on top. The runnable still exposes the usual processRequest, defineSkill, defineHook, and addSubscriber surface.

```typescript
import { glovebox, rule, composite } from "glovebox-core";
import { agent } from "./my-agent";

export default glovebox.wrap(agent, {
  name: "media-extractor",
  version: "0.1.0",
  base: "glovebox/media",
  packages: {
    apt: ["jq"],
    npm: ["yt-dlp-exec"],
  },
  env: {
    OPENAI_API_KEY: { required: true, secret: true },
  },
  storage: {
    inputs: composite([rule.url(), rule.inline()]),
    outputs: composite([
      rule.inline({ below: "1MB" }),
      rule.localServer({ ttl: "1h" }),
    ]),
  },
  limits: { cpu: "2", memory: "2Gi", timeout: "5m" },
});
```

Every field is optional. Omit base and you get glovebox/base. Omit storage and the defaults apply: url then inline for inputs, inline-under-1MB then localServer with a one-hour TTL for outputs. The fs map defaults to /work (writable), /input (read-only), and /output (writable), the same layout the base images set up.
Storage policies decide how each FileRef is materialised. Inputs are how the client hands bytes to the agent; outputs are how the agent hands bytes back. The DSL is intentionally tiny — four rule constructors and a composite combiner. Rules are evaluated in declaration order; the first match wins.

```typescript
import { rule, composite } from "glovebox-core";

// Outputs: tiny files inline, anything bigger goes to S3.
composite([
  rule.inline({ below: "256KB" }),
  rule.s3({ bucket: "agent-outputs", region: "us-east-1", prefix: "v1/" }),
]);

// Inputs: prefer URLs (no bytes on the wire), fall back to inline.
composite([rule.url(), rule.inline()]);

// Outputs: keep small reports inline, ship larger artefacts via the
// container's own /files/:id route with a 24h TTL.
composite([
  rule.inline({ below: "1MB" }),
  rule.localServer({ ttl: "24h" }),
]);
```

Each rule's below / above bounds accept human-readable sizes (B, KB, MB, GB). The kit validates the outputs policy at boot: every referenced adapter must be registered, and the policy must include a terminal rule (no bound, or marked always) so every file size has a home. The url adapter is read-only; pointing an outputs rule at it fails fast.
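
To make the first-match-wins evaluation concrete, here is a minimal, self-contained sketch of how a policy could be resolved for a given file size. The `Rule` shape and the `parseSize`, `pickRule`, and `hasTerminalRule` helpers are hypothetical names for illustration, not the kit's internals. The sketch also assumes binary units (1KB = 1024 bytes) and exclusive bounds, neither of which is specified above.

```typescript
// Hypothetical rule shape: an adapter name plus optional size bounds.
type Rule = { adapter: string; below?: string; above?: string; always?: boolean };

const UNITS: Record<string, number> = {
  B: 1,
  KB: 1024,
  MB: 1024 ** 2,
  GB: 1024 ** 3,
};

// Parse a human-readable size such as "256KB" into bytes.
// Assumption: binary multiples (1KB = 1024 B).
function parseSize(text: string): number {
  const match = /^(\d+(?:\.\d+)?)\s*(B|KB|MB|GB)$/i.exec(text.trim());
  if (!match) throw new Error(`Unrecognised size: ${text}`);
  return Number(match[1]) * UNITS[match[2].toUpperCase()];
}

// A rule matches when the size falls inside its bounds (or it is marked always).
function matches(rule: Rule, size: number): boolean {
  if (rule.always) return true;
  if (rule.below !== undefined && size >= parseSize(rule.below)) return false;
  if (rule.above !== undefined && size <= parseSize(rule.above)) return false;
  return true;
}

// Rules are checked in declaration order; the first match wins.
function pickRule(rules: Rule[], size: number): Rule | undefined {
  return rules.find((rule) => matches(rule, size));
}

// A terminal rule has no bounds (or is marked always), so every size has a home.
function hasTerminalRule(rules: Rule[]): boolean {
  return rules.some((r) => r.always || (r.below === undefined && r.above === undefined));
}

const outputs: Rule[] = [
  { adapter: "inline", below: "256KB" },
  { adapter: "s3" },
];

console.log(pickRule(outputs, 10_000)?.adapter);    // inline
console.log(pickRule(outputs, 5_000_000)?.adapter); // s3
console.log(hasTerminalRule(outputs));              // true
```

Dropping the unbounded `s3` rule from `outputs` would make `hasTerminalRule` return false, which is the situation the boot-time validation is described as rejecting.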
glovebox build imports your wrap module, reads the resolved config, and emits a self-contained dist/ directory. There is no other build step — no manual Dockerfile, no hand-tuned nixpacks.toml, no separate manifest to keep in sync.

```shell
pnpm exec glovebox build ./glovebox.ts
# or with overrides
pnpm exec glovebox build ./glovebox.ts --out ./dist --name media-extractor
```

What ends up in dist/:
| File | Role |
|---|---|
| Dockerfile | FROM the resolved base image, layers in declared apt/pip/npm packages, copies the bundled server, links the prebuilt better-sqlite3 from the base for known images, and runs as the glovebox user on port 8080. |
| nixpacks.toml | Equivalent recipe for Railway / Fly / any nixpacks host that prefers a buildpack to a raw Dockerfile. |
| server/ | Esbuild-bundled entry: your wrap module, the kit, and a minimal launcher. Includes a copy of glovebox.json next to index.js so the runtime resolves it via import.meta.url. |
| glovebox.json | Manifest: name, version, base, fs layout, env spec, key fingerprint, full storage policy, packages, protocol version. |
| glovebox.key | Per-build bearer token. The container reads it via the GLOVEBOX_KEY env var; the kit verifies its fingerprint against the manifest at boot. |
| .env.example | Template populated from the env map in your wrap config (required vs optional, secret markers, defaults). |
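
The boot-time key check in the table above can be pictured roughly as follows. This is only a sketch: the fingerprint algorithm and the manifest field name are not documented here, so the SHA-256 hex digest and the `key_fingerprint` field are assumptions, and `fingerprint` and `keyMatchesManifest` are hypothetical helpers.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Assumption: the fingerprint is the hex SHA-256 digest of the key.
// The real algorithm and manifest field name may differ.
function fingerprint(key: string): string {
  return createHash("sha256").update(key, "utf8").digest("hex");
}

// Compare in constant time so the check does not leak how much of the digest matched.
function keyMatchesManifest(key: string, expectedFingerprint: string): boolean {
  const actual = Buffer.from(fingerprint(key), "hex");
  const expected = Buffer.from(expectedFingerprint, "hex");
  return actual.length === expected.length && timingSafeEqual(actual, expected);
}

// Hypothetical manifest fragment, as it might appear in glovebox.json.
const manifest = { key_fingerprint: fingerprint("example-key") };

console.log(keyMatchesManifest("example-key", manifest.key_fingerprint)); // true
console.log(keyMatchesManifest("wrong-key", manifest.key_fingerprint));   // false
```

Storing only a fingerprint in the manifest means the manifest can be shipped or inspected without revealing the bearer token itself.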
Five published base images cover the workloads agents typically outgrow when run in-process. They all live under ghcr.io/porkytheblack/, ship the same glovebox user (uid 10001) and /work + /input + /output + /var/glovebox layout, and bake in the same prebuilt better-sqlite3 at /opt/glovebox-prebuilt/node_modules.
| Tag | Ships |
|---|---|
| glovebox/base:1.0 | Node 20, the standard layout, prebuilt better-sqlite3. |
| glovebox/media:1.4 | ffmpeg, imagemagick, sox, yt-dlp. |
| glovebox/docs:1.2 | pandoc, qpdf, pdftk-java, headless LibreOffice. |
| glovebox/python:1.3 | uv with numpy, pandas, pillow, scipy, matplotlib. |
| glovebox/browser:1.1 | Playwright + Chromium, fonts, system deps for headless runs. |
The build CLI resolves glovebox/<name> to the published tag automatically. Set GLOVEBOX_REGISTRY to point at a fork or private mirror; pass a fully-qualified reference (anything containing a colon, or anything not under glovebox/) and the resolver leaves it alone. Non-standard bases trigger a fallback path that does the user/layout setup itself and runs npm install instead of linking the prebuilt modules.
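
The resolution rules above can be sketched as a small pure function. This is an illustration rather than the CLI's actual code: `resolveBase` is a hypothetical name, the way a registry override is joined to the image path is an assumption, and so is throwing on an unknown glovebox/ name. In practice the `registry` argument would come from GLOVEBOX_REGISTRY.

```typescript
// Published tags, taken from the table above.
const PUBLISHED_TAGS: Record<string, string> = {
  base: "1.0",
  media: "1.4",
  docs: "1.2",
  python: "1.3",
  browser: "1.1",
};

// Default registry prefix, per the text above.
const DEFAULT_REGISTRY = "ghcr.io/porkytheblack";

// Hypothetical resolver: short glovebox/<name> references get the registry
// prefix and the published tag; anything else passes through untouched.
function resolveBase(base: string = "glovebox/base", registry: string = DEFAULT_REGISTRY): string {
  const isShort = base.startsWith("glovebox/") && !base.includes(":");
  if (!isShort) return base; // fully qualified: leave it alone

  const name = base.slice("glovebox/".length);
  const tag = PUBLISHED_TAGS[name];
  // Assumption: an unknown short name is an error rather than a pass-through.
  if (tag === undefined) throw new Error(`Unknown base image: ${base}`);

  return `${registry.replace(/\/+$/, "")}/${base}:${tag}`;
}

console.log(resolveBase("glovebox/media"));
// ghcr.io/porkytheblack/glovebox/media:1.4
console.log(resolveBase("node:20-slim"));
// node:20-slim
console.log(resolveBase("glovebox/docs", "registry.example.com/mirror"));
// registry.example.com/mirror/glovebox/docs:1.2
```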
One WebSocket per client session, authenticated on upgrade with Authorization: Bearer <key>. Multiple prompts are multiplexed by an id the client picks. The full type definitions live in glovebox/protocol; the table below is the at-a-glance version.
Client to server:

| Message | Shape | When |
|---|---|---|
| prompt | { id, text, inputs?, outputs_policy? } | Start a turn. inputs is a name → FileRef map. |
| abort | { id } | Cancel an in-flight prompt by id. |
| display_resolve | { slot_id, value } | Resolve a server-pushed display slot. |
| display_reject | { slot_id, error } | Reject a server-pushed display slot. |
| ping | { ts } | Liveness check; the server replies with a matching pong. |

Server to client:

| Message | Shape | Carries |
|---|---|---|
| event | { id, event_type, data } | Subscriber events (text deltas, tool uses, compaction). Mirrors glove-core 1:1. |
| display_push | { slot } | The agent pushed a display slot; the caller renders it and resolves it later. |
| display_clear | { slot_id } | The agent removed a slot. |
| complete | { id, message, outputs } | Final assistant text + the resolved outputs map. |
| error | { id, error: { code, message } } | Terminal failure for a specific prompt. |
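
For readers who think in types, the message tables above map naturally onto discriminated unions. The sketch below is an approximation for orientation only; the authoritative definitions live in glovebox/protocol. The `type` discriminator field, the primitive types chosen for each field, and the `isTerminalFor` helper are all assumptions.

```typescript
// FileRef is left loosely typed here.
type FileRef = { kind: string; [field: string]: unknown };

// Client to server. Assumption: messages are tagged with a `type` field.
type ClientMessage =
  | { type: "prompt"; id: string; text: string; inputs?: Record<string, FileRef>; outputs_policy?: unknown }
  | { type: "abort"; id: string }
  | { type: "display_resolve"; slot_id: string; value: unknown }
  | { type: "display_reject"; slot_id: string; error: string }
  | { type: "ping"; ts: number };

// Server to client. `pong` is included because the server answers ping with it.
type ServerMessage =
  | { type: "event"; id: string; event_type: string; data: unknown }
  | { type: "display_push"; slot: { id: string; [field: string]: unknown } }
  | { type: "display_clear"; slot_id: string }
  | { type: "complete"; id: string; message: string; outputs: Record<string, FileRef> }
  | { type: "error"; id: string; error: { code: string; message: string } }
  | { type: "pong"; ts: number };

// A prompt is finished once complete or error arrives carrying its id.
function isTerminalFor(msg: ServerMessage, promptId: string): boolean {
  return (msg.type === "complete" || msg.type === "error") && msg.id === promptId;
}

// The client picks the id, which is what lets several prompts share one socket.
const prompt: ClientMessage = { type: "prompt", id: "p1", text: "Summarise report.pdf" };
const wire = JSON.stringify(prompt);

const done: ServerMessage = { type: "complete", id: "p1", message: "Done.", outputs: {} };
console.log(wire);
console.log(isTerminalFor(done, "p1")); // true
console.log(isTerminalFor(done, "p2")); // false
```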
Two HTTP routes live alongside the WebSocket: GET /health is public and returns { ok, name, version }; GET /environment requires the bearer token and returns the manifest spec the client SDK's box.environment() consumes. Server-stored outputs are streamed by GET /files/:id (also bearer-authed); appending ?consume=1 deletes the file after read for one-shot downloads.
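
A caller that bypasses the SDK could build requests for these routes along the following lines. The helper names and the `RequestSpec` shape are hypothetical, and the base URL is a placeholder deployment; only the paths, the bearer header, and the `consume=1` query parameter come from the description above.

```typescript
type RequestSpec = { url: string; headers: Record<string, string> };

// GET /health is public, so no Authorization header is attached.
function healthRequest(baseUrl: string): RequestSpec {
  return { url: new URL("/health", baseUrl).toString(), headers: {} };
}

// GET /environment requires the bearer token.
function environmentRequest(baseUrl: string, key: string): RequestSpec {
  return {
    url: new URL("/environment", baseUrl).toString(),
    headers: { Authorization: `Bearer ${key}` },
  };
}

// GET /files/:id streams a server-stored output; consume=1 deletes it after the read.
function fileRequest(baseUrl: string, key: string, id: string, consume = false): RequestSpec {
  const url = new URL(`/files/${encodeURIComponent(id)}`, baseUrl);
  if (consume) url.searchParams.set("consume", "1");
  return { url: url.toString(), headers: { Authorization: `Bearer ${key}` } };
}

// Perform a request with the global fetch (Node 18+ or a browser).
async function send(spec: RequestSpec): Promise<Response> {
  const response = await fetch(spec.url, { headers: spec.headers });
  if (!response.ok) throw new Error(`${spec.url} failed with status ${response.status}`);
  return response;
}

const base = "https://media.example.com"; // hypothetical deployment
console.log(healthRequest(base).url);
// https://media.example.com/health
console.log(fileRequest(base, "secret", "abc123", true).url);
// https://media.example.com/files/abc123?consume=1
```

Against a live box, `await send(healthRequest(base))` followed by `.json()` would yield the `{ ok, name, version }` body.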
Raw bytes never cross the wire as part of a protocol message. Every file is a discriminated FileRef the receiving side materialises through a storage adapter. Five kinds are defined; the kit ships built-in handlers for the first three.
| Kind | Carries | Use for |
|---|---|---|
| inline | base64 bytes | Small payloads (under ~1MB). Ships in the message itself. |
| url | url + optional headers | Public or pre-signed URLs the kit fetches at request time. |
| server | id + url | Files stored on the box itself. Read via GET /files/:id. |
| s3 | bucket + key + optional region | Object storage. The S3 adapter requires caller-supplied upload/download functions. |
| gcs | bucket + object | Google Cloud Storage. Same pattern as S3: the caller wires the SDK. |
On the kit side, adapters live behind the StorageAdapter interface (InlineStorage, UrlStorage, LocalServerStorage, S3Storage). Pass extra adapters into startGlovebox({ adapters }) and they merge into the registry by name. The pickAdapter helper applies the policy rules in order and picks the first match for a given file size.
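
As a rough picture of what "discriminated FileRef" means, the five kinds can be modelled as a union keyed on `kind`. The field names below (notably `data` for the base64 payload) are guesses derived from the table's "Carries" column, and `decodeInline` and `describeRef` are hypothetical helpers, not part of the kit or client.

```typescript
// Assumed field names; the real definitions live in glovebox/protocol.
type FileRef =
  | { kind: "inline"; data: string } // base64-encoded bytes
  | { kind: "url"; url: string; headers?: Record<string, string> }
  | { kind: "server"; id: string; url: string }
  | { kind: "s3"; bucket: string; key: string; region?: string }
  | { kind: "gcs"; bucket: string; object: string };

// Inline refs can be decoded in place; every other kind needs a storage adapter.
function decodeInline(ref: FileRef): Uint8Array {
  if (ref.kind !== "inline") {
    throw new Error(`A storage adapter is needed for kind: ${ref.kind}`);
  }
  const binary = atob(ref.data);
  return Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
}

// The switch is exhaustive: adding a sixth kind would be a compile error here.
function describeRef(ref: FileRef): string {
  switch (ref.kind) {
    case "inline":
      return `inline (${ref.data.length} base64 chars)`;
    case "url":
      return `url ${ref.url}`;
    case "server":
      return `server file ${ref.id}`;
    case "s3":
      return `s3://${ref.bucket}/${ref.key}`;
    case "gcs":
      return `gs://${ref.bucket}/${ref.object}`;
  }
}

const hello: FileRef = { kind: "inline", data: "aGVsbG8=" }; // "hello"
console.log(decodeInline(hello).length); // 5
console.log(describeRef({ kind: "s3", bucket: "agent-outputs", key: "v1/report.pdf" }));
// s3://agent-outputs/v1/report.pdf
```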
Wrapping a runnable hands control to the kit at boot. Before the WS endpoint comes up, the kit layers four glovebox-flavored extensions onto your agent and prepends an environment block to the existing system prompt — once, statically. The agent code itself stays untouched.
| Extension | Kind | What it does |
|---|---|---|
| environment | Skill (exposed to agent) | Returns the live spec: name, version, base, fs layout, installed packages, limits. |
| workspace | Skill (exposed to agent) | Lists the current contents of every fs mount. Cheap way for the model to discover what landed in /input. |
| /output | Hook | Tags an absolute path for exfiltration. Anything outside /output the agent wants to ship back to the caller goes through this. |
| /clear-workspace | Hook | Empties /work between turns. Useful for deterministic, test-style prompts. |
On complete, the kit lists /output, adds any extra paths the /output hook tagged during the turn, and picks an adapter for each file via the resolved policy. The resulting FileRefs land on the complete message's outputs field.
glovebox-client is the host-side SDK. Construct one GloveboxClient per app, with one named entry per deployed glovebox. Boxes are lazy — the underlying WebSocket only opens on first prompt.

```typescript
import { readFile } from "node:fs/promises";
import { GloveboxClient } from "glovebox-client";

const client = GloveboxClient.make({
  endpoints: {
    media: {
      url: "wss://media.example.com/run",
      key: process.env.GLOVEBOX_MEDIA_KEY!,
    },
  },
});

const box = client.box("media");
const result = box.prompt("Trim the first 30 seconds and add a watermark.", {
  files: {
    "input.mp4": { mime: "video/mp4", bytes: await readFile("./clip.mp4") },
  },
});

// Drain both streams concurrently. If the agent is waiting on a display slot,
// the events stream cannot close until that slot has been resolved.
await Promise.all([
  (async () => {
    // Stream subscriber events as they arrive.
    for await (const ev of result.events) {
      if (ev.event_type === "text_delta") {
        process.stdout.write((ev.data as { text: string }).text);
      }
    }
  })(),
  (async () => {
    // Resolve display slots the agent pushed during the turn.
    for await (const ev of result.display) {
      if (ev.type === "push" && ev.slot?.renderer === "confirm") {
        result.resolve(ev.slot.id, true);
      }
    }
  })(),
]);

await result.message; // final assistant text
const outputs = await result.outputs; // Record<string, FileRef>
const trimmed = await result.read("trimmed.mp4");

await client.close();
```

events and display are async iterables: a small queue per stream, closed when complete or error arrives. message and outputs are promises that settle on the same boundary. read(name) dispatches through the configured ClientStorage, which knows how to fetch server refs (with the bearer token), open url refs directly, and decode inline refs in place. resolve / reject / abort are fire-and-forget; failures surface through box.onSendError(...) if you wire a listener.
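
The "small queue per stream" idea is easy to picture with a minimal async queue. The class below is a generic, single-consumer illustration of the pattern, not the client's implementation; `AsyncQueue` and `demo` are hypothetical names.

```typescript
// Minimal single-consumer async queue: push values, close once, iterate with for await.
class AsyncQueue<T> implements AsyncIterable<T> {
  private buffered: T[] = [];
  private waiting: ((result: IteratorResult<T>) => void)[] = [];
  private closed = false;

  // Hand the value to a waiting consumer if there is one, otherwise buffer it.
  push(value: T): void {
    if (this.closed) return;
    const resolve = this.waiting.shift();
    if (resolve) resolve({ value, done: false });
    else this.buffered.push(value);
  }

  // Mark the stream finished and release any consumer that is still waiting.
  close(): void {
    this.closed = true;
    for (const resolve of this.waiting.splice(0)) {
      resolve({ value: undefined, done: true });
    }
  }

  [Symbol.asyncIterator](): AsyncIterator<T> {
    return {
      next: async (): Promise<IteratorResult<T>> => {
        // Buffered values are delivered first, even after close().
        if (this.buffered.length > 0) {
          return { value: this.buffered.shift() as T, done: false };
        }
        if (this.closed) return { value: undefined, done: true };
        return new Promise<IteratorResult<T>>((resolve) => this.waiting.push(resolve));
      },
    };
  }
}

async function demo(): Promise<string[]> {
  const events = new AsyncQueue<string>();
  events.push("Hel");
  events.push("lo");
  // Simulate a late event followed by the terminal complete message.
  setTimeout(() => {
    events.push("!");
    events.close();
  }, 10);

  const seen: string[] = [];
  for await (const delta of events) seen.push(delta);
  return seen; // ["Hel", "lo", "!"]
}

demo().then((seen) => console.log(seen.join("")));
```

Closing the queue on complete or error is what lets a `for await` loop over the stream terminate on its own.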
box.environment() hits the bearer-authed GET /environment route once and caches the result. It is the right call when an app holds many endpoints and needs to pick one based on installed packages, the protocol version, or limits — nothing the developer-facing config encodes is hidden from the client.
The simplest path is a plain docker run. The image listens on 8080 and reads its key from the GLOVEBOX_KEY env var.

```shell
docker build -t my-glovebox dist/
GLOVEBOX_KEY=$(cat dist/glovebox.key) docker run \
  -p 8080:8080 \
  -e GLOVEBOX_KEY \
  -e OPENAI_API_KEY \
  my-glovebox
```

Railway, Fly, and any other nixpacks-aware host pick up the generated nixpacks.toml instead. Push dist/ as the deploy root, set GLOVEBOX_KEY plus whatever secrets your env map declared, and the platform builds the same layout. The pre-built base images are pulled from ghcr.io/porkytheblack/glovebox/* by default; a private mirror can be substituted via GLOVEBOX_REGISTRY at build time.
Behind a load balancer, terminate TLS upstream and forward both the HTTP/1.1 Upgrade handshake and the WebSocket frames. Sessions are sticky to a single container — there is no cross-instance state.
v1 is intentionally narrow. A few things to know before you wire it into anything load-bearing:
- PromptMachine + Context are not concurrency-safe, so the kit chains prompts through a single promise per WS connection. Multiplex by opening multiple sessions; do not rely on parallel turns inside one.
- The kit ships an S3Storage wrapper, but you supply the upload/download functions. This keeps the runtime image free of provider SDKs.

The Glovebox showcase walks through a PDF-extraction agent end-to-end (wrap config, a representative tool, and the host-side invocation), built on top of glovebox/docs:1.2.
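
The per-connection promise chain mentioned in the concurrency note above follows a common serialisation pattern, sketched here in isolation. `SerialQueue` is a hypothetical name and this is not the kit's code; it only shows why a second prompt submitted on the same connection does not start until the first one settles.

```typescript
// Runs tasks one at a time, in submission order, by chaining each onto the previous one.
class SerialQueue {
  private tail: Promise<unknown> = Promise.resolve();

  run<T>(task: () => Promise<T>): Promise<T> {
    // Start the task only after everything queued before it has settled.
    const result = this.tail.then(() => task());
    // Swallow rejections on the chain itself so one failure does not block later tasks.
    this.tail = result.catch(() => undefined);
    return result;
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function demo(): Promise<string[]> {
  const order: string[] = [];
  const queue = new SerialQueue();

  // The slow task is submitted first; the fast one must still wait for it.
  const slow = queue.run(async () => {
    order.push("a:start");
    await sleep(20);
    order.push("a:end");
  });
  const fast = queue.run(async () => {
    order.push("b:start");
    await sleep(1);
    order.push("b:end");
  });

  await Promise.all([slow, fast]);
  return order; // ["a:start", "a:end", "b:start", "b:end"]
}

demo().then((order) => console.log(order.join(" ")));
```

Because turns on one connection run back to back, parallelism has to come from opening additional sessions.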