paperclip/packages/adapter-utils/src/command-managed-runtime.ts
Devin Foley 486fb88a15
Add Cloudflare sandbox provider plugin (#5687)
> _Stacked on top of #5685#5686. Diff against master includes commits
from earlier PRs in the stack — review focuses on the two new commits
(`Extend sandbox callback bridge for Worker-hosted plugins` + `Add
Cloudflare sandbox provider plugin`)._

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Each agent runs in a sandbox environment, and operators choose which
provider backs that sandbox — today E2B and Daytona are bundled with the
platform
> - Cloudflare Workers + Durable Objects + the Sandbox SDK offer a
credible new option: globally distributed, cheap idle, and
operator-deployable as a single Worker
> - To plug it in, Paperclip needs (a) a provider plugin that speaks the
`PaperclipPluginManifestV1` lifecycle and (b) a small operator-deployed
Worker — the **bridge** — that adapts Paperclip's runtime RPCs to the
Cloudflare Sandbox SDK
> - The plugin extends the existing sandbox-callback-bridge with a
`bridge.transport: "worker"` discriminator so the platform routes
runtime RPCs through the Worker bridge instead of the in-process runner
> - This pull request adds the plugin, the bridge Worker template, and
the supporting adapter-utils + server hooks the new transport needs
> - The benefit is that operators can run sandboxes on Cloudflare's edge
with no new platform code beyond installing the plugin and deploying the
Worker

## What Changed

**Shared support (`Extend sandbox callback bridge for Worker-hosted
plugins`):**

- `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`:
expose `expectedHostHeader` so plugin-side bridge clients can verify the
canonical request envelope before forwarding.
- `packages/adapter-utils/src/command-managed-runtime.{ts,test.ts}`:
relax the always-fresh runner construction so callers can re-use a
runner across exec calls (Worker-hosted bridges hold the runner inside a
Durable Object).
- `server/src/services/environment-runtime.ts` +
`environment-runtime.test.ts`: route Worker-hosted bridges through the
same env-shaping path as E2B and pin the `requestEnv` contract.
- `server/src/services/plugin-environment-driver.ts`: thread an optional
`issueId` through the runtime descriptor so bridges can scope leases to
the originating issue (used by Cloudflare to map a sandbox to the
issue/workflow for billing and audit).
- `packages/plugins/sdk/src/protocol.ts`: add `issueId?` to
`PluginEnvironmentDriverBaseParams` and the new `bridge.transport:
"worker"` discriminator that the new plugin declares.
- `server/__tests__/heartbeat-plugin-environment.test.ts`: pin the
heartbeat path against the new runtime descriptor.

**The Cloudflare plugin itself (`Add Cloudflare sandbox provider
plugin`):**

- `packages/plugins/sandbox-providers/cloudflare/`: plugin entry,
manifest, plugin runtime (lifecycle + bridge client), config parsing,
and Vitest coverage. Manifest declares `bridge.transport: "worker"` so
the platform routes runtime RPCs through the bridge client.
- `bridge-template/`: a Worker template the operator deploys with
`wrangler`. Owns Durable Object-backed sessions (`sessions.ts`),
exec/stream routes (`exec.ts`, `routes.ts`), and an HMAC auth layer
(`auth.ts`) that pins the `Host` header surface. Includes the
SDK-contract-correct exec implementation, lease recovery, and chunked
stdout/stderr streaming.
- Tests cover lease/session handoff (`bridge-template/src/exec.test.ts`,
`routes.test.ts`), bridge client request shaping
(`src/bridge-client.test.ts`), and end-to-end plugin behavior
(`src/plugin.test.ts`) including streamed exec output. 27 tests in
total.
- `README.md` walks the operator through deploying the bridge Worker,
registering the plugin, and configuring the runtime.

## Verification

- `pnpm typecheck`
- `pnpm exec vitest run --no-coverage
packages/adapter-utils/src/sandbox-callback-bridge.test.ts
packages/adapter-utils/src/command-managed-runtime.test.ts
server/src/__tests__/environment-runtime.test.ts
server/src/__tests__/heartbeat-plugin-environment.test.ts`
- `(cd packages/plugins/sandbox-providers/cloudflare && pnpm test)` — 27
passing

For an operator-side smoke test:

1. Deploy the bridge: `cd
packages/plugins/sandbox-providers/cloudflare/bridge-template &&
wrangler deploy`
2. Register the plugin in your Paperclip instance, point its bridge URL
at the deployed Worker, set the HMAC shared secret.
3. Create a sandbox environment whose provider is `cloudflare`, then run
a Codex or Claude job against it.

## Risks

- Adds a new `bridge.transport: "worker"` code path, but the existing
E2B / Daytona transports go through the same shaped helpers and have
explicit test coverage that pins their behavior unchanged.
- The Worker bridge stores session state in a Durable Object; operator
instances must be aware of the corresponding Cloudflare costs (DO
requests, storage). Documented in the README.
- The `issueId` plumbing is optional throughout — existing plugins that
don't supply it continue to work.

## Model Used

- Provider: Anthropic
- Model: Claude Opus 4.7 (1M context)
- Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A, no UI change
- [x] I have updated relevant documentation to reflect my changes
(plugin README, bridge-template README)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 07:33:13 -07:00

231 lines
8.7 KiB
TypeScript

import path from "node:path";
import {
prepareSandboxManagedRuntime,
type PreparedSandboxManagedRuntime,
type SandboxManagedRuntimeAsset,
type SandboxManagedRuntimeClient,
type SandboxRemoteExecutionSpec,
} from "./sandbox-managed-runtime.js";
import { preferredShellForSandbox, shellCommandArgs } from "./sandbox-shell.js";
import type { RunProcessResult } from "./server-utils.js";
export interface CommandManagedRuntimeRunner {
execute(input: {
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
onLog?: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
onSpawn?: (meta: { pid: number; startedAt: string }) => Promise<void>;
}): Promise<RunProcessResult>;
}
export interface CommandManagedRuntimeSpec {
providerKey?: string | null;
shellCommand?: "bash" | "sh" | null;
leaseId?: string | null;
remoteCwd: string;
timeoutMs?: number | null;
}
export type CommandManagedRuntimeAsset = SandboxManagedRuntimeAsset;
function shellQuote(value: string) {
return `'${value.replace(/'/g, `'"'"'`)}'`;
}
function mergeRuntimeExcludes(entries: string[] | undefined): string[] {
return [...new Set([".paperclip-runtime", ...(entries ?? [])])];
}
const REMOTE_WRITE_BASE64_CHUNK_SIZE = 32 * 1024;
function toBuffer(bytes: Buffer | Uint8Array | ArrayBuffer): Buffer {
if (Buffer.isBuffer(bytes)) return bytes;
if (bytes instanceof ArrayBuffer) return Buffer.from(bytes);
return Buffer.from(bytes.buffer, bytes.byteOffset, bytes.byteLength);
}
function requireSuccessfulResult(result: RunProcessResult, action: string): void {
if (result.exitCode === 0 && !result.timedOut) return;
const stderr = result.stderr.trim();
const detail = stderr.length > 0 ? `: ${stderr}` : "";
throw new Error(`${action} failed with exit code ${result.exitCode ?? "null"}${detail}`);
}
export function createCommandManagedRuntimeClient(input: {
runner: CommandManagedRuntimeRunner;
commandCwd: string;
timeoutMs: number;
shellCommand?: "bash" | "sh" | null;
}): SandboxManagedRuntimeClient {
const shellCommand = preferredShellForSandbox(input.shellCommand);
const runShell = async (script: string, opts: { stdin?: string; timeoutMs?: number } = {}) => {
const result = await input.runner.execute({
command: shellCommand,
args: shellCommandArgs(script),
cwd: input.commandCwd,
stdin: opts.stdin,
timeoutMs: opts.timeoutMs ?? input.timeoutMs,
});
requireSuccessfulResult(result, script);
return result;
};
return {
makeDir: async (remotePath) => {
await runShell(`mkdir -p ${shellQuote(remotePath)}`);
},
writeFile: async (remotePath, bytes) => {
const body = toBuffer(bytes).toString("base64");
const remoteDir = path.posix.dirname(remotePath);
const remoteTempPath = `${remotePath}.paperclip-upload.b64`;
await runShell(
`mkdir -p ${shellQuote(remoteDir)} && rm -f ${shellQuote(remoteTempPath)} && : > ${shellQuote(remoteTempPath)}`,
);
for (let offset = 0; offset < body.length; offset += REMOTE_WRITE_BASE64_CHUNK_SIZE) {
const chunk = body.slice(offset, offset + REMOTE_WRITE_BASE64_CHUNK_SIZE);
await runShell(`printf '%s' ${shellQuote(chunk)} >> ${shellQuote(remoteTempPath)}`);
}
await runShell(
`base64 -d < ${shellQuote(remoteTempPath)} > ${shellQuote(remotePath)} && rm -f ${shellQuote(remoteTempPath)}`,
);
},
readFile: async (remotePath) => {
const result = await runShell(`base64 < ${shellQuote(remotePath)}`);
return Buffer.from(result.stdout.replace(/\s+/g, ""), "base64");
},
listFiles: async (remotePath) => {
const result = await runShell(
`if [ -d ${shellQuote(remotePath)} ]; then ` +
`for entry in ${shellQuote(remotePath)}/*; do ` +
`[ -f "$entry" ] || continue; ` +
`basename "$entry"; ` +
`done; ` +
`fi`,
);
return result.stdout
.split(/\r?\n/)
.map((entry) => entry.trim())
.filter((entry) => entry.length > 0)
.sort((left, right) => left.localeCompare(right));
},
remove: async (remotePath) => {
const result = await input.runner.execute({
command: shellCommand,
args: shellCommandArgs(`rm -rf ${shellQuote(remotePath)}`),
cwd: input.commandCwd,
timeoutMs: input.timeoutMs,
});
requireSuccessfulResult(result, `remove ${remotePath}`);
},
run: async (command, options) => {
const result = await input.runner.execute({
command: shellCommand,
args: shellCommandArgs(command),
cwd: input.commandCwd,
timeoutMs: options.timeoutMs,
});
requireSuccessfulResult(result, command);
},
};
}
export async function prepareCommandManagedRuntime(input: {
runner: CommandManagedRuntimeRunner;
spec: CommandManagedRuntimeSpec;
adapterKey: string;
workspaceLocalDir: string;
workspaceRemoteDir?: string;
workspaceExclude?: string[];
preserveAbsentOnRestore?: string[];
assets?: CommandManagedRuntimeAsset[];
installCommand?: string | null;
/** When provided alongside `installCommand`, skip the install if `command -v <detectCommand>` succeeds. */
detectCommand?: string | null;
}): Promise<PreparedSandboxManagedRuntime> {
const timeoutMs = input.spec.timeoutMs && input.spec.timeoutMs > 0 ? input.spec.timeoutMs : 300_000;
const workspaceRemoteDir = input.workspaceRemoteDir ?? input.spec.remoteCwd;
// Managed-runtime sync/restore scripts use absolute paths throughout, so
// run them from a stable cwd. The target workspace itself may be removed or
// recreated during a run, which breaks shell startup if we chdir into it.
const commandCwd = "/";
const runtimeSpec: SandboxRemoteExecutionSpec = {
transport: "sandbox",
provider: input.spec.providerKey ?? "sandbox",
sandboxId: input.spec.leaseId ?? "managed",
remoteCwd: workspaceRemoteDir,
timeoutMs,
apiKey: null,
};
const client = createCommandManagedRuntimeClient({
runner: input.runner,
commandCwd,
timeoutMs,
shellCommand: input.spec.shellCommand,
});
const shellCommand = preferredShellForSandbox(input.spec.shellCommand);
if (input.installCommand?.trim()) {
const installCommand = input.installCommand.trim();
const detectCommand = input.detectCommand?.trim();
// Skip the install when the binary is already on PATH. Without this
// probe the install runs unconditionally on every execute() call (and
// also runs a second time after `ensureAdapterExecutionTargetCommandResolvable`
// has already installed it during the resolvability gate).
if (detectCommand) {
const probe = await input.runner.execute({
command: shellCommand,
args: shellCommandArgs(`command -v ${shellQuote(detectCommand)} >/dev/null 2>&1`),
cwd: commandCwd,
timeoutMs,
});
if (!probe.timedOut && (probe.exitCode ?? 1) === 0) {
return await prepareSandboxManagedRuntime({
spec: runtimeSpec,
client,
adapterKey: input.adapterKey,
workspaceLocalDir: input.workspaceLocalDir,
workspaceRemoteDir,
workspaceExclude: mergeRuntimeExcludes(input.workspaceExclude),
preserveAbsentOnRestore: input.preserveAbsentOnRestore,
assets: input.assets,
});
}
}
const result = await input.runner.execute({
command: shellCommand,
args: shellCommandArgs(installCommand),
cwd: commandCwd,
timeoutMs,
});
// A failed install is not always fatal: the CLI may already be on PATH
// from a previous lease, the template image, or another path entry. Log
// and continue rather than aborting the agent run; downstream code that
// exec's the CLI will surface a clear "command not found" if it is in
// fact missing. The test path's `maybeRunSandboxInstallCommand` already
// honors this contract — keep them consistent.
if (result.timedOut || (result.exitCode ?? 0) !== 0) {
const tail = (text: string) =>
text.split(/\r?\n/).filter((line) => line.trim().length > 0).slice(-3).join(" | ").slice(0, 480);
const reason = result.timedOut ? "timed out" : `exited ${result.exitCode ?? "?"}`;
console.warn(
`[paperclip] managed-runtime install command ${reason}: ${installCommand} :: ${tail(result.stderr || result.stdout)}`,
);
}
}
return await prepareSandboxManagedRuntime({
spec: runtimeSpec,
client,
adapterKey: input.adapterKey,
workspaceLocalDir: input.workspaceLocalDir,
workspaceRemoteDir,
workspaceExclude: mergeRuntimeExcludes(input.workspaceExclude),
preserveAbsentOnRestore: input.preserveAbsentOnRestore,
assets: input.assets,
});
}