paperclip/packages/adapter-utils/src/sandbox-callback-bridge.test.ts

984 lines
34 KiB
TypeScript
Raw Normal View History

Add sandbox callback bridge for remote environment API access (#4801) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, which are isolated from the host network > - Sandboxed agents need to call back to the Paperclip API to report progress, post comments, and update issue status > - But sandbox environments cannot reach the Paperclip server directly because they run in isolated network namespaces > - This PR adds a callback bridge that proxies API requests from the sandbox to the Paperclip server, running as a local HTTP server on the host that forwards authenticated requests > - The bridge is started automatically when an adapter launches a sandbox execution, and torn down when the run completes > - The benefit is sandboxed agents can interact with the Paperclip API without requiring network-level access to the host, enabling E2B and similar providers to work end-to-end ## What Changed - Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a lightweight HTTP bridge server that accepts requests from sandbox environments and proxies them to the Paperclip API with authentication - Added request validation and security policy: the bridge only forwards requests to the configured API URL, validates content types, enforces size limits, and rejects non-API paths - Wired the bridge into all remote adapter execute paths (claude, codex, cursor, gemini, pi) — the bridge starts before the agent process and the bridge URL is passed via environment variables - Updated `environment-execution-target.ts` to prefer the explicit API URL from environment lease metadata for sandbox callback routing - Fixed Claude sandbox runtime setup to work with the bridge configuration - Added comprehensive test coverage for bridge request handling, policy enforcement, and sandbox execution integration - Fixed browser bundling — the bridge module is excluded from the frontend bundle via the adapter-utils index export ## Verification - `pnpm test` — all existing and new tests pass, including bridge unit tests and sandbox execution integration tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run an agent task, verify the agent can post comments and update issue status through the bridge ## Risks - Medium. This is a new network-facing component (HTTP server on localhost). The security policy restricts forwarding to the configured API URL only and validates all requests, but any proxy introduces attack surface. The bridge binds to localhost only and is scoped to the lifetime of a single agent run. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:37:34 -07:00
import { execFile as execFileCallback } from "node:child_process";
import { mkdir, mkdtemp, readFile, readdir, rm, writeFile } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { promisify } from "node:util";
Add Cloudflare sandbox provider plugin (#5687) > _Stacked on top of #5685 → #5686. Diff against master includes commits from earlier PRs in the stack — review focuses on the two new commits (`Extend sandbox callback bridge for Worker-hosted plugins` + `Add Cloudflare sandbox provider plugin`)._ ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent runs in a sandbox environment, and operators choose which provider backs that sandbox — today E2B and Daytona are bundled with the platform > - Cloudflare Workers + Durable Objects + the Sandbox SDK offer a credible new option: globally distributed, cheap idle, and operator-deployable as a single Worker > - To plug it in, Paperclip needs (a) a provider plugin that speaks the `PaperclipPluginManifestV1` lifecycle and (b) a small operator-deployed Worker — the **bridge** — that adapts Paperclip's runtime RPCs to the Cloudflare Sandbox SDK > - The plugin extends the existing sandbox-callback-bridge with a `bridge.transport: "worker"` discriminator so the platform routes runtime RPCs through the Worker bridge instead of the in-process runner > - This pull request adds the plugin, the bridge Worker template, and the supporting adapter-utils + server hooks the new transport needs > - The benefit is that operators can run sandboxes on Cloudflare's edge with no new platform code beyond installing the plugin and deploying the Worker ## What Changed **Shared support (`Extend sandbox callback bridge for Worker-hosted plugins`):** - `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`: expose `expectedHostHeader` so plugin-side bridge clients can verify the canonical request envelope before forwarding. - `packages/adapter-utils/src/command-managed-runtime.{ts,test.ts}`: relax the always-fresh runner construction so callers can re-use a runner across exec calls (Worker-hosted bridges hold the runner inside a Durable Object). - `server/src/services/environment-runtime.ts` + `environment-runtime.test.ts`: route Worker-hosted bridges through the same env-shaping path as E2B and pin the `requestEnv` contract. - `server/src/services/plugin-environment-driver.ts`: thread an optional `issueId` through the runtime descriptor so bridges can scope leases to the originating issue (used by Cloudflare to map a sandbox to the issue/workflow for billing and audit). - `packages/plugins/sdk/src/protocol.ts`: add `issueId?` to `PluginEnvironmentDriverBaseParams` and the new `bridge.transport: "worker"` discriminator that the new plugin declares. - `server/__tests__/heartbeat-plugin-environment.test.ts`: pin the heartbeat path against the new runtime descriptor. **The Cloudflare plugin itself (`Add Cloudflare sandbox provider plugin`):** - `packages/plugins/sandbox-providers/cloudflare/`: plugin entry, manifest, plugin runtime (lifecycle + bridge client), config parsing, and Vitest coverage. Manifest declares `bridge.transport: "worker"` so the platform routes runtime RPCs through the bridge client. - `bridge-template/`: a Worker template the operator deploys with `wrangler`. Owns Durable Object-backed sessions (`sessions.ts`), exec/stream routes (`exec.ts`, `routes.ts`), and an HMAC auth layer (`auth.ts`) that pins the `Host` header surface. Includes the SDK-contract-correct exec implementation, lease recovery, and chunked stdout/stderr streaming. - Tests cover lease/session handoff (`bridge-template/src/exec.test.ts`, `routes.test.ts`), bridge client request shaping (`src/bridge-client.test.ts`), and end-to-end plugin behavior (`src/plugin.test.ts`) including streamed exec output. 27 tests in total. - `README.md` walks the operator through deploying the bridge Worker, registering the plugin, and configuring the runtime. ## Verification - `pnpm typecheck` - `pnpm exec vitest run --no-coverage packages/adapter-utils/src/sandbox-callback-bridge.test.ts packages/adapter-utils/src/command-managed-runtime.test.ts server/src/__tests__/environment-runtime.test.ts server/src/__tests__/heartbeat-plugin-environment.test.ts` - `(cd packages/plugins/sandbox-providers/cloudflare && pnpm test)` — 27 passing For an operator-side smoke test: 1. Deploy the bridge: `cd packages/plugins/sandbox-providers/cloudflare/bridge-template && wrangler deploy` 2. Register the plugin in your Paperclip instance, point its bridge URL at the deployed Worker, set the HMAC shared secret. 3. Create a sandbox environment whose provider is `cloudflare`, then run a Codex or Claude job against it. ## Risks - Adds a new `bridge.transport: "worker"` code path, but the existing E2B / Daytona transports go through the same shaped helpers and have explicit test coverage that pins their behavior unchanged. - The Worker bridge stores session state in a Durable Object; operator instances must be aware of the corresponding Cloudflare costs (DO requests, storage). Documented in the README. - The `issueId` plumbing is optional throughout — existing plugins that don't supply it continue to work. ## Model Used - Provider: Anthropic - Model: Claude Opus 4.7 (1M context) - Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A, no UI change - [x] I have updated relevant documentation to reflect my changes (plugin README, bridge-template README) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 07:33:13 -07:00
import { afterEach, describe, expect, it, vi } from "vitest";
Add sandbox callback bridge for remote environment API access (#4801) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, which are isolated from the host network > - Sandboxed agents need to call back to the Paperclip API to report progress, post comments, and update issue status > - But sandbox environments cannot reach the Paperclip server directly because they run in isolated network namespaces > - This PR adds a callback bridge that proxies API requests from the sandbox to the Paperclip server, running as a local HTTP server on the host that forwards authenticated requests > - The bridge is started automatically when an adapter launches a sandbox execution, and torn down when the run completes > - The benefit is sandboxed agents can interact with the Paperclip API without requiring network-level access to the host, enabling E2B and similar providers to work end-to-end ## What Changed - Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a lightweight HTTP bridge server that accepts requests from sandbox environments and proxies them to the Paperclip API with authentication - Added request validation and security policy: the bridge only forwards requests to the configured API URL, validates content types, enforces size limits, and rejects non-API paths - Wired the bridge into all remote adapter execute paths (claude, codex, cursor, gemini, pi) — the bridge starts before the agent process and the bridge URL is passed via environment variables - Updated `environment-execution-target.ts` to prefer the explicit API URL from environment lease metadata for sandbox callback routing - Fixed Claude sandbox runtime setup to work with the bridge configuration - Added comprehensive test coverage for bridge request handling, policy enforcement, and sandbox execution integration - Fixed browser bundling — the bridge module is excluded from the frontend bundle via the adapter-utils index export ## Verification - `pnpm test` — all existing and new tests pass, including bridge unit tests and sandbox execution integration tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run an agent task, verify the agent can post comments and update issue status through the bridge ## Risks - Medium. This is a new network-facing component (HTTP server on localhost). The security policy restricts forwarding to the configured API URL only and validates all requests, but any proxy introduces attack surface. The bridge binds to localhost only and is scoped to the lifetime of a single agent run. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:37:34 -07:00
import { prepareCommandManagedRuntime } from "./command-managed-runtime.js";
import {
Expand sandbox callback bridge allowlist to cover the documented heartbeat surface (#5324) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - When an agent runs in an e2b sandbox or other non-managed environment, it talks back to the Paperclip server through a per-lease callback bridge that proxies HTTP requests > - The bridge has an allowlist of method/path patterns it will forward; anything outside the list is rejected to keep the bridge tight > - The allowlist had drifted behind what the heartbeat documentation describes as the supported callback surface — several documented endpoints (issue updates, agent-side log emit, work-status writes) were being rejected at the bridge > - This pull request expands the allowlist to cover the documented heartbeat surface and adds tests that pin every newly-allowed pattern, so the doc and the bridge stay in sync > - The benefit is sandboxed runs no longer hit "method not allowed" / "path not allowed" rejections on the documented set of callbacks ## What Changed - `packages/adapter-utils/src/sandbox-callback-bridge.ts`: expand the method/path allowlist to match the documented heartbeat callback surface - `packages/adapter-utils/src/sandbox-callback-bridge.test.ts`: add coverage for every newly-allowed pattern, plus negative cases for patterns that should still be rejected ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils` - `pnpm typecheck` clean - Manual: previously-rejected callbacks from sandboxed runs now succeed end-to-end ## Risks Low. The allowlist only grows; nothing previously allowed is now blocked. Tests pin both the new allowed patterns and that out-of-doc patterns stay rejected. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — new tests cover added patterns + still-rejected negatives - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-05 19:30:11 -07:00
authorizeSandboxCallbackBridgeRequestWithRoutes,
Serialize sandbox callback bridge against concurrent heartbeats (#5326) > **Stacked PR.** This PR's branch carries cumulative content from #5324 (bridge allowlist expand) and #5325 (env sanitization) — the mutex/sha256 logic in this PR sits on top of both. Reviewers should focus on the files this PR's commit touches: `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`, `packages/adapter-utils/src/ssh.ts`, and `packages/adapter-utils/src/ssh-fixture.test.ts`. Will rebase onto `master` and force-push once both prerequisite PRs are merged. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent that runs in a sandbox or via SSH talks back to the Paperclip server through a per-lease callback bridge whose entrypoint script is uploaded to the remote > - When two heartbeats target the same agent on the same machine concurrently, both upload the bridge entrypoint and both write to the same response files — producing torn-write races: `SyntaxError: Identifier 'randomUUID' has already been declared` from a concatenated upload, `mv: cannot stat …` from colliding `.json.tmp` writes, and 0-byte commits from a truncated stdin > - This pull request serializes those operations with a POSIX `mkdir`-mutex (PID liveness check + atomic rename) at the bridge entrypoint upload, applies the same lock to the bridge response writer, forwards stdin into remote ssh commands so the entrypoint payload arrives intact, and verifies a sha256 of the upload before promoting it > - The benefit is concurrent heartbeats no longer corrupt each other's bridge state ## What Changed - `packages/adapter-utils/src/sandbox-callback-bridge.ts`: serialize entrypoint upload and response writes via POSIX `mkdir`-mutex with PID liveness; sha256 the upload before promoting via `mv`; content-skip when the existing entrypoint already matches - `packages/adapter-utils/src/ssh.ts`: forward stdin into remote ssh commands through the SSH managed runtime so `cat > "$remote_upload"` actually receives the base64-encoded entrypoint - `packages/adapter-utils/src/ssh-fixture.test.ts`: cover the stdin-forwarded SSH path - `packages/adapter-utils/src/sandbox-callback-bridge.test.ts`: cover the mutex, content-skip, sha256-verify, and atomic-rename paths ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils` - `pnpm typecheck` clean - Manual: two parallel heartbeats targeting the same SSH agent no longer race on the bridge entrypoint or response files ## Risks Medium. Serializing previously-parallel operations adds latency on the contended path (one heartbeat waits on another), bounded by the entrypoint upload time. The mutex includes PID liveness so a crashed heartbeat doesn't deadlock subsequent ones. Sha256-verify gives a clear "torn upload" failure mode instead of silent 0-byte commits. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — tests cover mutex + sha256-verify + stdin-forwarded ssh - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-05 20:01:04 -07:00
createCommandManagedSandboxCallbackBridgeQueueClient,
Add sandbox callback bridge for remote environment API access (#4801) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, which are isolated from the host network > - Sandboxed agents need to call back to the Paperclip API to report progress, post comments, and update issue status > - But sandbox environments cannot reach the Paperclip server directly because they run in isolated network namespaces > - This PR adds a callback bridge that proxies API requests from the sandbox to the Paperclip server, running as a local HTTP server on the host that forwards authenticated requests > - The bridge is started automatically when an adapter launches a sandbox execution, and torn down when the run completes > - The benefit is sandboxed agents can interact with the Paperclip API without requiring network-level access to the host, enabling E2B and similar providers to work end-to-end ## What Changed - Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a lightweight HTTP bridge server that accepts requests from sandbox environments and proxies them to the Paperclip API with authentication - Added request validation and security policy: the bridge only forwards requests to the configured API URL, validates content types, enforces size limits, and rejects non-API paths - Wired the bridge into all remote adapter execute paths (claude, codex, cursor, gemini, pi) — the bridge starts before the agent process and the bridge URL is passed via environment variables - Updated `environment-execution-target.ts` to prefer the explicit API URL from environment lease metadata for sandbox callback routing - Fixed Claude sandbox runtime setup to work with the bridge configuration - Added comprehensive test coverage for bridge request handling, policy enforcement, and sandbox execution integration - Fixed browser bundling — the bridge module is excluded from the frontend bundle via the adapter-utils index export ## Verification - `pnpm test` — all existing and new tests pass, including bridge unit tests and sandbox execution integration tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run an agent task, verify the agent can post comments and update issue status through the bridge ## Risks - Medium. This is a new network-facing component (HTTP server on localhost). The security policy restricts forwarding to the configured API URL only and validates all requests, but any proxy introduces attack surface. The bridge binds to localhost only and is scoped to the lifetime of a single agent run. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:37:34 -07:00
createFileSystemSandboxCallbackBridgeQueueClient,
createSandboxCallbackBridgeAsset,
createSandboxCallbackBridgeToken,
sandboxCallbackBridgeDirectories,
Serialize sandbox callback bridge against concurrent heartbeats (#5326) > **Stacked PR.** This PR's branch carries cumulative content from #5324 (bridge allowlist expand) and #5325 (env sanitization) — the mutex/sha256 logic in this PR sits on top of both. Reviewers should focus on the files this PR's commit touches: `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`, `packages/adapter-utils/src/ssh.ts`, and `packages/adapter-utils/src/ssh-fixture.test.ts`. Will rebase onto `master` and force-push once both prerequisite PRs are merged. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent that runs in a sandbox or via SSH talks back to the Paperclip server through a per-lease callback bridge whose entrypoint script is uploaded to the remote > - When two heartbeats target the same agent on the same machine concurrently, both upload the bridge entrypoint and both write to the same response files — producing torn-write races: `SyntaxError: Identifier 'randomUUID' has already been declared` from a concatenated upload, `mv: cannot stat …` from colliding `.json.tmp` writes, and 0-byte commits from a truncated stdin > - This pull request serializes those operations with a POSIX `mkdir`-mutex (PID liveness check + atomic rename) at the bridge entrypoint upload, applies the same lock to the bridge response writer, forwards stdin into remote ssh commands so the entrypoint payload arrives intact, and verifies a sha256 of the upload before promoting it > - The benefit is concurrent heartbeats no longer corrupt each other's bridge state ## What Changed - `packages/adapter-utils/src/sandbox-callback-bridge.ts`: serialize entrypoint upload and response writes via POSIX `mkdir`-mutex with PID liveness; sha256 the upload before promoting via `mv`; content-skip when the existing entrypoint already matches - `packages/adapter-utils/src/ssh.ts`: forward stdin into remote ssh commands through the SSH managed runtime so `cat > "$remote_upload"` actually receives the base64-encoded entrypoint - `packages/adapter-utils/src/ssh-fixture.test.ts`: cover the stdin-forwarded SSH path - `packages/adapter-utils/src/sandbox-callback-bridge.test.ts`: cover the mutex, content-skip, sha256-verify, and atomic-rename paths ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils` - `pnpm typecheck` clean - Manual: two parallel heartbeats targeting the same SSH agent no longer race on the bridge entrypoint or response files ## Risks Medium. Serializing previously-parallel operations adds latency on the contended path (one heartbeat waits on another), bounded by the entrypoint upload time. The mutex includes PID liveness so a crashed heartbeat doesn't deadlock subsequent ones. Sha256-verify gives a clear "torn upload" failure mode instead of silent 0-byte commits. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — tests cover mutex + sha256-verify + stdin-forwarded ssh - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-05 20:01:04 -07:00
syncSandboxCallbackBridgeEntrypoint,
Add sandbox callback bridge for remote environment API access (#4801) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, which are isolated from the host network > - Sandboxed agents need to call back to the Paperclip API to report progress, post comments, and update issue status > - But sandbox environments cannot reach the Paperclip server directly because they run in isolated network namespaces > - This PR adds a callback bridge that proxies API requests from the sandbox to the Paperclip server, running as a local HTTP server on the host that forwards authenticated requests > - The bridge is started automatically when an adapter launches a sandbox execution, and torn down when the run completes > - The benefit is sandboxed agents can interact with the Paperclip API without requiring network-level access to the host, enabling E2B and similar providers to work end-to-end ## What Changed - Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a lightweight HTTP bridge server that accepts requests from sandbox environments and proxies them to the Paperclip API with authentication - Added request validation and security policy: the bridge only forwards requests to the configured API URL, validates content types, enforces size limits, and rejects non-API paths - Wired the bridge into all remote adapter execute paths (claude, codex, cursor, gemini, pi) — the bridge starts before the agent process and the bridge URL is passed via environment variables - Updated `environment-execution-target.ts` to prefer the explicit API URL from environment lease metadata for sandbox callback routing - Fixed Claude sandbox runtime setup to work with the bridge configuration - Added comprehensive test coverage for bridge request handling, policy enforcement, and sandbox execution integration - Fixed browser bundling — the bridge module is excluded from the frontend bundle via the adapter-utils index export ## Verification - `pnpm test` — all existing and new tests pass, including bridge unit tests and sandbox execution integration tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run an agent task, verify the agent can post comments and update issue status through the bridge ## Risks - Medium. This is a new network-facing component (HTTP server on localhost). The security policy restricts forwarding to the configured API URL only and validates all requests, but any proxy introduces attack surface. The bridge binds to localhost only and is scoped to the lifetime of a single agent run. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:37:34 -07:00
startSandboxCallbackBridgeServer,
startSandboxCallbackBridgeWorker,
} from "./sandbox-callback-bridge.js";
import type { RunProcessResult } from "./server-utils.js";
const execFile = promisify(execFileCallback);
describe("sandbox callback bridge", () => {
const cleanupDirs: string[] = [];
const cleanupFns: Array<() => Promise<void>> = [];
function createExecRunner() {
return {
execute: async (input: {
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
}): Promise<RunProcessResult> => {
const startedAt = new Date().toISOString();
const env = {
...process.env,
...input.env,
};
Let sandbox providers declare shell defaults (#5114) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents execute in sandboxed remote environments served by pluggable sandbox > providers (E2B today, more later) > - Today every sandbox command runs under `sh -lc` regardless of what the > provider's container actually ships > - That misses bash-only shell init on E2B (which ships bash) and prevents > future providers from declaring a different default — there's no way for a > provider to say "I have bash, use it" > - This PR adds a `shellCommand` field to sandbox execution targets so providers > can declare their preferred shell ("bash" for E2B), threads it through the > sandbox-managed-runtime client, callback bridge, and execution-target shell > helper, and validates the value at the lease-metadata boundary > - The benefit is that sandbox commands run under the right shell on the right > provider, and adding new sandbox providers only needs to declare a shell > preference ## What Changed - Added `packages/adapter-utils/src/sandbox-shell.ts` exporting `preferredShellForSandbox(shellCommand)` (returns `"bash"` if input is `"bash"`, else `"sh"`) - Added `shellCommand?: "bash" | "sh" | null` to `AdapterSandboxExecutionTarget` and `CommandManagedRuntimeSpec`; threaded it through `runAdapterExecutionTargetShellCommand`, `prepareAdapterExecutionTargetRuntime`, and `startAdapterExecutionTargetPaperclipBridge` - `createCommandManagedRuntimeClient`, `prepareCommandManagedRuntime`, and `createCommandManagedSandboxCallbackBridgeQueueClient` now take an optional `shellCommand` and use `preferredShellForSandbox` to pick the shell - `startSandboxCallbackBridgeServer` accepts a `shellCommand` for its server startup, readiness probe, and stop hook - E2B sandbox plugin declares `shellCommand: "bash"` in `leaseMetadata` - `resolveEnvironmentExecutionTarget` reads `shellCommand` from lease metadata (validating against `"bash" | "sh" | null`) - `environment-runtime.ts` adds `"shellCommand"` to `INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS` so the field round-trips through internal plugin config without leaking to external plugin metadata - Updated tests in `command-managed-runtime.test.ts`, `execution-target-sandbox.test.ts`, `sandbox-callback-bridge.test.ts`, `environment-execution-target.test.ts` ## Verification - `pnpm --filter @paperclipai/adapter-utils test` - `pnpm --filter @paperclipai/server test -- environment-execution-target` - `pnpm --filter @paperclipai/sandbox-providers-e2b test` - Manual QA: boot a Paperclip instance, create an E2B-backed environment, run a claude_local agent against it, and confirm the run completes (verifies bash shell semantics flow through the callback bridge end-to-end) ## Risks - E2B sandbox commands now run under `bash -lc` instead of `sh -lc`. Bash is a strict superset for the commands we issue (no busybox-only flags in our shell scripts), so risk is low. The shellCommand field is opt-in via lease metadata — providers that don't declare it stay on `sh`. - New optional field on `CommandManagedRuntimeSpec` and `AdapterSandboxExecutionTarget`. Consumers ignoring the field retain previous behaviour (sh). - Lease metadata now carries an additional field. Existing leases without `shellCommand` resolve to `null` and fall back to sh — backwards compatible. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A (no UI changes) - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 12:19:35 -07:00
const command =
input.command === "sh" ? "/bin/sh" : input.command === "bash" ? "/bin/bash" : input.command;
Add sandbox callback bridge for remote environment API access (#4801) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, which are isolated from the host network > - Sandboxed agents need to call back to the Paperclip API to report progress, post comments, and update issue status > - But sandbox environments cannot reach the Paperclip server directly because they run in isolated network namespaces > - This PR adds a callback bridge that proxies API requests from the sandbox to the Paperclip server, running as a local HTTP server on the host that forwards authenticated requests > - The bridge is started automatically when an adapter launches a sandbox execution, and torn down when the run completes > - The benefit is sandboxed agents can interact with the Paperclip API without requiring network-level access to the host, enabling E2B and similar providers to work end-to-end ## What Changed - Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a lightweight HTTP bridge server that accepts requests from sandbox environments and proxies them to the Paperclip API with authentication - Added request validation and security policy: the bridge only forwards requests to the configured API URL, validates content types, enforces size limits, and rejects non-API paths - Wired the bridge into all remote adapter execute paths (claude, codex, cursor, gemini, pi) — the bridge starts before the agent process and the bridge URL is passed via environment variables - Updated `environment-execution-target.ts` to prefer the explicit API URL from environment lease metadata for sandbox callback routing - Fixed Claude sandbox runtime setup to work with the bridge configuration - Added comprehensive test coverage for bridge request handling, policy enforcement, and sandbox execution integration - Fixed browser bundling — the bridge module is excluded from the frontend bundle via the adapter-utils index export ## Verification - `pnpm test` — all existing and new tests pass, including bridge unit tests and sandbox execution integration tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run an agent task, verify the agent can post comments and update issue status through the bridge ## Risks - Medium. This is a new network-facing component (HTTP server on localhost). The security policy restricts forwarding to the configured API URL only and validates all requests, but any proxy introduces attack surface. The bridge binds to localhost only and is scoped to the lifetime of a single agent run. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:37:34 -07:00
const args = [...(input.args ?? [])];
Let sandbox providers declare shell defaults (#5114) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents execute in sandboxed remote environments served by pluggable sandbox > providers (E2B today, more later) > - Today every sandbox command runs under `sh -lc` regardless of what the > provider's container actually ships > - That misses bash-only shell init on E2B (which ships bash) and prevents > future providers from declaring a different default — there's no way for a > provider to say "I have bash, use it" > - This PR adds a `shellCommand` field to sandbox execution targets so providers > can declare their preferred shell ("bash" for E2B), threads it through the > sandbox-managed-runtime client, callback bridge, and execution-target shell > helper, and validates the value at the lease-metadata boundary > - The benefit is that sandbox commands run under the right shell on the right > provider, and adding new sandbox providers only needs to declare a shell > preference ## What Changed - Added `packages/adapter-utils/src/sandbox-shell.ts` exporting `preferredShellForSandbox(shellCommand)` (returns `"bash"` if input is `"bash"`, else `"sh"`) - Added `shellCommand?: "bash" | "sh" | null` to `AdapterSandboxExecutionTarget` and `CommandManagedRuntimeSpec`; threaded it through `runAdapterExecutionTargetShellCommand`, `prepareAdapterExecutionTargetRuntime`, and `startAdapterExecutionTargetPaperclipBridge` - `createCommandManagedRuntimeClient`, `prepareCommandManagedRuntime`, and `createCommandManagedSandboxCallbackBridgeQueueClient` now take an optional `shellCommand` and use `preferredShellForSandbox` to pick the shell - `startSandboxCallbackBridgeServer` accepts a `shellCommand` for its server startup, readiness probe, and stop hook - E2B sandbox plugin declares `shellCommand: "bash"` in `leaseMetadata` - `resolveEnvironmentExecutionTarget` reads `shellCommand` from lease metadata (validating against `"bash" | "sh" | null`) - `environment-runtime.ts` adds `"shellCommand"` to `INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS` so the field round-trips through internal plugin config without leaking to external plugin metadata - Updated tests in `command-managed-runtime.test.ts`, `execution-target-sandbox.test.ts`, `sandbox-callback-bridge.test.ts`, `environment-execution-target.test.ts` ## Verification - `pnpm --filter @paperclipai/adapter-utils test` - `pnpm --filter @paperclipai/server test -- environment-execution-target` - `pnpm --filter @paperclipai/sandbox-providers-e2b test` - Manual QA: boot a Paperclip instance, create an E2B-backed environment, run a claude_local agent against it, and confirm the run completes (verifies bash shell semantics flow through the callback bridge end-to-end) ## Risks - E2B sandbox commands now run under `bash -lc` instead of `sh -lc`. Bash is a strict superset for the commands we issue (no busybox-only flags in our shell scripts), so risk is low. The shellCommand field is opt-in via lease metadata — providers that don't declare it stay on `sh`. - New optional field on `CommandManagedRuntimeSpec` and `AdapterSandboxExecutionTarget`. Consumers ignoring the field retain previous behaviour (sh). - Lease metadata now carries an additional field. Existing leases without `shellCommand` resolve to `null` and fall back to sh — backwards compatible. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A (no UI changes) - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 12:19:35 -07:00
if (
input.stdin != null &&
(input.command === "sh" || input.command === "bash") &&
Add secrets provider vaults and remote import (#5429) ## Thinking Path > - Paperclip orchestrates AI-agent companies and needs secrets handling to work across local development, hosted operators, and governed agent execution. > - The affected subsystem is the company-scoped secrets control plane: database schema, server services/routes, CLI workflows, and the Secrets settings UI. > - The gap was that secrets were local-only and operators could not manage provider vaults or import existing remote references without exposing plaintext. > - This branch adds provider vault configuration plus an AWS Secrets Manager remote-import path while preserving company boundaries, binding context, and audit trails. > - I kept the PR to a single branch PR, removed unrelated lockfile/package drift, rebased the full branch onto the current `public-gh/master`, and addressed fresh Greptile findings. > - The benefit is a reviewable implementation of provider-backed secrets with focused tests covering provider selection, import conflicts, deleted secret reuse, rotation guards, and AWS signing behavior. ## What Changed - Added provider vault support for company secrets, including provider config storage, default vault handling, health checks, binding usage, access events, and remote import preview/commit. - Added an AWS Secrets Manager provider using SigV4 request signing, bounded request timeouts, namespace guardrails, cached runtime credential resolution, and external-reference linking without plaintext reads. - Added Secrets UI surfaces for vault management and remote import, plus CLI/API documentation for setup and operations. - Stabilized routine webhook secret binding paths and SSH environment-driver fixture bindings discovered during verification. - Addressed Greptile and CI findings: no lockfile/package drift, monotonic migration metadata, disabled-vault default races, soft-deleted secret hiding/recreate behavior, remove behavior with disabled vaults, soft-deleted external-reference re-import, non-active rotation guards, managed-secret soft deletion through PATCH, and per-call AWS SDK credential client churn. - Rebased this branch onto `public-gh/master` at `0e1a5828` and force-pushed with lease to keep this as the single PR for the branch. ## Verification - `git fetch public-gh master` - `git rebase public-gh/master` - `git diff --name-only public-gh/master...HEAD | grep '^pnpm-lock\.yaml$' || true` confirmed `pnpm-lock.yaml` is not in the PR diff. - Confirmed migration ordering: master ends at `0081_optimal_dormammu`; this PR adds `0082_dry_vision` and `0083_company_secret_provider_configs`. - Inspected migrations for repeat safety: new tables/indexes use `IF NOT EXISTS`; foreign keys are guarded by `DO $$ ... IF NOT EXISTS`; column additions use `ADD COLUMN IF NOT EXISTS`. - `pnpm -r typecheck` passed before the Greptile follow-up commits. - `pnpm test:run` ran the full stable Vitest path before the Greptile follow-up commits; it completed with 3 timing-related failures under parallel load: `codex-local-execute.test.ts`, `cursor-local-execute.test.ts`, and `environment-service.test.ts`. - `pnpm --filter @paperclipai/server exec vitest run src/__tests__/codex-local-execute.test.ts src/__tests__/cursor-local-execute.test.ts src/__tests__/environment-service.test.ts` passed on targeted rerun (`24/24`). - `pnpm build` passed before the Greptile follow-up commits. Vite reported existing chunk-size/dynamic-import warnings. - After Greptile follow-up commits: `pnpm --filter @paperclipai/server exec vitest run src/__tests__/secrets-service.test.ts` passed (`26/26`). - After Greptile follow-up commits: `pnpm --filter @paperclipai/server exec vitest run src/__tests__/aws-secrets-manager-provider.test.ts src/__tests__/secrets-service.test.ts` passed (`39/39`). - After Greptile follow-up commits: `pnpm --filter @paperclipai/server typecheck` passed. - Captured Storybook screenshots from `ui/storybook-static` for visual review. - Latest PR checks on `5ca3a5cf`: `policy`, serialized server suites 1/4-4/4, `Canary Dry Run`, `e2e`, `security/snyk`, and `Greptile Review` pass; aggregate `verify` is still registering the completed child checks. - Greptile review loop continued through the latest requested pass; all Greptile review threads are resolved and the latest `Greptile Review` check on `5ca3a5cf` passed with 0 comments added. ## Screenshots Before: the provider-vault and remote-import surfaces did not exist on `master`; these are after-state screenshots from the Storybook fixtures. ![Secrets inventory](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2339-secrets-make-a-plan/doc/pr/5429/secrets-inventory.png) ![Secret binding picker](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2339-secrets-make-a-plan/doc/pr/5429/secret-binding-picker.png) ![Environment editor with secrets](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2339-secrets-make-a-plan/doc/pr/5429/env-editor-with-secrets.png) ## Risks - Migration risk: this adds new secret provider tables and extends existing secret rows. The migrations were checked for monotonic ordering and idempotent guards, but reviewers should still inspect upgrade behavior carefully. - Provider risk: AWS support uses direct SigV4 requests. Automated tests cover signing, request timeouts, vault-config selection, namespace guardrails, pending-version archival, sanitized provider errors, and service-level cleanup paths. A real-vault AWS smoke test remains deployment validation for an operator with AWS credentials rather than an unverified merge blocker in this local branch. - UI risk: the Secrets page and import dialog are large new surfaces; screenshots are included above for reviewer inspection. - Verification risk: the full local stable test command hit parallel-load timing failures, although the exact failed files passed when rerun directly. - Operational risk: remote import intentionally avoids plaintext reads; operators must understand that imported external references resolve at runtime and may fail if AWS permissions change. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent with local shell/tool use in the Paperclip worktree. Exact context-window size was not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [ ] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 18:22:17 -05:00
(args[0] === "-c" || args[0] === "-lc") &&
Let sandbox providers declare shell defaults (#5114) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents execute in sandboxed remote environments served by pluggable sandbox > providers (E2B today, more later) > - Today every sandbox command runs under `sh -lc` regardless of what the > provider's container actually ships > - That misses bash-only shell init on E2B (which ships bash) and prevents > future providers from declaring a different default — there's no way for a > provider to say "I have bash, use it" > - This PR adds a `shellCommand` field to sandbox execution targets so providers > can declare their preferred shell ("bash" for E2B), threads it through the > sandbox-managed-runtime client, callback bridge, and execution-target shell > helper, and validates the value at the lease-metadata boundary > - The benefit is that sandbox commands run under the right shell on the right > provider, and adding new sandbox providers only needs to declare a shell > preference ## What Changed - Added `packages/adapter-utils/src/sandbox-shell.ts` exporting `preferredShellForSandbox(shellCommand)` (returns `"bash"` if input is `"bash"`, else `"sh"`) - Added `shellCommand?: "bash" | "sh" | null` to `AdapterSandboxExecutionTarget` and `CommandManagedRuntimeSpec`; threaded it through `runAdapterExecutionTargetShellCommand`, `prepareAdapterExecutionTargetRuntime`, and `startAdapterExecutionTargetPaperclipBridge` - `createCommandManagedRuntimeClient`, `prepareCommandManagedRuntime`, and `createCommandManagedSandboxCallbackBridgeQueueClient` now take an optional `shellCommand` and use `preferredShellForSandbox` to pick the shell - `startSandboxCallbackBridgeServer` accepts a `shellCommand` for its server startup, readiness probe, and stop hook - E2B sandbox plugin declares `shellCommand: "bash"` in `leaseMetadata` - `resolveEnvironmentExecutionTarget` reads `shellCommand` from lease metadata (validating against `"bash" | "sh" | null`) - `environment-runtime.ts` adds `"shellCommand"` to `INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS` so the field round-trips through internal plugin config without leaking to external plugin metadata - Updated tests in `command-managed-runtime.test.ts`, `execution-target-sandbox.test.ts`, `sandbox-callback-bridge.test.ts`, `environment-execution-target.test.ts` ## Verification - `pnpm --filter @paperclipai/adapter-utils test` - `pnpm --filter @paperclipai/server test -- environment-execution-target` - `pnpm --filter @paperclipai/sandbox-providers-e2b test` - Manual QA: boot a Paperclip instance, create an E2B-backed environment, run a claude_local agent against it, and confirm the run completes (verifies bash shell semantics flow through the callback bridge end-to-end) ## Risks - E2B sandbox commands now run under `bash -lc` instead of `sh -lc`. Bash is a strict superset for the commands we issue (no busybox-only flags in our shell scripts), so risk is low. The shellCommand field is opt-in via lease metadata — providers that don't declare it stay on `sh`. - New optional field on `CommandManagedRuntimeSpec` and `AdapterSandboxExecutionTarget`. Consumers ignoring the field retain previous behaviour (sh). - Lease metadata now carries an additional field. Existing leases without `shellCommand` resolve to `null` and fall back to sh — backwards compatible. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A (no UI changes) - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 12:19:35 -07:00
typeof args[1] === "string"
) {
Add sandbox callback bridge for remote environment API access (#4801) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, which are isolated from the host network > - Sandboxed agents need to call back to the Paperclip API to report progress, post comments, and update issue status > - But sandbox environments cannot reach the Paperclip server directly because they run in isolated network namespaces > - This PR adds a callback bridge that proxies API requests from the sandbox to the Paperclip server, running as a local HTTP server on the host that forwards authenticated requests > - The bridge is started automatically when an adapter launches a sandbox execution, and torn down when the run completes > - The benefit is sandboxed agents can interact with the Paperclip API without requiring network-level access to the host, enabling E2B and similar providers to work end-to-end ## What Changed - Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a lightweight HTTP bridge server that accepts requests from sandbox environments and proxies them to the Paperclip API with authentication - Added request validation and security policy: the bridge only forwards requests to the configured API URL, validates content types, enforces size limits, and rejects non-API paths - Wired the bridge into all remote adapter execute paths (claude, codex, cursor, gemini, pi) — the bridge starts before the agent process and the bridge URL is passed via environment variables - Updated `environment-execution-target.ts` to prefer the explicit API URL from environment lease metadata for sandbox callback routing - Fixed Claude sandbox runtime setup to work with the bridge configuration - Added comprehensive test coverage for bridge request handling, policy enforcement, and sandbox execution integration - Fixed browser bundling — the bridge module is excluded from the frontend bundle via the adapter-utils index export ## Verification - `pnpm test` — all existing and new tests pass, including bridge unit tests and sandbox execution integration tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run an agent task, verify the agent can post comments and update issue status through the bridge ## Risks - Medium. This is a new network-facing component (HTTP server on localhost). The security policy restricts forwarding to the configured API URL only and validates all requests, but any proxy introduces attack surface. The bridge binds to localhost only and is scoped to the lifetime of a single agent run. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:37:34 -07:00
env.PAPERCLIP_TEST_STDIN = input.stdin;
args[1] = `printf '%s' \"$PAPERCLIP_TEST_STDIN\" | (${args[1]})`;
}
try {
const result = await execFile(command, args, {
cwd: input.cwd,
env,
maxBuffer: 32 * 1024 * 1024,
timeout: input.timeoutMs,
});
return {
exitCode: 0,
signal: null,
timedOut: false,
stdout: result.stdout,
stderr: result.stderr,
pid: null,
startedAt,
};
} catch (error) {
const err = error as NodeJS.ErrnoException & {
stdout?: string;
stderr?: string;
code?: string | number | null;
signal?: NodeJS.Signals | null;
killed?: boolean;
};
return {
exitCode: typeof err.code === "number" ? err.code : null,
signal: err.signal ?? null,
timedOut: Boolean(err.killed && input.timeoutMs),
stdout: err.stdout ?? "",
stderr: err.stderr ?? "",
pid: null,
startedAt,
};
}
},
};
}
async function waitForJsonFile(directory: string, timeoutMs = 2_000): Promise<string> {
const deadline = Date.now() + timeoutMs;
while (Date.now() < deadline) {
const entries = await readdir(directory).catch(() => []);
const match = entries.find((entry) => entry.endsWith(".json"));
if (match) return match;
await new Promise((resolve) => setTimeout(resolve, 10));
}
throw new Error(`Timed out waiting for a JSON file in ${directory}.`);
}
afterEach(async () => {
while (cleanupFns.length > 0) {
const cleanup = cleanupFns.pop();
if (!cleanup) continue;
await cleanup().catch(() => undefined);
}
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
it("round-trips localhost bridge requests over the sandbox queue without forwarding the bridge token", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-runtime-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
await mkdir(localWorkspaceDir, { recursive: true });
await mkdir(remoteWorkspaceDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "bridge test\n", "utf8");
const runner = createExecRunner();
const bridgeAsset = await createSandboxCallbackBridgeAsset();
cleanupFns.push(bridgeAsset.cleanup);
const prepared = await prepareCommandManagedRuntime({
runner,
spec: {
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
},
adapterKey: "codex",
workspaceLocalDir: localWorkspaceDir,
assets: [
{
key: "bridge",
localDir: bridgeAsset.localDir,
},
],
});
const queueDir = path.posix.join(prepared.runtimeRootDir, "paperclip-bridge");
const directories = sandboxCallbackBridgeDirectories(queueDir);
const bridgeToken = createSandboxCallbackBridgeToken();
const seenRequests: Array<{
method: string;
path: string;
query: string;
headers: Record<string, string>;
body: string;
}> = [];
const worker = await startSandboxCallbackBridgeWorker({
client: createFileSystemSandboxCallbackBridgeQueueClient(),
queueDir,
authorizeRequest: async (request) =>
request.path === "/api/agents/me" ? null : `Route not allowed: ${request.method} ${request.path}`,
handleRequest: async (request) => {
seenRequests.push({
method: request.method,
path: request.path,
query: request.query,
headers: request.headers,
body: request.body,
});
return {
status: 200,
headers: {
"content-type": "application/json",
etag: '"bridge-rev-1"',
"last-modified": "Tue, 01 Apr 2025 00:00:00 GMT",
},
body: JSON.stringify({
ok: true,
method: request.method,
path: request.path,
}),
};
},
});
cleanupFns.push(async () => {
await worker.stop();
});
const bridge = await startSandboxCallbackBridgeServer({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: prepared.assetDirs.bridge,
queueDir,
bridgeToken,
timeoutMs: 30_000,
});
cleanupFns.push(async () => {
await bridge.stop();
});
const okResponse = await fetch(`${bridge.baseUrl}/api/agents/me?view=compact`, {
headers: {
authorization: `Bearer ${bridgeToken}`,
accept: "application/json",
"if-none-match": '"client-cache-key"',
"x-paperclip-run-id": "run-bridge-1",
"x-bridge-debug": "drop-me",
},
});
expect(okResponse.status).toBe(200);
expect(okResponse.headers.get("content-type")).toContain("application/json");
expect(okResponse.headers.get("etag")).toBe('"bridge-rev-1"');
expect(okResponse.headers.get("last-modified")).toBe("Tue, 01 Apr 2025 00:00:00 GMT");
await expect(okResponse.json()).resolves.toMatchObject({
ok: true,
method: "GET",
path: "/api/agents/me",
});
const deniedResponse = await fetch(`${bridge.baseUrl}/api/issues/issue-1`, {
method: "PATCH",
headers: {
authorization: `Bearer ${bridgeToken}`,
"content-type": "application/json",
},
body: JSON.stringify({ status: "in_progress" }),
});
expect(deniedResponse.status).toBe(403);
await expect(deniedResponse.json()).resolves.toMatchObject({
error: "Route not allowed: PATCH /api/issues/issue-1",
});
const unauthorizedResponse = await fetch(`${bridge.baseUrl}/api/agents/me`, {
headers: {
authorization: "Bearer wrong-token",
},
});
expect(unauthorizedResponse.status).toBe(401);
await expect(unauthorizedResponse.json()).resolves.toMatchObject({
error: "Invalid bridge token.",
});
expect(seenRequests).toHaveLength(1);
expect(seenRequests[0]).toMatchObject({
method: "GET",
path: "/api/agents/me",
query: "?view=compact",
body: "",
headers: {
accept: "application/json",
"if-none-match": '"client-cache-key"',
},
});
expect(seenRequests[0]?.headers.authorization).toBeUndefined();
expect(seenRequests[0]?.headers["x-paperclip-run-id"]).toBeUndefined();
});
it("denies non-allowlisted requests by default", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-default-policy-"));
cleanupDirs.push(rootDir);
const queueDir = path.posix.join(rootDir, "queue");
const directories = sandboxCallbackBridgeDirectories(queueDir);
let handled = 0;
const worker = await startSandboxCallbackBridgeWorker({
client: createFileSystemSandboxCallbackBridgeQueueClient(),
queueDir,
handleRequest: async () => {
handled += 1;
return {
status: 200,
body: "should not happen",
};
},
});
await writeFile(
path.posix.join(directories.requestsDir, "req-1.json"),
`${JSON.stringify({
id: "req-1",
method: "DELETE",
path: "/api/secrets",
query: "",
headers: {},
body: "",
createdAt: new Date().toISOString(),
})}\n`,
"utf8",
);
await worker.stop({ drainTimeoutMs: 1_000 });
const response = JSON.parse(
await readFile(path.posix.join(directories.responsesDir, "req-1.json"), "utf8"),
) as { status: number; body: string };
expect(handled).toBe(0);
expect(response.status).toBe(403);
expect(JSON.parse(response.body)).toEqual({
error: "Route not allowed: DELETE /api/secrets",
});
});
it("drains already-queued requests on stop", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-drain-"));
cleanupDirs.push(rootDir);
const queueDir = path.posix.join(rootDir, "queue");
const directories = sandboxCallbackBridgeDirectories(queueDir);
const processed: string[] = [];
const worker = await startSandboxCallbackBridgeWorker({
client: createFileSystemSandboxCallbackBridgeQueueClient(),
queueDir,
authorizeRequest: async () => null,
handleRequest: async (request) => {
processed.push(request.id);
await new Promise((resolve) => setTimeout(resolve, 25));
return {
status: 200,
body: request.id,
};
},
});
await writeFile(
path.posix.join(directories.requestsDir, "req-a.json"),
`${JSON.stringify({
id: "req-a",
method: "GET",
path: "/api/agents/me",
query: "",
headers: {},
body: "",
createdAt: new Date().toISOString(),
})}\n`,
"utf8",
);
await writeFile(
path.posix.join(directories.requestsDir, "req-b.json"),
`${JSON.stringify({
id: "req-b",
method: "GET",
path: "/api/agents/me",
query: "",
headers: {},
body: "",
createdAt: new Date().toISOString(),
})}\n`,
"utf8",
);
await worker.stop({ drainTimeoutMs: 1_000 });
expect(processed).toEqual(["req-a", "req-b"]);
await expect(readFile(path.posix.join(directories.responsesDir, "req-a.json"), "utf8")).resolves.toContain("\"req-a\"");
await expect(readFile(path.posix.join(directories.responsesDir, "req-b.json"), "utf8")).resolves.toContain("\"req-b\"");
});
it("writes fast 503 responses for queued requests that miss the drain deadline", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-drain-timeout-"));
cleanupDirs.push(rootDir);
const queueDir = path.posix.join(rootDir, "queue");
const directories = sandboxCallbackBridgeDirectories(queueDir);
const processed: string[] = [];
const worker = await startSandboxCallbackBridgeWorker({
client: createFileSystemSandboxCallbackBridgeQueueClient(),
queueDir,
authorizeRequest: async () => null,
handleRequest: async (request) => {
processed.push(request.id);
await new Promise((resolve) => setTimeout(resolve, 100));
return {
status: 200,
body: request.id,
};
},
});
await writeFile(
path.posix.join(directories.requestsDir, "req-a.json"),
`${JSON.stringify({
id: "req-a",
method: "GET",
path: "/api/agents/me",
query: "",
headers: {},
body: "",
createdAt: new Date().toISOString(),
})}\n`,
"utf8",
);
await writeFile(
path.posix.join(directories.requestsDir, "req-b.json"),
`${JSON.stringify({
id: "req-b",
method: "GET",
path: "/api/agents/me",
query: "",
headers: {},
body: "",
createdAt: new Date().toISOString(),
})}\n`,
"utf8",
);
for (let attempt = 0; attempt < 50 && processed.length === 0; attempt += 1) {
await new Promise((resolve) => setTimeout(resolve, 5));
}
await worker.stop({ drainTimeoutMs: 10 });
expect(processed).toEqual(["req-a"]);
await expect(readFile(path.posix.join(directories.responsesDir, "req-a.json"), "utf8")).resolves.toContain("\"req-a\"");
await expect(readFile(path.posix.join(directories.responsesDir, "req-b.json"), "utf8")).resolves.toContain(
"Bridge worker stopped before request could be handled.",
);
});
Stabilize runtime probes and Codex env tests (#5445) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Adapters expose a Test action that probes the configured runtime — install, resolvability, hello — to give operators a fast yes/no on whether an environment is healthy > - The Codex test path was running its hello probe directly without going through the managed-runtime preparation that production runs use, so a healthy production setup could still report a probe failure > - The plugin worker manager wasn't surfacing terminated workers cleanly, leaving the runtime probe waiting on a dead worker until the request timed out > - This pull request routes the Codex test probe through `prepareAdapterExecutionTargetRuntime` (so it sees the same managed Codex home production sees), exposes `commandCwd` on `createCommandManagedRuntimeClient` so callers can target a per-probe directory without leaking the workspace `remoteCwd`, and propagates plugin-worker termination as a usable error instead of a hang > - The benefit is the Codex Test action mirrors production behavior end-to-end, and probes against a terminated plugin worker fail fast instead of timing out ## What Changed - `packages/adapter-utils/src/command-managed-runtime.ts`: rename the `remoteCwd` knob to `commandCwd` so callers can target a per-probe directory without inheriting the workspace cwd; matching test coverage in `command-managed-runtime.test.ts` - `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`: small fixes to keep callback bridge stop semantics deterministic - `packages/adapters/codex-local/src/server/test.ts`: thread the Codex hello probe through `prepareAdapterExecutionTargetRuntime` + `prepareManagedCodexHome` so the probe sees the same managed home production sees; new `test.remote.test.ts` covers the remote probe path - `packages/adapters/cursor-local/src/server/execute.ts`: small probe-side cleanup that aligns with the new commandCwd contract - `server/src/services/plugin-worker-manager.ts`: surface plugin-worker termination as a structured error so callers fail fast; new `plugin-worker-terminated.cjs` fixture and `plugin-worker-manager.test.ts` cases pin the behavior ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils --project @paperclipai/adapter-codex-local --project @paperclipai/adapter-cursor-local --project @paperclipai/server` — 1749/1750 passing (1 unrelated skip) - `pnpm typecheck` clean ## Risks Low–medium. The `remoteCwd → commandCwd` rename is a parameter renaming on an internal helper used only by adapter test/execute paths in this repo. The plugin-worker-terminated path was previously a hang; failing fast may surface latent timeouts as explicit termination errors in callers that already expected them. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — new tests cover commandCwd, plugin-worker termination, and Codex remote test path - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --- > **Stacked PR.** Sits on top of #5444 which adds the per-run runtime API surface this PR builds on. Cumulative diff against `master` includes that PR's content; the files touched by *this* PR's commit are listed under "What Changed" above. Will rebase onto `master` and force-push once #5444 merges.
2026-05-07 14:52:31 -07:00
it("handles SSH queue polling failures without emitting an unhandled rejection", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-ssh-failure-"));
cleanupDirs.push(rootDir);
const queueDir = path.posix.join(rootDir, "queue");
const unhandled: unknown[] = [];
const onUnhandledRejection = (reason: unknown) => {
unhandled.push(reason);
};
process.on("unhandledRejection", onUnhandledRejection);
try {
const worker = await startSandboxCallbackBridgeWorker({
client: {
makeDir: async () => {},
listJsonFiles: async () => {
throw new Error(
"list /remote/.paperclip-runtime/gemini/paperclip-bridge/queue/requests failed with exit code 255: kex_exchange_identification: read: Connection reset by peer",
);
},
readTextFile: async () => {
throw new Error("unexpected readTextFile");
},
writeTextFile: async () => {
throw new Error("unexpected writeTextFile");
},
rename: async () => {
throw new Error("unexpected rename");
},
remove: async () => {},
},
queueDir,
authorizeRequest: async () => null,
handleRequest: async () => ({
status: 200,
body: "ok",
}),
});
await new Promise((resolve) => setTimeout(resolve, 50));
await worker.stop();
expect(unhandled).toEqual([]);
} finally {
process.off("unhandledRejection", onUnhandledRejection);
}
});
Serialize sandbox callback bridge against concurrent heartbeats (#5326) > **Stacked PR.** This PR's branch carries cumulative content from #5324 (bridge allowlist expand) and #5325 (env sanitization) — the mutex/sha256 logic in this PR sits on top of both. Reviewers should focus on the files this PR's commit touches: `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`, `packages/adapter-utils/src/ssh.ts`, and `packages/adapter-utils/src/ssh-fixture.test.ts`. Will rebase onto `master` and force-push once both prerequisite PRs are merged. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent that runs in a sandbox or via SSH talks back to the Paperclip server through a per-lease callback bridge whose entrypoint script is uploaded to the remote > - When two heartbeats target the same agent on the same machine concurrently, both upload the bridge entrypoint and both write to the same response files — producing torn-write races: `SyntaxError: Identifier 'randomUUID' has already been declared` from a concatenated upload, `mv: cannot stat …` from colliding `.json.tmp` writes, and 0-byte commits from a truncated stdin > - This pull request serializes those operations with a POSIX `mkdir`-mutex (PID liveness check + atomic rename) at the bridge entrypoint upload, applies the same lock to the bridge response writer, forwards stdin into remote ssh commands so the entrypoint payload arrives intact, and verifies a sha256 of the upload before promoting it > - The benefit is concurrent heartbeats no longer corrupt each other's bridge state ## What Changed - `packages/adapter-utils/src/sandbox-callback-bridge.ts`: serialize entrypoint upload and response writes via POSIX `mkdir`-mutex with PID liveness; sha256 the upload before promoting via `mv`; content-skip when the existing entrypoint already matches - `packages/adapter-utils/src/ssh.ts`: forward stdin into remote ssh commands through the SSH managed runtime so `cat > "$remote_upload"` actually receives the base64-encoded entrypoint - `packages/adapter-utils/src/ssh-fixture.test.ts`: cover the stdin-forwarded SSH path - `packages/adapter-utils/src/sandbox-callback-bridge.test.ts`: cover the mutex, content-skip, sha256-verify, and atomic-rename paths ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils` - `pnpm typecheck` clean - Manual: two parallel heartbeats targeting the same SSH agent no longer race on the bridge entrypoint or response files ## Risks Medium. Serializing previously-parallel operations adds latency on the contended path (one heartbeat waits on another), bounded by the entrypoint upload time. The mutex includes PID liveness so a crashed heartbeat doesn't deadlock subsequent ones. Sha256-verify gives a clear "torn upload" failure mode instead of silent 0-byte commits. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — tests cover mutex + sha256-verify + stdin-forwarded ssh - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-05 20:01:04 -07:00
it("serializes remote response writes so stop does not recreate a late orphaned response", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-response-lock-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
await mkdir(localWorkspaceDir, { recursive: true });
await mkdir(remoteWorkspaceDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "bridge response lock test\n", "utf8");
const runner = createExecRunner();
const bridgeAsset = await createSandboxCallbackBridgeAsset();
cleanupFns.push(bridgeAsset.cleanup);
const prepared = await prepareCommandManagedRuntime({
runner,
spec: {
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
},
adapterKey: "codex",
workspaceLocalDir: localWorkspaceDir,
assets: [{ key: "bridge", localDir: bridgeAsset.localDir }],
});
const queueDir = path.posix.join(prepared.runtimeRootDir, "paperclip-bridge");
const directories = sandboxCallbackBridgeDirectories(queueDir);
const bridgeToken = createSandboxCallbackBridgeToken();
const seenRequestIds: string[] = [];
const worker = await startSandboxCallbackBridgeWorker({
client: createCommandManagedSandboxCallbackBridgeQueueClient({
runner,
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
}),
queueDir,
authorizeRequest: async () => null,
handleRequest: async (request) => {
seenRequestIds.push(request.id);
Add secrets provider vaults and remote import (#5429) ## Thinking Path > - Paperclip orchestrates AI-agent companies and needs secrets handling to work across local development, hosted operators, and governed agent execution. > - The affected subsystem is the company-scoped secrets control plane: database schema, server services/routes, CLI workflows, and the Secrets settings UI. > - The gap was that secrets were local-only and operators could not manage provider vaults or import existing remote references without exposing plaintext. > - This branch adds provider vault configuration plus an AWS Secrets Manager remote-import path while preserving company boundaries, binding context, and audit trails. > - I kept the PR to a single branch PR, removed unrelated lockfile/package drift, rebased the full branch onto the current `public-gh/master`, and addressed fresh Greptile findings. > - The benefit is a reviewable implementation of provider-backed secrets with focused tests covering provider selection, import conflicts, deleted secret reuse, rotation guards, and AWS signing behavior. ## What Changed - Added provider vault support for company secrets, including provider config storage, default vault handling, health checks, binding usage, access events, and remote import preview/commit. - Added an AWS Secrets Manager provider using SigV4 request signing, bounded request timeouts, namespace guardrails, cached runtime credential resolution, and external-reference linking without plaintext reads. - Added Secrets UI surfaces for vault management and remote import, plus CLI/API documentation for setup and operations. - Stabilized routine webhook secret binding paths and SSH environment-driver fixture bindings discovered during verification. - Addressed Greptile and CI findings: no lockfile/package drift, monotonic migration metadata, disabled-vault default races, soft-deleted secret hiding/recreate behavior, remove behavior with disabled vaults, soft-deleted external-reference re-import, non-active rotation guards, managed-secret soft deletion through PATCH, and per-call AWS SDK credential client churn. - Rebased this branch onto `public-gh/master` at `0e1a5828` and force-pushed with lease to keep this as the single PR for the branch. ## Verification - `git fetch public-gh master` - `git rebase public-gh/master` - `git diff --name-only public-gh/master...HEAD | grep '^pnpm-lock\.yaml$' || true` confirmed `pnpm-lock.yaml` is not in the PR diff. - Confirmed migration ordering: master ends at `0081_optimal_dormammu`; this PR adds `0082_dry_vision` and `0083_company_secret_provider_configs`. - Inspected migrations for repeat safety: new tables/indexes use `IF NOT EXISTS`; foreign keys are guarded by `DO $$ ... IF NOT EXISTS`; column additions use `ADD COLUMN IF NOT EXISTS`. - `pnpm -r typecheck` passed before the Greptile follow-up commits. - `pnpm test:run` ran the full stable Vitest path before the Greptile follow-up commits; it completed with 3 timing-related failures under parallel load: `codex-local-execute.test.ts`, `cursor-local-execute.test.ts`, and `environment-service.test.ts`. - `pnpm --filter @paperclipai/server exec vitest run src/__tests__/codex-local-execute.test.ts src/__tests__/cursor-local-execute.test.ts src/__tests__/environment-service.test.ts` passed on targeted rerun (`24/24`). - `pnpm build` passed before the Greptile follow-up commits. Vite reported existing chunk-size/dynamic-import warnings. - After Greptile follow-up commits: `pnpm --filter @paperclipai/server exec vitest run src/__tests__/secrets-service.test.ts` passed (`26/26`). - After Greptile follow-up commits: `pnpm --filter @paperclipai/server exec vitest run src/__tests__/aws-secrets-manager-provider.test.ts src/__tests__/secrets-service.test.ts` passed (`39/39`). - After Greptile follow-up commits: `pnpm --filter @paperclipai/server typecheck` passed. - Captured Storybook screenshots from `ui/storybook-static` for visual review. - Latest PR checks on `5ca3a5cf`: `policy`, serialized server suites 1/4-4/4, `Canary Dry Run`, `e2e`, `security/snyk`, and `Greptile Review` pass; aggregate `verify` is still registering the completed child checks. - Greptile review loop continued through the latest requested pass; all Greptile review threads are resolved and the latest `Greptile Review` check on `5ca3a5cf` passed with 0 comments added. ## Screenshots Before: the provider-vault and remote-import surfaces did not exist on `master`; these are after-state screenshots from the Storybook fixtures. ![Secrets inventory](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2339-secrets-make-a-plan/doc/pr/5429/secrets-inventory.png) ![Secret binding picker](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2339-secrets-make-a-plan/doc/pr/5429/secret-binding-picker.png) ![Environment editor with secrets](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2339-secrets-make-a-plan/doc/pr/5429/env-editor-with-secrets.png) ## Risks - Migration risk: this adds new secret provider tables and extends existing secret rows. The migrations were checked for monotonic ordering and idempotent guards, but reviewers should still inspect upgrade behavior carefully. - Provider risk: AWS support uses direct SigV4 requests. Automated tests cover signing, request timeouts, vault-config selection, namespace guardrails, pending-version archival, sanitized provider errors, and service-level cleanup paths. A real-vault AWS smoke test remains deployment validation for an operator with AWS credentials rather than an unverified merge blocker in this local branch. - UI risk: the Secrets page and import dialog are large new surfaces; screenshots are included above for reviewer inspection. - Verification risk: the full local stable test command hit parallel-load timing failures, although the exact failed files passed when rerun directly. - Operational risk: remote import intentionally avoids plaintext reads; operators must understand that imported external references resolve at runtime and may fail if AWS permissions change. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent with local shell/tool use in the Paperclip worktree. Exact context-window size was not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [ ] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 18:22:17 -05:00
await new Promise((resolve) => setTimeout(resolve, 250));
Serialize sandbox callback bridge against concurrent heartbeats (#5326) > **Stacked PR.** This PR's branch carries cumulative content from #5324 (bridge allowlist expand) and #5325 (env sanitization) — the mutex/sha256 logic in this PR sits on top of both. Reviewers should focus on the files this PR's commit touches: `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`, `packages/adapter-utils/src/ssh.ts`, and `packages/adapter-utils/src/ssh-fixture.test.ts`. Will rebase onto `master` and force-push once both prerequisite PRs are merged. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent that runs in a sandbox or via SSH talks back to the Paperclip server through a per-lease callback bridge whose entrypoint script is uploaded to the remote > - When two heartbeats target the same agent on the same machine concurrently, both upload the bridge entrypoint and both write to the same response files — producing torn-write races: `SyntaxError: Identifier 'randomUUID' has already been declared` from a concatenated upload, `mv: cannot stat …` from colliding `.json.tmp` writes, and 0-byte commits from a truncated stdin > - This pull request serializes those operations with a POSIX `mkdir`-mutex (PID liveness check + atomic rename) at the bridge entrypoint upload, applies the same lock to the bridge response writer, forwards stdin into remote ssh commands so the entrypoint payload arrives intact, and verifies a sha256 of the upload before promoting it > - The benefit is concurrent heartbeats no longer corrupt each other's bridge state ## What Changed - `packages/adapter-utils/src/sandbox-callback-bridge.ts`: serialize entrypoint upload and response writes via POSIX `mkdir`-mutex with PID liveness; sha256 the upload before promoting via `mv`; content-skip when the existing entrypoint already matches - `packages/adapter-utils/src/ssh.ts`: forward stdin into remote ssh commands through the SSH managed runtime so `cat > "$remote_upload"` actually receives the base64-encoded entrypoint - `packages/adapter-utils/src/ssh-fixture.test.ts`: cover the stdin-forwarded SSH path - `packages/adapter-utils/src/sandbox-callback-bridge.test.ts`: cover the mutex, content-skip, sha256-verify, and atomic-rename paths ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils` - `pnpm typecheck` clean - Manual: two parallel heartbeats targeting the same SSH agent no longer race on the bridge entrypoint or response files ## Risks Medium. Serializing previously-parallel operations adds latency on the contended path (one heartbeat waits on another), bounded by the entrypoint upload time. The mutex includes PID liveness so a crashed heartbeat doesn't deadlock subsequent ones. Sha256-verify gives a clear "torn upload" failure mode instead of silent 0-byte commits. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — tests cover mutex + sha256-verify + stdin-forwarded ssh - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-05 20:01:04 -07:00
return {
status: 200,
headers: { "content-type": "application/json" },
body: JSON.stringify({ ok: true, id: request.id }),
};
},
});
cleanupFns.push(async () => {
await worker.stop();
});
const bridge = await startSandboxCallbackBridgeServer({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: prepared.assetDirs.bridge,
queueDir,
bridgeToken,
timeoutMs: 30_000,
});
cleanupFns.push(async () => {
await bridge.stop();
});
const responsePromise = fetch(`${bridge.baseUrl}/api/agents/me`, {
headers: {
authorization: `Bearer ${bridgeToken}`,
},
});
for (let attempt = 0; attempt < 50 && seenRequestIds.length === 0; attempt += 1) {
await new Promise((resolve) => setTimeout(resolve, 5));
}
expect(seenRequestIds).toHaveLength(1);
await worker.stop({ drainTimeoutMs: 10 });
const response = await responsePromise;
expect(response.status).toBe(503);
await expect(response.json()).resolves.toEqual({
error: "Bridge worker stopped before request could be handled.",
});
Add secrets provider vaults and remote import (#5429) ## Thinking Path > - Paperclip orchestrates AI-agent companies and needs secrets handling to work across local development, hosted operators, and governed agent execution. > - The affected subsystem is the company-scoped secrets control plane: database schema, server services/routes, CLI workflows, and the Secrets settings UI. > - The gap was that secrets were local-only and operators could not manage provider vaults or import existing remote references without exposing plaintext. > - This branch adds provider vault configuration plus an AWS Secrets Manager remote-import path while preserving company boundaries, binding context, and audit trails. > - I kept the PR to a single branch PR, removed unrelated lockfile/package drift, rebased the full branch onto the current `public-gh/master`, and addressed fresh Greptile findings. > - The benefit is a reviewable implementation of provider-backed secrets with focused tests covering provider selection, import conflicts, deleted secret reuse, rotation guards, and AWS signing behavior. ## What Changed - Added provider vault support for company secrets, including provider config storage, default vault handling, health checks, binding usage, access events, and remote import preview/commit. - Added an AWS Secrets Manager provider using SigV4 request signing, bounded request timeouts, namespace guardrails, cached runtime credential resolution, and external-reference linking without plaintext reads. - Added Secrets UI surfaces for vault management and remote import, plus CLI/API documentation for setup and operations. - Stabilized routine webhook secret binding paths and SSH environment-driver fixture bindings discovered during verification. - Addressed Greptile and CI findings: no lockfile/package drift, monotonic migration metadata, disabled-vault default races, soft-deleted secret hiding/recreate behavior, remove behavior with disabled vaults, soft-deleted external-reference re-import, non-active rotation guards, managed-secret soft deletion through PATCH, and per-call AWS SDK credential client churn. - Rebased this branch onto `public-gh/master` at `0e1a5828` and force-pushed with lease to keep this as the single PR for the branch. ## Verification - `git fetch public-gh master` - `git rebase public-gh/master` - `git diff --name-only public-gh/master...HEAD | grep '^pnpm-lock\.yaml$' || true` confirmed `pnpm-lock.yaml` is not in the PR diff. - Confirmed migration ordering: master ends at `0081_optimal_dormammu`; this PR adds `0082_dry_vision` and `0083_company_secret_provider_configs`. - Inspected migrations for repeat safety: new tables/indexes use `IF NOT EXISTS`; foreign keys are guarded by `DO $$ ... IF NOT EXISTS`; column additions use `ADD COLUMN IF NOT EXISTS`. - `pnpm -r typecheck` passed before the Greptile follow-up commits. - `pnpm test:run` ran the full stable Vitest path before the Greptile follow-up commits; it completed with 3 timing-related failures under parallel load: `codex-local-execute.test.ts`, `cursor-local-execute.test.ts`, and `environment-service.test.ts`. - `pnpm --filter @paperclipai/server exec vitest run src/__tests__/codex-local-execute.test.ts src/__tests__/cursor-local-execute.test.ts src/__tests__/environment-service.test.ts` passed on targeted rerun (`24/24`). - `pnpm build` passed before the Greptile follow-up commits. Vite reported existing chunk-size/dynamic-import warnings. - After Greptile follow-up commits: `pnpm --filter @paperclipai/server exec vitest run src/__tests__/secrets-service.test.ts` passed (`26/26`). - After Greptile follow-up commits: `pnpm --filter @paperclipai/server exec vitest run src/__tests__/aws-secrets-manager-provider.test.ts src/__tests__/secrets-service.test.ts` passed (`39/39`). - After Greptile follow-up commits: `pnpm --filter @paperclipai/server typecheck` passed. - Captured Storybook screenshots from `ui/storybook-static` for visual review. - Latest PR checks on `5ca3a5cf`: `policy`, serialized server suites 1/4-4/4, `Canary Dry Run`, `e2e`, `security/snyk`, and `Greptile Review` pass; aggregate `verify` is still registering the completed child checks. - Greptile review loop continued through the latest requested pass; all Greptile review threads are resolved and the latest `Greptile Review` check on `5ca3a5cf` passed with 0 comments added. ## Screenshots Before: the provider-vault and remote-import surfaces did not exist on `master`; these are after-state screenshots from the Storybook fixtures. ![Secrets inventory](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2339-secrets-make-a-plan/doc/pr/5429/secrets-inventory.png) ![Secret binding picker](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2339-secrets-make-a-plan/doc/pr/5429/secret-binding-picker.png) ![Environment editor with secrets](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2339-secrets-make-a-plan/doc/pr/5429/env-editor-with-secrets.png) ## Risks - Migration risk: this adds new secret provider tables and extends existing secret rows. The migrations were checked for monotonic ordering and idempotent guards, but reviewers should still inspect upgrade behavior carefully. - Provider risk: AWS support uses direct SigV4 requests. Automated tests cover signing, request timeouts, vault-config selection, namespace guardrails, pending-version archival, sanitized provider errors, and service-level cleanup paths. A real-vault AWS smoke test remains deployment validation for an operator with AWS credentials rather than an unverified merge blocker in this local branch. - UI risk: the Secrets page and import dialog are large new surfaces; screenshots are included above for reviewer inspection. - Verification risk: the full local stable test command hit parallel-load timing failures, although the exact failed files passed when rerun directly. - Operational risk: remote import intentionally avoids plaintext reads; operators must understand that imported external references resolve at runtime and may fail if AWS permissions change. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent with local shell/tool use in the Paperclip worktree. Exact context-window size was not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [ ] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 18:22:17 -05:00
await new Promise((resolve) => setTimeout(resolve, 300));
Serialize sandbox callback bridge against concurrent heartbeats (#5326) > **Stacked PR.** This PR's branch carries cumulative content from #5324 (bridge allowlist expand) and #5325 (env sanitization) — the mutex/sha256 logic in this PR sits on top of both. Reviewers should focus on the files this PR's commit touches: `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`, `packages/adapter-utils/src/ssh.ts`, and `packages/adapter-utils/src/ssh-fixture.test.ts`. Will rebase onto `master` and force-push once both prerequisite PRs are merged. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent that runs in a sandbox or via SSH talks back to the Paperclip server through a per-lease callback bridge whose entrypoint script is uploaded to the remote > - When two heartbeats target the same agent on the same machine concurrently, both upload the bridge entrypoint and both write to the same response files — producing torn-write races: `SyntaxError: Identifier 'randomUUID' has already been declared` from a concatenated upload, `mv: cannot stat …` from colliding `.json.tmp` writes, and 0-byte commits from a truncated stdin > - This pull request serializes those operations with a POSIX `mkdir`-mutex (PID liveness check + atomic rename) at the bridge entrypoint upload, applies the same lock to the bridge response writer, forwards stdin into remote ssh commands so the entrypoint payload arrives intact, and verifies a sha256 of the upload before promoting it > - The benefit is concurrent heartbeats no longer corrupt each other's bridge state ## What Changed - `packages/adapter-utils/src/sandbox-callback-bridge.ts`: serialize entrypoint upload and response writes via POSIX `mkdir`-mutex with PID liveness; sha256 the upload before promoting via `mv`; content-skip when the existing entrypoint already matches - `packages/adapter-utils/src/ssh.ts`: forward stdin into remote ssh commands through the SSH managed runtime so `cat > "$remote_upload"` actually receives the base64-encoded entrypoint - `packages/adapter-utils/src/ssh-fixture.test.ts`: cover the stdin-forwarded SSH path - `packages/adapter-utils/src/sandbox-callback-bridge.test.ts`: cover the mutex, content-skip, sha256-verify, and atomic-rename paths ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils` - `pnpm typecheck` clean - Manual: two parallel heartbeats targeting the same SSH agent no longer race on the bridge entrypoint or response files ## Risks Medium. Serializing previously-parallel operations adds latency on the contended path (one heartbeat waits on another), bounded by the entrypoint upload time. The mutex includes PID liveness so a crashed heartbeat doesn't deadlock subsequent ones. Sha256-verify gives a clear "torn upload" failure mode instead of silent 0-byte commits. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — tests cover mutex + sha256-verify + stdin-forwarded ssh - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-05 20:01:04 -07:00
await expect(readdir(directories.responsesDir)).resolves.toEqual([]);
await expect(
readdir(directories.responsesDir).then((entries) =>
entries.filter((entry) => entry.endsWith(".tmp") || entry.includes(".paperclip-write.lock")),
),
).resolves.toEqual([]);
});
Add sandbox callback bridge for remote environment API access (#4801) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, which are isolated from the host network > - Sandboxed agents need to call back to the Paperclip API to report progress, post comments, and update issue status > - But sandbox environments cannot reach the Paperclip server directly because they run in isolated network namespaces > - This PR adds a callback bridge that proxies API requests from the sandbox to the Paperclip server, running as a local HTTP server on the host that forwards authenticated requests > - The bridge is started automatically when an adapter launches a sandbox execution, and torn down when the run completes > - The benefit is sandboxed agents can interact with the Paperclip API without requiring network-level access to the host, enabling E2B and similar providers to work end-to-end ## What Changed - Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a lightweight HTTP bridge server that accepts requests from sandbox environments and proxies them to the Paperclip API with authentication - Added request validation and security policy: the bridge only forwards requests to the configured API URL, validates content types, enforces size limits, and rejects non-API paths - Wired the bridge into all remote adapter execute paths (claude, codex, cursor, gemini, pi) — the bridge starts before the agent process and the bridge URL is passed via environment variables - Updated `environment-execution-target.ts` to prefer the explicit API URL from environment lease metadata for sandbox callback routing - Fixed Claude sandbox runtime setup to work with the bridge configuration - Added comprehensive test coverage for bridge request handling, policy enforcement, and sandbox execution integration - Fixed browser bundling — the bridge module is excluded from the frontend bundle via the adapter-utils index export ## Verification - `pnpm test` — all existing and new tests pass, including bridge unit tests and sandbox execution integration tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run an agent task, verify the agent can post comments and update issue status through the bridge ## Risks - Medium. This is a new network-facing component (HTTP server on localhost). The security policy restricts forwarding to the configured API URL only and validates all requests, but any proxy introduces attack surface. The bridge binds to localhost only and is scoped to the lifetime of a single agent run. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:37:34 -07:00
it("rejects non-JSON request bodies and full queues at the bridge server", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-server-guards-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
await mkdir(localWorkspaceDir, { recursive: true });
await mkdir(remoteWorkspaceDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "bridge guard test\n", "utf8");
const runner = createExecRunner();
const bridgeAsset = await createSandboxCallbackBridgeAsset();
cleanupFns.push(bridgeAsset.cleanup);
const prepared = await prepareCommandManagedRuntime({
runner,
spec: {
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
},
adapterKey: "codex",
workspaceLocalDir: localWorkspaceDir,
assets: [{ key: "bridge", localDir: bridgeAsset.localDir }],
});
const queueDir = path.posix.join(prepared.runtimeRootDir, "paperclip-bridge");
const directories = sandboxCallbackBridgeDirectories(queueDir);
const bridgeToken = createSandboxCallbackBridgeToken();
const bridge = await startSandboxCallbackBridgeServer({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: prepared.assetDirs.bridge,
queueDir,
bridgeToken,
timeoutMs: 30_000,
maxQueueDepth: 1,
});
cleanupFns.push(async () => {
await bridge.stop();
});
await writeFile(
path.posix.join(directories.requestsDir, "existing.json"),
`${JSON.stringify({
id: "existing",
method: "GET",
path: "/api/agents/me",
query: "",
headers: {},
body: "",
createdAt: new Date().toISOString(),
})}\n`,
"utf8",
);
const queueFullResponse = await fetch(`${bridge.baseUrl}/api/agents/me`, {
headers: {
authorization: `Bearer ${bridgeToken}`,
},
});
expect(queueFullResponse.status).toBe(503);
await expect(queueFullResponse.json()).resolves.toEqual({
error: "Bridge request queue is full.",
});
await rm(path.posix.join(directories.requestsDir, "existing.json"), { force: true });
const nonJsonResponse = await fetch(`${bridge.baseUrl}/api/issues/issue-1/comments`, {
method: "POST",
headers: {
authorization: `Bearer ${bridgeToken}`,
"content-type": "text/plain",
},
body: "not json",
});
expect(nonJsonResponse.status).toBe(415);
await expect(nonJsonResponse.json()).resolves.toEqual({
error: "Bridge only accepts JSON request bodies.",
});
});
it("returns a 502 when the host response times out", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-timeout-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
await mkdir(localWorkspaceDir, { recursive: true });
await mkdir(remoteWorkspaceDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "bridge timeout test\n", "utf8");
const runner = createExecRunner();
const bridgeAsset = await createSandboxCallbackBridgeAsset();
cleanupFns.push(bridgeAsset.cleanup);
const prepared = await prepareCommandManagedRuntime({
runner,
spec: {
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
},
adapterKey: "codex",
workspaceLocalDir: localWorkspaceDir,
assets: [{ key: "bridge", localDir: bridgeAsset.localDir }],
});
const queueDir = path.posix.join(prepared.runtimeRootDir, "paperclip-bridge");
const bridgeToken = createSandboxCallbackBridgeToken();
const bridge = await startSandboxCallbackBridgeServer({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: prepared.assetDirs.bridge,
queueDir,
bridgeToken,
timeoutMs: 30_000,
pollIntervalMs: 10,
responseTimeoutMs: 75,
});
cleanupFns.push(async () => {
await bridge.stop();
});
const response = await fetch(`${bridge.baseUrl}/api/agents/me`, {
headers: {
authorization: `Bearer ${bridgeToken}`,
},
});
expect(response.status).toBe(502);
await expect(response.json()).resolves.toEqual({
error: "Timed out waiting for host bridge response.",
});
});
it("returns a 502 for malformed host response files", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-malformed-response-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
await mkdir(localWorkspaceDir, { recursive: true });
await mkdir(remoteWorkspaceDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "bridge malformed response test\n", "utf8");
const runner = createExecRunner();
const bridgeAsset = await createSandboxCallbackBridgeAsset();
cleanupFns.push(bridgeAsset.cleanup);
const prepared = await prepareCommandManagedRuntime({
runner,
spec: {
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
},
adapterKey: "codex",
workspaceLocalDir: localWorkspaceDir,
assets: [{ key: "bridge", localDir: bridgeAsset.localDir }],
});
const queueDir = path.posix.join(prepared.runtimeRootDir, "paperclip-bridge");
const directories = sandboxCallbackBridgeDirectories(queueDir);
const bridgeToken = createSandboxCallbackBridgeToken();
const bridge = await startSandboxCallbackBridgeServer({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: prepared.assetDirs.bridge,
queueDir,
bridgeToken,
timeoutMs: 30_000,
pollIntervalMs: 10,
responseTimeoutMs: 1_000,
});
cleanupFns.push(async () => {
await bridge.stop();
});
const responsePromise = fetch(`${bridge.baseUrl}/api/agents/me`, {
headers: {
authorization: `Bearer ${bridgeToken}`,
},
});
const requestFile = await waitForJsonFile(directories.requestsDir);
await writeFile(
path.posix.join(directories.responsesDir, requestFile),
'{"status":200,"headers":{"content-type":"application/json"},"body"',
"utf8",
);
const response = await responsePromise;
expect(response.status).toBe(502);
await expect(response.json()).resolves.toMatchObject({
error: expect.stringMatching(/JSON|Unexpected|Unterminated/i),
});
});
Expand sandbox callback bridge allowlist to cover the documented heartbeat surface (#5324) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - When an agent runs in an e2b sandbox or other non-managed environment, it talks back to the Paperclip server through a per-lease callback bridge that proxies HTTP requests > - The bridge has an allowlist of method/path patterns it will forward; anything outside the list is rejected to keep the bridge tight > - The allowlist had drifted behind what the heartbeat documentation describes as the supported callback surface — several documented endpoints (issue updates, agent-side log emit, work-status writes) were being rejected at the bridge > - This pull request expands the allowlist to cover the documented heartbeat surface and adds tests that pin every newly-allowed pattern, so the doc and the bridge stay in sync > - The benefit is sandboxed runs no longer hit "method not allowed" / "path not allowed" rejections on the documented set of callbacks ## What Changed - `packages/adapter-utils/src/sandbox-callback-bridge.ts`: expand the method/path allowlist to match the documented heartbeat callback surface - `packages/adapter-utils/src/sandbox-callback-bridge.test.ts`: add coverage for every newly-allowed pattern, plus negative cases for patterns that should still be rejected ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils` - `pnpm typecheck` clean - Manual: previously-rejected callbacks from sandboxed runs now succeed end-to-end ## Risks Low. The allowlist only grows; nothing previously allowed is now blocked. Tests pin both the new allowed patterns and that out-of-doc patterns stay rejected. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — new tests cover added patterns + still-rejected negatives - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-05 19:30:11 -07:00
Serialize sandbox callback bridge against concurrent heartbeats (#5326) > **Stacked PR.** This PR's branch carries cumulative content from #5324 (bridge allowlist expand) and #5325 (env sanitization) — the mutex/sha256 logic in this PR sits on top of both. Reviewers should focus on the files this PR's commit touches: `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`, `packages/adapter-utils/src/ssh.ts`, and `packages/adapter-utils/src/ssh-fixture.test.ts`. Will rebase onto `master` and force-push once both prerequisite PRs are merged. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent that runs in a sandbox or via SSH talks back to the Paperclip server through a per-lease callback bridge whose entrypoint script is uploaded to the remote > - When two heartbeats target the same agent on the same machine concurrently, both upload the bridge entrypoint and both write to the same response files — producing torn-write races: `SyntaxError: Identifier 'randomUUID' has already been declared` from a concatenated upload, `mv: cannot stat …` from colliding `.json.tmp` writes, and 0-byte commits from a truncated stdin > - This pull request serializes those operations with a POSIX `mkdir`-mutex (PID liveness check + atomic rename) at the bridge entrypoint upload, applies the same lock to the bridge response writer, forwards stdin into remote ssh commands so the entrypoint payload arrives intact, and verifies a sha256 of the upload before promoting it > - The benefit is concurrent heartbeats no longer corrupt each other's bridge state ## What Changed - `packages/adapter-utils/src/sandbox-callback-bridge.ts`: serialize entrypoint upload and response writes via POSIX `mkdir`-mutex with PID liveness; sha256 the upload before promoting via `mv`; content-skip when the existing entrypoint already matches - `packages/adapter-utils/src/ssh.ts`: forward stdin into remote ssh commands through the SSH managed runtime so `cat > "$remote_upload"` actually receives the base64-encoded entrypoint - `packages/adapter-utils/src/ssh-fixture.test.ts`: cover the stdin-forwarded SSH path - `packages/adapter-utils/src/sandbox-callback-bridge.test.ts`: cover the mutex, content-skip, sha256-verify, and atomic-rename paths ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils` - `pnpm typecheck` clean - Manual: two parallel heartbeats targeting the same SSH agent no longer race on the bridge entrypoint or response files ## Risks Medium. Serializing previously-parallel operations adds latency on the contended path (one heartbeat waits on another), bounded by the entrypoint upload time. The mutex includes PID liveness so a crashed heartbeat doesn't deadlock subsequent ones. Sha256-verify gives a clear "torn upload" failure mode instead of silent 0-byte commits. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — tests cover mutex + sha256-verify + stdin-forwarded ssh - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-05 20:01:04 -07:00
it("reuses an already-uploaded bridge entrypoint when the remote file hash matches", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-sync-"));
cleanupDirs.push(rootDir);
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
const remoteAssetDir = path.posix.join(
remoteWorkspaceDir,
".paperclip-runtime",
"codex",
"paperclip-bridge",
"server",
);
await mkdir(remoteWorkspaceDir, { recursive: true });
const bridgeAsset = await createSandboxCallbackBridgeAsset();
cleanupFns.push(bridgeAsset.cleanup);
const originalSource = await readFile(bridgeAsset.entrypoint, "utf8");
const expandedSource = `${originalSource}\n// bridge payload padding\n`;
await writeFile(bridgeAsset.entrypoint, expandedSource, "utf8");
const runner = createExecRunner();
const first = await syncSandboxCallbackBridgeEntrypoint({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: remoteAssetDir,
bridgeAsset,
timeoutMs: 30_000,
});
const second = await syncSandboxCallbackBridgeEntrypoint({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: remoteAssetDir,
bridgeAsset,
timeoutMs: 30_000,
});
expect(first.uploaded).toBe(true);
expect(second.uploaded).toBe(false);
await expect(readFile(path.posix.join(remoteAssetDir, "paperclip-bridge-server.mjs"), "utf8")).resolves.toBe(expandedSource);
await expect(
readdir(remoteAssetDir).then((entries) =>
entries.filter(
(entry) =>
entry.endsWith(".paperclip-upload.b64") ||
entry.endsWith(".partial") ||
entry === ".paperclip-bridge-upload.lock",
),
),
).resolves.toEqual([]);
});
it("rejects a corrupted bridge entrypoint upload without committing a torn remote file", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-sync-corrupt-"));
cleanupDirs.push(rootDir);
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
const remoteAssetDir = path.posix.join(
remoteWorkspaceDir,
".paperclip-runtime",
"codex",
"paperclip-bridge",
"server",
);
await mkdir(remoteWorkspaceDir, { recursive: true });
const bridgeAsset = await createSandboxCallbackBridgeAsset();
cleanupFns.push(bridgeAsset.cleanup);
const runner = {
execute: async (input: {
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
}) =>
await createExecRunner().execute({
...input,
stdin: input.stdin != null ? "" : input.stdin,
}),
};
await expect(
syncSandboxCallbackBridgeEntrypoint({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: remoteAssetDir,
bridgeAsset,
timeoutMs: 30_000,
}),
).rejects.toThrow(/sha mismatch/i);
await expect(readFile(path.posix.join(remoteAssetDir, "paperclip-bridge-server.mjs"), "utf8")).rejects.toThrow();
await expect(
readdir(remoteAssetDir).then((entries) =>
entries.filter(
(entry) =>
entry.endsWith(".paperclip-upload.b64") ||
entry.endsWith(".partial") ||
entry === ".paperclip-bridge-upload.lock",
),
),
).resolves.toEqual([]);
});
Expand sandbox callback bridge allowlist to cover the documented heartbeat surface (#5324) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - When an agent runs in an e2b sandbox or other non-managed environment, it talks back to the Paperclip server through a per-lease callback bridge that proxies HTTP requests > - The bridge has an allowlist of method/path patterns it will forward; anything outside the list is rejected to keep the bridge tight > - The allowlist had drifted behind what the heartbeat documentation describes as the supported callback surface — several documented endpoints (issue updates, agent-side log emit, work-status writes) were being rejected at the bridge > - This pull request expands the allowlist to cover the documented heartbeat surface and adds tests that pin every newly-allowed pattern, so the doc and the bridge stay in sync > - The benefit is sandboxed runs no longer hit "method not allowed" / "path not allowed" rejections on the documented set of callbacks ## What Changed - `packages/adapter-utils/src/sandbox-callback-bridge.ts`: expand the method/path allowlist to match the documented heartbeat callback surface - `packages/adapter-utils/src/sandbox-callback-bridge.test.ts`: add coverage for every newly-allowed pattern, plus negative cases for patterns that should still be rejected ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils` - `pnpm typecheck` clean - Manual: previously-rejected callbacks from sandboxed runs now succeed end-to-end ## Risks Low. The allowlist only grows; nothing previously allowed is now blocked. Tests pin both the new allowed patterns and that out-of-doc patterns stay rejected. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — new tests cover added patterns + still-rejected negatives - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-05 19:30:11 -07:00
it("permits the documented heartbeat surface and denies unrelated routes", () => {
const allowed: Array<{ method: string; path: string }> = [
{ method: "GET", path: "/api/agents/me" },
{ method: "GET", path: "/api/agents/me/inbox-lite" },
{ method: "GET", path: "/api/agents/me/inbox/mine" },
{ method: "GET", path: "/api/agents/agent-1" },
{ method: "GET", path: "/api/agents/agent-1/skills" },
{ method: "POST", path: "/api/agents/agent-1/skills/sync" },
{ method: "PATCH", path: "/api/agents/agent-1/instructions-path" },
{ method: "GET", path: "/api/companies/co-1" },
{ method: "GET", path: "/api/companies/co-1/dashboard" },
{ method: "GET", path: "/api/companies/co-1/agents" },
{ method: "GET", path: "/api/companies/co-1/issues" },
{ method: "GET", path: "/api/companies/co-1/projects" },
{ method: "GET", path: "/api/companies/co-1/goals" },
{ method: "GET", path: "/api/companies/co-1/org" },
{ method: "GET", path: "/api/companies/co-1/approvals" },
{ method: "GET", path: "/api/companies/co-1/routines" },
{ method: "GET", path: "/api/companies/co-1/skills" },
{ method: "GET", path: "/api/projects/proj-1" },
{ method: "GET", path: "/api/goals/goal-1" },
{ method: "GET", path: "/api/issues/issue-1" },
{ method: "GET", path: "/api/issues/issue-1/heartbeat-context" },
{ method: "GET", path: "/api/issues/issue-1/comments" },
{ method: "GET", path: "/api/issues/issue-1/comments/c-1" },
{ method: "POST", path: "/api/issues/issue-1/comments" },
{ method: "GET", path: "/api/issues/issue-1/documents" },
{ method: "GET", path: "/api/issues/issue-1/documents/plan" },
{ method: "GET", path: "/api/issues/issue-1/documents/plan/revisions" },
{ method: "PUT", path: "/api/issues/issue-1/documents/plan" },
{ method: "POST", path: "/api/issues/issue-1/checkout" },
{ method: "POST", path: "/api/issues/issue-1/release" },
{ method: "PATCH", path: "/api/issues/issue-1" },
{ method: "GET", path: "/api/issues/issue-1/approvals" },
{ method: "GET", path: "/api/issues/issue-1/interactions" },
{ method: "GET", path: "/api/issues/issue-1/interactions/inter-1" },
{ method: "POST", path: "/api/issues/issue-1/interactions" },
{ method: "POST", path: "/api/issues/issue-1/interactions/inter-1/accept" },
{ method: "POST", path: "/api/issues/issue-1/interactions/inter-1/reject" },
{ method: "POST", path: "/api/issues/issue-1/interactions/inter-1/respond" },
{ method: "POST", path: "/api/companies/co-1/issues" },
{ method: "GET", path: "/api/approvals/ap-1" },
{ method: "GET", path: "/api/approvals/ap-1/issues" },
{ method: "GET", path: "/api/approvals/ap-1/comments" },
{ method: "POST", path: "/api/approvals/ap-1/comments" },
{ method: "POST", path: "/api/companies/co-1/approvals" },
{ method: "GET", path: "/api/execution-workspaces/ws-1" },
{ method: "POST", path: "/api/execution-workspaces/ws-1/runtime-services/start" },
{ method: "POST", path: "/api/execution-workspaces/ws-1/runtime-services/stop" },
{ method: "POST", path: "/api/execution-workspaces/ws-1/runtime-services/restart" },
{ method: "GET", path: "/api/routines/r-1" },
{ method: "GET", path: "/api/routines/r-1/runs" },
{ method: "POST", path: "/api/companies/co-1/routines" },
{ method: "PATCH", path: "/api/routines/r-1" },
{ method: "POST", path: "/api/routines/r-1/run" },
{ method: "POST", path: "/api/routines/r-1/triggers" },
{ method: "PATCH", path: "/api/routine-triggers/t-1" },
{ method: "DELETE", path: "/api/routine-triggers/t-1" },
];
for (const request of allowed) {
expect(authorizeSandboxCallbackBridgeRequestWithRoutes(request)).toBeNull();
}
const denied: Array<{ method: string; path: string }> = [
{ method: "DELETE", path: "/api/secrets" },
// Pin the runtime-services regex to start/stop/restart only — anything
// else (delete, reset, wipe, etc.) must stay denied even if the API
// grows new actions later.
{ method: "POST", path: "/api/execution-workspaces/ws-1/runtime-services/delete" },
{ method: "POST", path: "/api/companies/co-1/agents" },
{ method: "POST", path: "/api/agents/agent-1/pause" },
{ method: "POST", path: "/api/agents/agent-1/terminate" },
{ method: "POST", path: "/api/agents/agent-1/keys" },
{ method: "POST", path: "/api/companies/co-1/exports" },
{ method: "POST", path: "/api/companies/co-1/imports/apply" },
{ method: "POST", path: "/api/companies/co-1/archive" },
{ method: "DELETE", path: "/api/issues/issue-1/documents/plan" },
{ method: "DELETE", path: "/api/issues/issue-1/approvals/ap-1" },
{ method: "POST", path: "/api/approvals/ap-1/approve" },
{ method: "POST", path: "/api/approvals/ap-1/reject" },
{ method: "POST", path: "/api/companies/co-1/logo" },
{ method: "GET", path: "/api/companies/co-1/secrets" },
{ method: "PATCH", path: "/api/secrets/secret-1" },
];
for (const request of denied) {
expect(authorizeSandboxCallbackBridgeRequestWithRoutes(request)).toBe(
`Route not allowed: ${request.method} ${request.path}`,
);
}
});
Add Cloudflare sandbox provider plugin (#5687) > _Stacked on top of #5685 → #5686. Diff against master includes commits from earlier PRs in the stack — review focuses on the two new commits (`Extend sandbox callback bridge for Worker-hosted plugins` + `Add Cloudflare sandbox provider plugin`)._ ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent runs in a sandbox environment, and operators choose which provider backs that sandbox — today E2B and Daytona are bundled with the platform > - Cloudflare Workers + Durable Objects + the Sandbox SDK offer a credible new option: globally distributed, cheap idle, and operator-deployable as a single Worker > - To plug it in, Paperclip needs (a) a provider plugin that speaks the `PaperclipPluginManifestV1` lifecycle and (b) a small operator-deployed Worker — the **bridge** — that adapts Paperclip's runtime RPCs to the Cloudflare Sandbox SDK > - The plugin extends the existing sandbox-callback-bridge with a `bridge.transport: "worker"` discriminator so the platform routes runtime RPCs through the Worker bridge instead of the in-process runner > - This pull request adds the plugin, the bridge Worker template, and the supporting adapter-utils + server hooks the new transport needs > - The benefit is that operators can run sandboxes on Cloudflare's edge with no new platform code beyond installing the plugin and deploying the Worker ## What Changed **Shared support (`Extend sandbox callback bridge for Worker-hosted plugins`):** - `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`: expose `expectedHostHeader` so plugin-side bridge clients can verify the canonical request envelope before forwarding. - `packages/adapter-utils/src/command-managed-runtime.{ts,test.ts}`: relax the always-fresh runner construction so callers can re-use a runner across exec calls (Worker-hosted bridges hold the runner inside a Durable Object). - `server/src/services/environment-runtime.ts` + `environment-runtime.test.ts`: route Worker-hosted bridges through the same env-shaping path as E2B and pin the `requestEnv` contract. - `server/src/services/plugin-environment-driver.ts`: thread an optional `issueId` through the runtime descriptor so bridges can scope leases to the originating issue (used by Cloudflare to map a sandbox to the issue/workflow for billing and audit). - `packages/plugins/sdk/src/protocol.ts`: add `issueId?` to `PluginEnvironmentDriverBaseParams` and the new `bridge.transport: "worker"` discriminator that the new plugin declares. - `server/__tests__/heartbeat-plugin-environment.test.ts`: pin the heartbeat path against the new runtime descriptor. **The Cloudflare plugin itself (`Add Cloudflare sandbox provider plugin`):** - `packages/plugins/sandbox-providers/cloudflare/`: plugin entry, manifest, plugin runtime (lifecycle + bridge client), config parsing, and Vitest coverage. Manifest declares `bridge.transport: "worker"` so the platform routes runtime RPCs through the bridge client. - `bridge-template/`: a Worker template the operator deploys with `wrangler`. Owns Durable Object-backed sessions (`sessions.ts`), exec/stream routes (`exec.ts`, `routes.ts`), and an HMAC auth layer (`auth.ts`) that pins the `Host` header surface. Includes the SDK-contract-correct exec implementation, lease recovery, and chunked stdout/stderr streaming. - Tests cover lease/session handoff (`bridge-template/src/exec.test.ts`, `routes.test.ts`), bridge client request shaping (`src/bridge-client.test.ts`), and end-to-end plugin behavior (`src/plugin.test.ts`) including streamed exec output. 27 tests in total. - `README.md` walks the operator through deploying the bridge Worker, registering the plugin, and configuring the runtime. ## Verification - `pnpm typecheck` - `pnpm exec vitest run --no-coverage packages/adapter-utils/src/sandbox-callback-bridge.test.ts packages/adapter-utils/src/command-managed-runtime.test.ts server/src/__tests__/environment-runtime.test.ts server/src/__tests__/heartbeat-plugin-environment.test.ts` - `(cd packages/plugins/sandbox-providers/cloudflare && pnpm test)` — 27 passing For an operator-side smoke test: 1. Deploy the bridge: `cd packages/plugins/sandbox-providers/cloudflare/bridge-template && wrangler deploy` 2. Register the plugin in your Paperclip instance, point its bridge URL at the deployed Worker, set the HMAC shared secret. 3. Create a sandbox environment whose provider is `cloudflare`, then run a Codex or Claude job against it. ## Risks - Adds a new `bridge.transport: "worker"` code path, but the existing E2B / Daytona transports go through the same shaped helpers and have explicit test coverage that pins their behavior unchanged. - The Worker bridge stores session state in a Durable Object; operator instances must be aware of the corresponding Cloudflare costs (DO requests, storage). Documented in the README. - The `issueId` plumbing is optional throughout — existing plugins that don't supply it continue to work. ## Model Used - Provider: Anthropic - Model: Claude Opus 4.7 (1M context) - Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A, no UI change - [x] I have updated relevant documentation to reflect my changes (plugin README, bridge-template README) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 07:33:13 -07:00
it("marks command-managed bridge operations with the bridge execution channel", async () => {
const runner = {
execute: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: "",
stderr: "",
pid: null,
startedAt: new Date().toISOString(),
})),
};
const client = createCommandManagedSandboxCallbackBridgeQueueClient({
runner,
remoteCwd: "/workspace",
timeoutMs: 30_000,
});
await client.makeDir("/workspace/.paperclip-runtime/codex/paperclip-bridge/queue");
expect(runner.execute).toHaveBeenCalledWith(expect.objectContaining({
env: {
PAPERCLIP_SANDBOX_EXEC_CHANNEL: "bridge",
},
}));
});
Add sandbox callback bridge for remote environment API access (#4801) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, which are isolated from the host network > - Sandboxed agents need to call back to the Paperclip API to report progress, post comments, and update issue status > - But sandbox environments cannot reach the Paperclip server directly because they run in isolated network namespaces > - This PR adds a callback bridge that proxies API requests from the sandbox to the Paperclip server, running as a local HTTP server on the host that forwards authenticated requests > - The bridge is started automatically when an adapter launches a sandbox execution, and torn down when the run completes > - The benefit is sandboxed agents can interact with the Paperclip API without requiring network-level access to the host, enabling E2B and similar providers to work end-to-end ## What Changed - Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a lightweight HTTP bridge server that accepts requests from sandbox environments and proxies them to the Paperclip API with authentication - Added request validation and security policy: the bridge only forwards requests to the configured API URL, validates content types, enforces size limits, and rejects non-API paths - Wired the bridge into all remote adapter execute paths (claude, codex, cursor, gemini, pi) — the bridge starts before the agent process and the bridge URL is passed via environment variables - Updated `environment-execution-target.ts` to prefer the explicit API URL from environment lease metadata for sandbox callback routing - Fixed Claude sandbox runtime setup to work with the bridge configuration - Added comprehensive test coverage for bridge request handling, policy enforcement, and sandbox execution integration - Fixed browser bundling — the bridge module is excluded from the frontend bundle via the adapter-utils index export ## Verification - `pnpm test` — all existing and new tests pass, including bridge unit tests and sandbox execution integration tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run an agent task, verify the agent can post comments and update issue status through the bridge ## Risks - Medium. This is a new network-facing component (HTTP server on localhost). The security policy restricts forwarding to the configured API URL only and validates all requests, but any proxy introduces attack surface. The bridge binds to localhost only and is scoped to the lifetime of a single agent run. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:37:34 -07:00
});