mirror of https://github.com/alkimake/paperclip.git
synced 2026-06-14 01:50:39 +09:00
Harden Cloudflare sandbox execution (#5967)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Remote-managed adapters need sandbox/environment execution to behave like real agent runs, not just local host probes.
> - The Cloudflare sandbox path was the weakest leg in the SSH + Cloudflare QA matrix because bridge execution could truncate output, time out long-running installs, and under-provision the worker instance.
> - That made several adapters fail for reasons unrelated to their actual business logic, which blocks confidence in Paperclip's non-local environment model.
> - This pull request hardens the Cloudflare bridge/runtime path and adjusts sandbox probe budgets so adapter verification matches the measured behavior of the fixed environment.
> - It also corrects the Pi sandbox install command so the QA matrix exercises a real, supported install path.
> - The benefit is a materially more reliable SSH + Cloudflare adapter matrix with fewer false negatives and clearer failure boundaries.

## What Changed

- Switched the Cloudflare bridge worker instance type to `standard-2` for the QA-matrix execution path.
- Raised Cloudflare bridge/plugin-worker timeout budgets and added SSE keepalives so long-running install/exec calls can complete instead of dying at the transport layer.
- Fixed Cloudflare bridge-channel command handling to avoid dropped final stdout chunks on short-lived execs.
- Made Claude, OpenCode, and Cursor sandbox probe timeouts configurable/sandbox-aware, then tightened the defaults to the measured post-fix range.
- Updated the Pi sandbox install command to use the package currently installed by the official `pi.dev` installer, pinned to a specific npm version.
- Added/updated tests around Cloudflare bridge behavior and adapter sandbox probe paths.
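The SSE keepalive item above can be illustrated with a minimal sketch. This is not the bridge's actual code: `keepaliveFrame`, `dataFrame`, `streamWithKeepalive`, and the 15-second default interval are all hypothetical names and values chosen for illustration. The only fact relied on is that, in the server-sent events format, a line starting with `:` is a comment that clients ignore, which makes it usable as a heartbeat while a long-running call is still working.

```typescript
// Hypothetical sketch; none of these names come from the Paperclip codebase.

/** SSE comment frame: clients ignore it, but the bytes keep the connection active. */
function keepaliveFrame(label = "keepalive"): string {
  return `: ${label}\n\n`;
}

/** SSE data frame carrying a JSON-serializable payload on a single `data:` line. */
function dataFrame(payload: unknown): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

/**
 * Runs `work` while writing a keepalive frame every `intervalMs` (assumed default),
 * then writes the result as a data frame. The timer is cleared on success or failure.
 */
async function streamWithKeepalive<T>(
  write: (chunk: string) => void,
  work: () => Promise<T>,
  intervalMs = 15_000,
): Promise<T> {
  const timer = setInterval(() => write(keepaliveFrame()), intervalMs);
  try {
    const result = await work();
    write(dataFrame(result));
    return result;
  } finally {
    clearInterval(timer);
  }
}
```

The design point is that the heartbeat is independent of the work itself, so a slow install produces traffic on the stream even when it has no output yet.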
## Verification

- `pnpm --filter @paperclipai/adapter-claude-local typecheck`
- `pnpm --filter @paperclipai/adapter-opencode-local typecheck`
- `pnpm --filter @paperclipai/adapter-cursor-local typecheck`
- `pnpm vitest run packages/adapters/cursor-local packages/adapters/claude-local packages/adapters/opencode-local packages/adapters/pi-local packages/plugins/sandbox-providers/cloudflare server/src/services/__tests__/plugin-worker-manager.test.ts`
- Manual QA on the dedicated dev instance using the SSH + Cloudflare environment matrix (`ENV-29` through `ENV-40`). Clean end-to-end passes: SSH `claude_local`, `codex_local`, `cursor`, `gemini_local`; Cloudflare `claude_local`, `codex_local`, `cursor`, `gemini_local`.

## Risks

- Cloudflare sandbox cost increases because the bridge worker now runs on `standard-2` instead of `lite`.
- Higher timeout ceilings can delay surfacing truly hung Cloudflare bridge calls, even though they remove transport-level false negatives.
- The manual heartbeat matrix still exposed follow-on execution/sync/disposition bugs in `opencode_local` and `pi_local`; those are not fixed by this PR.

## Model Used

- OpenAI `gpt-5.4` via Paperclip `codex_local`, reasoning effort `high`, tool use enabled, repo search enabled.

## Checklist

- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after screenshots (not applicable)
- [x] I have updated relevant documentation to reflect my changes (not applicable)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
parent f4bed4a70f
commit 1bd44c8a0d
10 changed files with 113 additions and 12 deletions
@@ -9,6 +9,7 @@ import type {
 import type { AdapterExecutionTarget } from "@paperclipai/adapter-utils/execution-target";
 import {
   asBoolean,
+  asNumber,
   asString,
   asStringArray,
   parseObject,
@@ -72,6 +73,7 @@ export async function testEnvironment(
   const command = asString(config.command, "opencode");
   const target = ctx.executionTarget ?? null;
   const targetIsRemote = target?.kind === "remote";
+  const targetIsSandbox = target?.kind === "remote" && target.transport === "sandbox";
   const cwd = resolveAdapterExecutionTargetCwd(target, asString(config.cwd, ""), process.cwd());
   const targetLabel = targetIsRemote
     ? ctx.environmentName ?? describeAdapterExecutionTarget(target)
@@ -334,6 +336,14 @@ export async function testEnvironment(
       if (variant) args.push("--variant", variant);
       if (extraArgs.length > 0) args.push(...extraArgs);
 
+      // Sandbox bridges still add cold-start and transport overhead, but the
+      // standard-2 Cloudflare tier now probes quickly enough that 90s keeps
+      // useful headroom without letting slow hangs linger.
+      const helloProbeTimeoutSec = Math.max(
+        1,
+        asNumber(config.helloProbeTimeoutSec, targetIsSandbox ? 90 : 60),
+      );
+
       try {
         const probe = await runAdapterExecutionTargetProcess(
           runId,
@@ -343,7 +353,7 @@ export async function testEnvironment(
           {
             cwd: runtimeCwd,
             env: runtimeEnv,
-            timeoutSec: 60,
+            timeoutSec: helloProbeTimeoutSec,
             graceSec: 5,
             stdin: "Respond with hello.",
             onLog: async () => {},
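
The timeout resolution in the last two hunks can be read in isolation as follows. This is a sketch under stated assumptions: `asNumber` here is a local stand-in whose behavior (return the value if it is a finite number, else the fallback) is assumed rather than copied from `@paperclipai/adapter-utils`, and `resolveHelloProbeTimeoutSec` is a hypothetical wrapper name. The defaults (90 seconds for sandbox targets, 60 otherwise) and the floor of 1 second come from the diff.

```typescript
// Stand-in for the imported `asNumber` helper; its real semantics are assumed.
function asNumber(value: unknown, fallback: number): number {
  return typeof value === "number" && Number.isFinite(value) ? value : fallback;
}

// Hypothetical wrapper around the expression added in the diff.
function resolveHelloProbeTimeoutSec(
  config: Record<string, unknown>,
  targetIsSandbox: boolean,
): number {
  // An explicit config value wins; otherwise sandbox targets get the larger default.
  const configured = asNumber(config["helloProbeTimeoutSec"], targetIsSandbox ? 90 : 60);
  // Clamp to at least 1 second so a zero or negative setting cannot disable the probe.
  return Math.max(1, configured);
}
```

Under these assumptions, an empty config resolves to 90 for a sandbox target and 60 otherwise, an explicit `helloProbeTimeoutSec: 120` is used as given, and `helloProbeTimeoutSec: 0` is raised to 1.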