Commit graph

2 commits

Author SHA1 Message Date
Devin Foley
ad0bb57350
Fix exe.dev sandbox installs for gemini/opencode local adapters (#5737)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, including
running adapter CLIs inside remote sandboxes
> - The QA matrix in PAPA-316 spins up local-runtime adapters
(claude/gemini/opencode) against both SSH and the new exe.dev sandbox
provider, and "Test" exercises the same install + probe path the real
runtime uses
> - On exe.dev the QA matrix failed at three different points:
SSH/sandbox secret refs would not resolve, gemini-local could not find
npm, and opencode-local installed a binary that was not on the
probe-shell PATH
> - These are all environment-shape issues the runtime should handle,
not regressions in any individual adapter, so they need to be fixed in
the shared install/resolve layer before the matrix can pass
> - This pull request wires the environment id through to secret-ref
resolution, bootstraps npm from a portable Node tarball when the sandbox
image lacks Node, and symlinks the opencode binary into a directory that
non-login shells see
> - The benefit is that the QA matrix passes end-to-end on exe.dev, and
any future sandbox provider that ships without Node or relies on rc-file
PATH wiring gets the same fixes for free

## What Changed

- `server/src/services/environment-execution-target.ts`: pass the
environment `id` into `resolveEnvironmentDriverConfigForRuntime` for
both the sandbox and SSH branches, so `privateKeySecretRef` /
sandbox-provider secret refs (e.g. exe.dev `apiKey`) can resolve against
the secret store at runtime instead of throwing `Runtime secret
resolution requires an environment id`.
- `packages/adapter-utils/src/sandbox-install-command.ts`: extend
`buildSandboxNpmInstallCommand` with an `ENSURE_NPM_PREAMBLE` that, when
`npm` is missing, downloads a portable Node v22 tarball into
`$HOME/.local` and sets `PAPERCLIP_NPM_BOOTSTRAPPED=1` so the install
step skips sudo (sudo's `secure_path` would lose the freshly-installed
`npm` in `$HOME/.local/bin`). Distro-packaged Node from apt-get is
intentionally avoided because it tends to be too old to parse modern JS
syntax used by `@google/gemini-cli`.
- `packages/adapters/gemini-local/src/index.ts`: switch the hardcoded
`npm install -g @google/gemini-cli` to `buildSandboxNpmInstallCommand`,
so gemini-local picks up the same sudo-aware + npm-bootstrap behavior as
the other local adapters.
- `packages/adapters/opencode-local/src/index.ts`: append a step to the
install command that symlinks `$HOME/.opencode/bin/opencode` into
`$HOME/.local/bin`. The upstream installer only adds `~/.opencode/bin`
to PATH via `~/.bashrc`, which non-login `sh -c` probe invocations do
not source.
- `packages/adapter-utils/src/sandbox-install-command.test.ts`: cover
the new preamble plus the unchanged root/sudo/user-prefix branches.

## Verification

- `cd packages/adapter-utils && npm test -- sandbox-install-command`
(passes; new "bootstraps npm from a portable Node tarball when missing"
case is included).
- Manual: ran the in-app `Test` action against the QA matrix dev
instance for `QA exe.dev Claude`, `QA exe.dev Gemini`, and `QA exe.dev
OpenCode` — all three now report `status=pass` including the hello
probe. `QA SSH Claude` also passes; without the environment-id fix, SSH
resolution threw before the wrapper / install fixes could run.
- Suggested reviewer check: re-run the matrix on a fresh exe.dev
environment and confirm the install step no longer hits `npm: command
not found` for gemini and the opencode probe no longer hits `opencode:
command not found`.

## Risks

- Low/medium. The npm bootstrap pins Node `v22.11.0` from
`nodejs.org/dist`; if that URL becomes unreachable the install will fail
with a clear `curl` error rather than corrupting state. The bootstrap
path is only taken when `npm` is genuinely missing, so existing sandbox
images that ship with Node are unaffected.
- The opencode symlink uses `ln -sf` into `$HOME/.local/bin`, which is
created with `mkdir -p`; idempotent on re-install.
- The `id` change is a strict additive: callers previously got
`undefined` and only the secret-ref code paths actually read it. No
behavior change for environments without secret refs.

## Model Used

- Claude (Anthropic), `claude-opus-4-7`, with extended thinking and tool
use enabled. Iterated through the Paperclip QA matrix harness; no other
model assisted.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots (n/a — runtime/install path only)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 14:28:22 -07:00
Devin Foley
b24c6909e8
Harden remote sandbox runtime probes, timeouts, and installs (#5685)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Each agent runs inside a sandbox environment so its CLI is isolated
from the host
> - Sandbox-backed adapter runs go through a small set of shared helpers
— `ensureAdapterExecutionTargetCommandResolvable`, the sandbox callback
bridge runner, and per-adapter `SANDBOX_INSTALL_COMMAND` strings
> - When standing up new sandbox provider plugins, the existing helpers
timed out, missed install fallbacks, or leaned on assumptions that only
held for E2B
> - Local adapters (`claude-local`, `codex-local`, `gemini-local`,
`opencode-local`) needed slightly hardened probes so they could install
themselves and validate inside *any* remote sandbox transport, not just
E2B
> - This pull request bundles those runtime fixes so future sandbox
provider plugins inherit a working baseline
> - The benefit is that adding a new sandbox provider plugin no longer
requires touching adapter-utils or each local-adapter probe — the
supporting infra is already correct

## What Changed

- `packages/adapter-utils/src/execution-target.ts`: introduce
`DEFAULT_REMOTE_SANDBOX_ADAPTER_TIMEOUT_SEC = 1800` and
`resolveAdapterExecutionTargetTimeoutSec(...)`. Local and SSH adapters
keep the historical "0 means no adapter timeout" behavior;
sandbox-backed runs without an explicit `timeoutSec` get an explicit
30-minute default so remote installs and warm-up don't time out at the
per-RPC default. Plumbed `timeoutSec` through
`ensureAdapterExecutionTargetCommandResolvable` so install probes inside
a sandbox honor adapter-level overrides instead of the bridge's 5-minute
default.
- `packages/adapters/opencode-local/src/index.ts`: switch
`SANDBOX_INSTALL_COMMAND` from `npm install -g opencode-ai` to `curl
-fsSL https://opencode.ai/install | bash`. The npm package reifies four
large prebuilt-binary subpackages in parallel even though only one
matches the host arch; on bandwidth-constrained sandboxes that blew
through the 240s install budget. The official installer fetches one
arch-specific binary and adds `$HOME/.opencode/bin` to PATH via
`~/.bashrc`, which the sandbox-callback-bridge login-shell script
already sources.
- `packages/adapters/{claude,codex,gemini,opencode}-local/`: harden
remote-target probes — pass `--skip-git-repo-check` for Codex when
probing outside a repo, normalize permission flags for Claude, and add
`*.remote.test.ts` coverage that exercises the remote-sandbox path
explicitly for each adapter.
- `packages/adapter-utils/src/sandbox-install-command.{ts,test.ts}`
(new): add `buildSandboxNpmInstallCommand` helper.
`server/src/adapters/registry.ts` + new
`server/src/__tests__/adapter-registry.test.ts`: wire adapter install
commands so they fall back to a writable `$HOME/.local` prefix when
global install isn't available.
- `server/src/__tests__/plugin-worker-manager.test.ts` + new
`server/src/__tests__/fixtures/plugin-worker-delayed.cjs`: pin per-call
timeout overrides so plugin worker exec calls honor the caller's timeout
instead of the worker's default.

## Verification

- `pnpm typecheck`
- `pnpm exec vitest run --no-coverage
packages/adapter-utils/src/execution-target-sandbox.test.ts
packages/adapter-utils/src/sandbox-install-command.test.ts`
- `pnpm exec vitest run --no-coverage
server/src/__tests__/plugin-worker-manager.test.ts
server/src/__tests__/adapter-registry.test.ts
server/src/__tests__/claude-local-adapter-environment.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/gemini-local-adapter-environment.test.ts`
- `pnpm exec vitest run --no-coverage
packages/adapters/codex-local/src/server/test.remote.test.ts
packages/adapters/opencode-local/src/server/test.remote.test.ts
packages/adapters/codex-local/src/server/codex-args.test.ts
packages/adapters/codex-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts`

All passing locally.

## Risks

- Touches shared `adapter-utils` and several `*-local` adapters. The
30-minute default applies only when both (a) the target is
`remote+sandbox` and (b) no `timeoutSec` is configured — local + SSH
paths are unchanged. New test coverage was added alongside each behavior
change to pin the contracts.
- Switching OpenCode's install command to the official installer is a
behavior change for any operator running OpenCode inside a remote
sandbox. Local installs are unaffected (the `SANDBOX_INSTALL_COMMAND`
only runs when an adapter is being installed inside a sandbox).
- Low risk overall — no migrations, no API surface change.

## Model Used

- Provider: Anthropic
- Model: Claude Opus 4.7 (1M context)
- Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep),
no code execution beyond local repo commands

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A, no UI change
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 00:31:54 -07:00