Let sandbox providers declare shell defaults (#5114)

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents execute in sandboxed remote environments served by pluggable sandbox
>   providers (E2B today, more later)
> - Today every sandbox command runs under `sh -lc` regardless of what the
>   provider's container actually ships
> - That misses bash-only shell init on E2B (which ships bash) and prevents
>   future providers from declaring a different default; there is no way for a
>   provider to say "I have bash, use it"
> - This PR adds a `shellCommand` field to sandbox execution targets so providers
>   can declare their preferred shell (`"bash"` for E2B), threads it through the
>   sandbox-managed-runtime client, callback bridge, and execution-target shell
>   helper, and validates the value at the lease-metadata boundary
> - The benefit is that sandbox commands run under the right shell on the right
>   provider, and a new sandbox provider only needs to declare a shell
>   preference

## What Changed

- Added `packages/adapter-utils/src/sandbox-shell.ts` exporting
  `preferredShellForSandbox(shellCommand)` (returns `"bash"` if input is
  `"bash"`, else `"sh"`)
- Added `shellCommand?: "bash" | "sh" | null` to `AdapterSandboxExecutionTarget`
  and `CommandManagedRuntimeSpec`; threaded it through
  `runAdapterExecutionTargetShellCommand`,
  `prepareAdapterExecutionTargetRuntime`, and
  `startAdapterExecutionTargetPaperclipBridge`
- `createCommandManagedRuntimeClient`, `prepareCommandManagedRuntime`, and
  `createCommandManagedSandboxCallbackBridgeQueueClient` now take an optional
  `shellCommand` and use `preferredShellForSandbox` to pick the shell
- `startSandboxCallbackBridgeServer` accepts a `shellCommand` for its server
  startup, readiness probe, and stop hook
- E2B sandbox plugin declares `shellCommand: "bash"` in `leaseMetadata`
- `resolveEnvironmentExecutionTarget` reads `shellCommand` from lease metadata
  (validating against `"bash" | "sh" | null`)
- `environment-runtime.ts` adds `"shellCommand"` to
  `INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS` so the field round-trips through
  internal plugin config without leaking to external plugin metadata
- Updated tests in `command-managed-runtime.test.ts`,
  `execution-target-sandbox.test.ts`, `sandbox-callback-bridge.test.ts`, and
  `environment-execution-target.test.ts`
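
As a rough illustration of the selection rule described in the first bullet, the helper could look like the sketch below. This is an assumption reconstructed from the description, not the contents of `sandbox-shell.ts`; `buildShellInvocation` and the `SandboxShell` alias are hypothetical names used only to show how the result feeds the `-lc` invocation.

```typescript
// Sketch only: reconstructed from the PR description; the real helper in
// packages/adapter-utils/src/sandbox-shell.ts may differ in detail.
type SandboxShell = "bash" | "sh";

function preferredShellForSandbox(shellCommand?: string | null): SandboxShell {
  // Only an explicit "bash" opts in; undefined, null, and anything else use "sh".
  return shellCommand === "bash" ? "bash" : "sh";
}

// Hypothetical helper (not part of the PR): shows how the chosen shell
// replaces the previously hard-coded "sh" in a runner invocation.
function buildShellInvocation(
  script: string,
  shellCommand?: SandboxShell | null,
): { command: SandboxShell; args: string[] } {
  return { command: preferredShellForSandbox(shellCommand), args: ["-lc", script] };
}

console.log(buildShellInvocation("echo hi", "bash"));
console.log(buildShellInvocation("echo hi", null));
```

Because the helper normalizes every non-`"bash"` input to `"sh"`, callers can pass the optional field straight through without their own null checks.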

## Verification

- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test -- environment-execution-target`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: boot a Paperclip instance, create an E2B-backed environment, run a
  `claude_local` agent against it, and confirm the run completes (verifies bash
  shell semantics flow through the callback bridge end-to-end)

## Risks

- E2B sandbox commands now run under `bash -lc` instead of `sh -lc`. Bash is a
  strict superset for the commands we issue (no busybox-only flags in our shell
  scripts), so risk is low. The `shellCommand` field is opt-in via lease
  metadata; providers that don't declare it stay on `sh`.
- New optional field on `CommandManagedRuntimeSpec` and
  `AdapterSandboxExecutionTarget`. Consumers ignoring the field retain the
  previous behaviour (`sh`).
- Lease metadata now carries an additional field. Existing leases without
  `shellCommand` resolve to `null` and fall back to `sh`, so the change is
  backwards compatible.
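
To make the backwards-compatibility claim concrete, here is a minimal sketch of how a lease-metadata read could behave. The function name `readShellCommandFromLeaseMetadata` and the metadata shape are hypothetical; the actual validation lives in `resolveEnvironmentExecutionTarget`. Treating unrecognized values as `null` is an assumption, since the description only states that missing values resolve to `null`.

```typescript
type SandboxShell = "bash" | "sh";

// Hypothetical validator: accepts untyped lease metadata and returns a
// recognized shell, or null when the field is absent.
// Assumption: unrecognized values are also mapped to null.
function readShellCommandFromLeaseMetadata(
  metadata: Record<string, unknown> | null | undefined,
): SandboxShell | null {
  const value = metadata?.shellCommand;
  return value === "bash" || value === "sh" ? value : null;
}

// A lease created before this change has no shellCommand, so it ends up on "sh".
const legacyLease: Record<string, unknown> = { sandboxId: "example" };
const e2bLease: Record<string, unknown> = { sandboxId: "example", shellCommand: "bash" };

const legacyShell: SandboxShell = readShellCommandFromLeaseMetadata(legacyLease) ?? "sh";
const e2bShell: SandboxShell = readShellCommandFromLeaseMetadata(e2bLease) ?? "sh";

console.log(legacyShell, e2bShell);
```

Validating at the boundary keeps the narrower `"bash" | "sh" | null` type intact downstream, so the runtime clients never see arbitrary strings from stored metadata.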

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after screenshots: N/A (no UI changes)
- [ ] I have updated relevant documentation to reflect my changes: N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge
Devin Foley 2026-05-03 12:19:35 -07:00 committed by GitHub
parent 15eac43b43
commit a7b45938b7
GPG key ID: B5690EEEBB952194
11 changed files with 133 additions and 14 deletions


@@ -4,6 +4,7 @@ import os from "node:os";
 import path from "node:path";
 import type { CommandManagedRuntimeRunner } from "./command-managed-runtime.js";
+import { preferredShellForSandbox } from "./sandbox-shell.js";
 import type { RunProcessResult } from "./server-utils.js";
 const DEFAULT_BRIDGE_TOKEN_BYTES = 24;
@@ -133,9 +134,10 @@ async function runShell(
   cwd: string,
   script: string,
   timeoutMs: number,
+  shellCommand: "bash" | "sh" = "sh",
 ): Promise<RunProcessResult> {
   return await runner.execute({
-    command: "sh",
+    command: shellCommand,
     args: ["-lc", script],
     cwd,
     timeoutMs,
@@ -266,10 +268,12 @@ export function createCommandManagedSandboxCallbackBridgeQueueClient(input: {
   runner: CommandManagedRuntimeRunner;
   remoteCwd: string;
   timeoutMs?: number | null;
+  shellCommand?: "bash" | "sh" | null;
 }): SandboxCallbackBridgeQueueClient {
   const timeoutMs = normalizeTimeoutMs(input.timeoutMs, DEFAULT_BRIDGE_RESPONSE_TIMEOUT_MS);
+  const shellCommand = preferredShellForSandbox(input.shellCommand);
   const runChecked = async (action: string, script: string) =>
-    requireSuccessfulResult(action, await runShell(input.runner, input.remoteCwd, script, timeoutMs));
+    requireSuccessfulResult(action, await runShell(input.runner, input.remoteCwd, script, timeoutMs, shellCommand));
   return {
     makeDir: async (remotePath) => {
@@ -288,6 +292,7 @@ export function createCommandManagedSandboxCallbackBridgeQueueClient(input: {
           "fi",
         ].join("\n"),
         timeoutMs,
+        shellCommand,
       );
       requireSuccessfulResult(`list ${remotePath}`, result);
       return result.stdout
@@ -525,10 +530,12 @@ export async function startSandboxCallbackBridgeServer(input: {
   responseTimeoutMs?: number | null;
   timeoutMs?: number | null;
   nodeCommand?: string;
+  shellCommand?: "bash" | "sh" | null;
   maxQueueDepth?: number | null;
   maxBodyBytes?: number | null;
 }): Promise<StartedSandboxCallbackBridgeServer> {
   const timeoutMs = normalizeTimeoutMs(input.timeoutMs, DEFAULT_BRIDGE_RESPONSE_TIMEOUT_MS);
+  const shellCommand = preferredShellForSandbox(input.shellCommand);
   const directories = sandboxCallbackBridgeDirectories(input.queueDir);
   const remoteEntrypoint = path.posix.join(input.assetRemoteDir, SANDBOX_CALLBACK_BRIDGE_ENTRYPOINT);
   if (input.bridgeAsset) {
@@ -536,6 +543,7 @@ export async function startSandboxCallbackBridgeServer(input: {
       runner: input.runner,
       remoteCwd: input.remoteCwd,
       timeoutMs,
+      shellCommand,
     });
     await assetClient.makeDir(input.assetRemoteDir);
     const entrypointSource = await fs.readFile(input.bridgeAsset.entrypoint, "utf8");
@@ -553,7 +561,7 @@ export async function startSandboxCallbackBridgeServer(input: {
   });
   const nodeCommand = input.nodeCommand?.trim() || "node";
   const startResult = await input.runner.execute({
-    command: "sh",
+    command: shellCommand,
     args: [
       "-lc",
       [
@@ -594,6 +602,7 @@ export async function startSandboxCallbackBridgeServer(input: {
       "exit 1",
     ].join("\n"),
     timeoutMs,
+    shellCommand,
   );
   requireSuccessfulResult("wait for sandbox callback bridge readiness", readyResult);
@@ -626,7 +635,7 @@ export async function startSandboxCallbackBridgeServer(input: {
     directories,
     stop: async () => {
       const stopResult = await input.runner.execute({
-        command: "sh",
+        command: shellCommand,
         args: [
           "-lc",
           [