Stabilize runtime probes and Codex env tests (#5445)

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Adapters expose a Test action that probes the configured runtime —
install, resolvability, hello — to give operators a fast yes/no on
whether an environment is healthy
> - The Codex test path was running its hello probe directly without
going through the managed-runtime preparation that production runs use,
so a healthy production setup could still report a probe failure
> - The plugin worker manager wasn't surfacing terminated workers
cleanly, leaving the runtime probe waiting on a dead worker until the
request timed out
> - This pull request routes the Codex test probe through
`prepareAdapterExecutionTargetRuntime` (so it sees the same managed
Codex home production sees), exposes `commandCwd` on
`createCommandManagedRuntimeClient` so callers can target a per-probe
directory without leaking the workspace `remoteCwd`, and propagates
plugin-worker termination as a usable error instead of a hang
> - The benefit is the Codex Test action mirrors production behavior
end-to-end, and probes against a terminated plugin worker fail fast
instead of timing out

## What Changed

- `packages/adapter-utils/src/command-managed-runtime.ts`: rename the
`remoteCwd` knob to `commandCwd` so callers can target a per-probe
directory without inheriting the workspace cwd; matching test coverage
in `command-managed-runtime.test.ts`
- `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`:
small fixes to keep callback bridge stop semantics deterministic
- `packages/adapters/codex-local/src/server/test.ts`: thread the Codex
hello probe through `prepareAdapterExecutionTargetRuntime` +
`prepareManagedCodexHome` so the probe sees the same managed home
production sees; new `test.remote.test.ts` covers the remote probe path
- `packages/adapters/cursor-local/src/server/execute.ts`: small
probe-side cleanup that aligns with the new commandCwd contract
- `server/src/services/plugin-worker-manager.ts`: surface plugin-worker
termination as a structured error so callers fail fast; new
`plugin-worker-terminated.cjs` fixture and
`plugin-worker-manager.test.ts` cases pin the behavior

## Verification

- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils
--project @paperclipai/adapter-codex-local --project
@paperclipai/adapter-cursor-local --project @paperclipai/server` —
1749/1750 passing (1 unrelated skip)
- `pnpm typecheck` clean

## Risks

Low–medium. The `remoteCwd → commandCwd` rename is a parameter renaming
on an internal helper used only by adapter test/execute paths in this
repo. The plugin-worker-terminated path was previously a hang; failing
fast may surface latent timeouts as explicit termination errors in
callers that already expected them.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new tests cover
commandCwd, plugin-worker termination, and Codex remote test path
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---

> **Stacked PR.** Sits on top of #5444 which adds the per-run runtime
API surface this PR builds on. Cumulative diff against `master` includes
that PR's content; the files touched by *this* PR's commit are listed
under "What Changed" above. Will rebase onto `master` and force-push
once #5444 merges.
This commit is contained in:
Devin Foley 2026-05-07 14:52:31 -07:00 committed by GitHub
parent 12cb7b40fd
commit fe3904f434
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
12 changed files with 639 additions and 90 deletions

View file

@ -57,7 +57,7 @@ function requireSuccessfulResult(result: RunProcessResult, action: string): void
export function createCommandManagedRuntimeClient(input: {
runner: CommandManagedRuntimeRunner;
remoteCwd: string;
commandCwd: string;
timeoutMs: number;
shellCommand?: "bash" | "sh" | null;
}): SandboxManagedRuntimeClient {
@ -66,7 +66,7 @@ export function createCommandManagedRuntimeClient(input: {
const result = await input.runner.execute({
command: shellCommand,
args: ["-lc", script],
cwd: input.remoteCwd,
cwd: input.commandCwd,
stdin: opts.stdin,
timeoutMs: opts.timeoutMs ?? input.timeoutMs,
});
@ -117,7 +117,7 @@ export function createCommandManagedRuntimeClient(input: {
const result = await input.runner.execute({
command: shellCommand,
args: ["-lc", `rm -rf ${shellQuote(remotePath)}`],
cwd: input.remoteCwd,
cwd: input.commandCwd,
timeoutMs: input.timeoutMs,
});
requireSuccessfulResult(result, `remove ${remotePath}`);
@ -126,7 +126,7 @@ export function createCommandManagedRuntimeClient(input: {
const result = await input.runner.execute({
command: shellCommand,
args: ["-lc", command],
cwd: input.remoteCwd,
cwd: input.commandCwd,
timeoutMs: options.timeoutMs,
});
requireSuccessfulResult(result, command);
@ -149,6 +149,7 @@ export async function prepareCommandManagedRuntime(input: {
}): Promise<PreparedSandboxManagedRuntime> {
const timeoutMs = input.spec.timeoutMs && input.spec.timeoutMs > 0 ? input.spec.timeoutMs : 300_000;
const workspaceRemoteDir = input.workspaceRemoteDir ?? input.spec.remoteCwd;
const commandCwd = input.spec.remoteCwd;
const runtimeSpec: SandboxRemoteExecutionSpec = {
transport: "sandbox",
provider: input.spec.providerKey ?? "sandbox",
@ -159,7 +160,7 @@ export async function prepareCommandManagedRuntime(input: {
};
const client = createCommandManagedRuntimeClient({
runner: input.runner,
remoteCwd: workspaceRemoteDir,
commandCwd,
timeoutMs,
shellCommand: input.spec.shellCommand,
});
@ -176,7 +177,7 @@ export async function prepareCommandManagedRuntime(input: {
const probe = await input.runner.execute({
command: shellCommand,
args: ["-lc", `command -v ${shellQuote(detectCommand)} >/dev/null 2>&1`],
cwd: workspaceRemoteDir,
cwd: commandCwd,
timeoutMs,
});
if (!probe.timedOut && (probe.exitCode ?? 1) === 0) {
@ -195,7 +196,7 @@ export async function prepareCommandManagedRuntime(input: {
const result = await input.runner.execute({
command: shellCommand,
args: ["-lc", installCommand],
cwd: workspaceRemoteDir,
cwd: commandCwd,
timeoutMs,
});
// A failed install is not always fatal: the CLI may already be on PATH