Stabilize runtime probes and Codex env tests (#5445)

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Adapters expose a Test action that probes the configured runtime (install, resolvability, hello) to give operators a fast yes/no on whether an environment is healthy
> - The Codex test path was running its hello probe directly without going through the managed-runtime preparation that production runs use, so a healthy production setup could still report a probe failure
> - The plugin worker manager wasn't surfacing terminated workers cleanly, leaving the runtime probe waiting on a dead worker until the request timed out
> - This pull request routes the Codex test probe through `prepareAdapterExecutionTargetRuntime` (so it sees the same managed Codex home production sees), exposes `commandCwd` on `createCommandManagedRuntimeClient` so callers can target a per-probe directory without leaking the workspace `remoteCwd`, and propagates plugin-worker termination as a usable error instead of a hang
> - The benefit is the Codex Test action mirrors production behavior end-to-end, and probes against a terminated plugin worker fail fast instead of timing out

## What Changed

- `packages/adapter-utils/src/command-managed-runtime.ts`: rename the `remoteCwd` knob to `commandCwd` so callers can target a per-probe directory without inheriting the workspace cwd; matching test coverage in `command-managed-runtime.test.ts`
- `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`: small fixes to keep callback bridge stop semantics deterministic
- `packages/adapters/codex-local/src/server/test.ts`: thread the Codex hello probe through `prepareAdapterExecutionTargetRuntime` + `prepareManagedCodexHome` so the probe sees the same managed home production sees; new `test.remote.test.ts` covers the remote probe path
- `packages/adapters/cursor-local/src/server/execute.ts`: small probe-side cleanup that aligns with the new commandCwd contract
- `server/src/services/plugin-worker-manager.ts`: surface plugin-worker termination as a structured error so callers fail fast; new `plugin-worker-terminated.cjs` fixture and `plugin-worker-manager.test.ts` cases pin the behavior

## Verification

- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils --project @paperclipai/adapter-codex-local --project @paperclipai/adapter-cursor-local --project @paperclipai/server`: 1749/1750 passing (1 unrelated skip)
- `pnpm typecheck` clean

## Risks

Low–medium. The `remoteCwd → commandCwd` rename is a parameter renaming on an internal helper used only by adapter test/execute paths in this repo. The plugin-worker-terminated path was previously a hang; failing fast may surface latent timeouts as explicit termination errors in callers that already expected them.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable (new tests cover commandCwd, plugin-worker termination, and Codex remote test path)
- [x] If this change affects the UI, I have included before/after screenshots: N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge

---

> **Stacked PR.** Sits on top of #5444 which adds the per-run runtime API surface this PR builds on. Cumulative diff against `master` includes that PR's content; the files touched by *this* PR's commit are listed under "What Changed" above. Will rebase onto `master` and force-push once #5444 merges.
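
The fail-fast behavior described for the plugin worker manager can be pictured with a minimal TypeScript sketch. The names `WorkerTerminatedError`, `PendingCalls`, and `failAll` are hypothetical and are not taken from `plugin-worker-manager.ts`; the sketch only assumes that in-flight requests are tracked in a map and rejected with a structured error as soon as the worker exits.

```typescript
// Hypothetical sketch: these names are illustrative, not the real plugin-worker-manager API.
class WorkerTerminatedError extends Error {
  readonly code = "WORKER_TERMINATED";
  readonly exitCode: number | null;
  readonly signal: string | null;

  constructor(exitCode: number | null, signal: string | null) {
    super(`plugin worker terminated (exitCode=${exitCode}, signal=${signal})`);
    this.name = "WorkerTerminatedError";
    this.exitCode = exitCode;
    this.signal = signal;
  }
}

type PendingEntry = { resolve: (value: unknown) => void; reject: (error: Error) => void };

class PendingCalls {
  private readonly pending = new Map<string, PendingEntry>();

  // Track an in-flight request; the promise settles on a reply or on termination.
  register(id: string): Promise<unknown> {
    return new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
    });
  }

  // Called when the worker replies. Returns false for unknown or already-failed ids.
  complete(id: string, value: unknown): boolean {
    const entry = this.pending.get(id);
    if (!entry) return false;
    this.pending.delete(id);
    entry.resolve(value);
    return true;
  }

  // Called from the worker's exit handler: reject everything immediately
  // instead of letting each request run into its own timeout.
  failAll(error: Error): number {
    const entries = [...this.pending.values()];
    this.pending.clear();
    for (const entry of entries) entry.reject(error);
    return entries.length;
  }

  get size(): number {
    return this.pending.size;
  }
}
```

A caller that awaits `register(...)` would then see a `WorkerTerminatedError` with a stable `code` rather than a generic timeout, which is the distinction the Risks section calls out.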
2026-06-14 01:50:39 +09:00 · 2026-05-07 14:52:31 -07:00 · 2026-05-07 14:52:31 -07:00 · fe3904f434
commit fe3904f434
parent 12cb7b40fd
12 changed files with 639 additions and 90 deletions
--- a/packages/adapter-utils/src/command-managed-runtime.test.ts
+++ b/packages/adapter-utils/src/command-managed-runtime.test.ts
@@ -131,4 +131,90 @@ describe("command managed runtime", () => {
      .toMatchObject({ code: "ENOENT" });
    expect(calls.every((call) => call.stdin == null)).toBe(true);
  });
  it("runs setup commands from the existing sandbox cwd when staging into a nested remote workspace dir", async () => {
    const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-command-runtime-nested-"));
    cleanupDirs.push(rootDir);
    const localWorkspaceDir = path.join(rootDir, "local-workspace");
    const remoteBaseDir = path.join(rootDir, "remote-base");
    const remoteWorkspaceDir = path.join(remoteBaseDir, ".paperclip-runtime", "runs", "test", "workspace");
    await mkdir(localWorkspaceDir, { recursive: true });
    await mkdir(remoteBaseDir, { recursive: true });
    await writeFile(path.join(localWorkspaceDir, "README.md"), "local workspace\n", "utf8");
    const calls: Array<{
      command: string;
      args?: string[];
      cwd?: string;
      env?: Record<string, string>;
      stdin?: string;
      timeoutMs?: number;
    }> = [];
    const runner = {
      execute: async (input: {
        command: string;
        args?: string[];
        cwd?: string;
        env?: Record<string, string>;
        stdin?: string;
        timeoutMs?: number;
      }): Promise<RunProcessResult> => {
        calls.push({ ...input });
        const startedAt = new Date().toISOString();
        try {
          const result = await execFile(input.command === "sh" ? "/bin/sh" : input.command, input.args ?? [], {
            cwd: input.cwd,
            env: {
              ...process.env,
              ...input.env,
            },
            maxBuffer: 32 * 1024 * 1024,
            timeout: input.timeoutMs,
          });
          return {
            exitCode: 0,
            signal: null,
            timedOut: false,
            stdout: result.stdout,
            stderr: result.stderr,
            pid: null,
            startedAt,
          };
        } catch (error) {
          const err = error as NodeJS.ErrnoException & {
            stdout?: string;
            stderr?: string;
            code?: string | number | null;
            signal?: NodeJS.Signals | null;
            killed?: boolean;
          };
          return {
            exitCode: typeof err.code === "number" ? err.code : null,
            signal: err.signal ?? null,
            timedOut: Boolean(err.killed && input.timeoutMs),
            stdout: err.stdout ?? "",
            stderr: err.stderr ?? "",
            pid: null,
            startedAt,
          };
        }
      },
    };
    await prepareCommandManagedRuntime({
      runner,
      spec: {
        remoteCwd: remoteBaseDir,
        timeoutMs: 30_000,
      },
      adapterKey: "codex",
      workspaceLocalDir: localWorkspaceDir,
      workspaceRemoteDir: remoteWorkspaceDir,
    });
    expect(calls.length).toBeGreaterThan(0);
    expect(calls.every((call) => call.cwd === remoteBaseDir)).toBe(true);
    await expect(readFile(path.join(remoteWorkspaceDir, "README.md"), "utf8")).resolves.toBe("local workspace\n");
  });
 });
--- a/packages/adapter-utils/src/command-managed-runtime.ts
+++ b/packages/adapter-utils/src/command-managed-runtime.ts
@@ -57,7 +57,7 @@ function requireSuccessfulResult(result: RunProcessResult, action: string): void
 export function createCommandManagedRuntimeClient(input: {
  runner: CommandManagedRuntimeRunner;
-  remoteCwd: string;
+  commandCwd: string;
  timeoutMs: number;
  shellCommand?: "bash" | "sh" | null;
 }): SandboxManagedRuntimeClient {
@@ -66,7 +66,7 @@ export function createCommandManagedRuntimeClient(input: {
    const result = await input.runner.execute({
      command: shellCommand,
      args: ["-lc", script],
-      cwd: input.remoteCwd,
+      cwd: input.commandCwd,
      stdin: opts.stdin,
      timeoutMs: opts.timeoutMs ?? input.timeoutMs,
    });
@@ -117,7 +117,7 @@ export function createCommandManagedRuntimeClient(input: {
      const result = await input.runner.execute({
        command: shellCommand,
        args: ["-lc", `rm -rf ${shellQuote(remotePath)}`],
-        cwd: input.remoteCwd,
+        cwd: input.commandCwd,
        timeoutMs: input.timeoutMs,
      });
      requireSuccessfulResult(result, `remove ${remotePath}`);
@@ -126,7 +126,7 @@ export function createCommandManagedRuntimeClient(input: {
      const result = await input.runner.execute({
        command: shellCommand,
        args: ["-lc", command],
-        cwd: input.remoteCwd,
+        cwd: input.commandCwd,
        timeoutMs: options.timeoutMs,
      });
      requireSuccessfulResult(result, command);
@@ -149,6 +149,7 @@ export async function prepareCommandManagedRuntime(input: {
 }): Promise<PreparedSandboxManagedRuntime> {
  const timeoutMs = input.spec.timeoutMs && input.spec.timeoutMs > 0 ? input.spec.timeoutMs : 300_000;
  const workspaceRemoteDir = input.workspaceRemoteDir ?? input.spec.remoteCwd;
+  const commandCwd = input.spec.remoteCwd;
  const runtimeSpec: SandboxRemoteExecutionSpec = {
    transport: "sandbox",
    provider: input.spec.providerKey ?? "sandbox",
@@ -159,7 +160,7 @@ export async function prepareCommandManagedRuntime(input: {
  };
  const client = createCommandManagedRuntimeClient({
    runner: input.runner,
-    remoteCwd: workspaceRemoteDir,
+    commandCwd,
    timeoutMs,
    shellCommand: input.spec.shellCommand,
  });
@@ -176,7 +177,7 @@ export async function prepareCommandManagedRuntime(input: {
      const probe = await input.runner.execute({
        command: shellCommand,
        args: ["-lc", `command -v ${shellQuote(detectCommand)} >/dev/null 2>&1`],
-        cwd: workspaceRemoteDir,
+        cwd: commandCwd,
        timeoutMs,
      });
      if (!probe.timedOut && (probe.exitCode ?? 1) === 0) {
@@ -195,7 +196,7 @@ export async function prepareCommandManagedRuntime(input: {
    const result = await input.runner.execute({
      command: shellCommand,
      args: ["-lc", installCommand],
-      cwd: workspaceRemoteDir,
+      cwd: commandCwd,
      timeoutMs,
    });
    // A failed install is not always fatal: the CLI may already be on PATH
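
The split between the directory commands run from and the directory the workspace is staged into can be sketched as follows. This is a simplified illustration, not the helper's real code: `resolveRuntimeDirs`, `buildShellInvocation`, and this `shellQuote` are hypothetical stand-ins, and only the two assignments in `resolveRuntimeDirs` mirror lines visible in the hunks above.

```typescript
// Hypothetical, simplified stand-ins for the shapes used in command-managed-runtime.ts.
interface CommandInvocation {
  command: string;
  args: string[];
  cwd: string;
  timeoutMs: number;
}

interface RuntimeDirs {
  commandCwd: string; // directory that already exists in the sandbox
  workspaceRemoteDir: string; // staging target; may not exist yet
}

// Mirrors the two assignments in prepareCommandManagedRuntime: commands always run
// from spec.remoteCwd, while the workspace dir may point at a nested per-run path.
function resolveRuntimeDirs(spec: { remoteCwd: string }, workspaceRemoteDir?: string): RuntimeDirs {
  return {
    commandCwd: spec.remoteCwd,
    workspaceRemoteDir: workspaceRemoteDir ?? spec.remoteCwd,
  };
}

// Minimal POSIX single-quote escaping (illustrative only).
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Every shell invocation uses commandCwd as cwd, never the (possibly missing) workspace dir.
function buildShellInvocation(dirs: RuntimeDirs, script: string, timeoutMs: number): CommandInvocation {
  return { command: "sh", args: ["-lc", script], cwd: dirs.commandCwd, timeoutMs };
}

const dirs = resolveRuntimeDirs(
  { remoteCwd: "/remote/base" },
  "/remote/base/.paperclip-runtime/runs/test/workspace",
);
const mkdirInvocation = buildShellInvocation(dirs, `mkdir -p ${shellQuote(dirs.workspaceRemoteDir)}`, 30_000);
```

Under this reading, a setup command such as `mkdir -p` for the nested workspace can succeed because its `cwd` is the base directory that already exists, which is what the new "nested remote workspace dir" test asserts with `call.cwd === remoteBaseDir`.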
--- a/packages/adapter-utils/src/sandbox-callback-bridge.test.ts
+++ b/packages/adapter-utils/src/sandbox-callback-bridge.test.ts
@@ -422,6 +422,53 @@ describe("sandbox callback bridge", () => {
    );
  });
  it("handles SSH queue polling failures without emitting an unhandled rejection", async () => {
    const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-ssh-failure-"));
    cleanupDirs.push(rootDir);
    const queueDir = path.posix.join(rootDir, "queue");
    const unhandled: unknown[] = [];
    const onUnhandledRejection = (reason: unknown) => {
      unhandled.push(reason);
    };
    process.on("unhandledRejection", onUnhandledRejection);
    try {
      const worker = await startSandboxCallbackBridgeWorker({
        client: {
          makeDir: async () => {},
          listJsonFiles: async () => {
            throw new Error(
              "list /remote/.paperclip-runtime/gemini/paperclip-bridge/queue/requests failed with exit code 255: kex_exchange_identification: read: Connection reset by peer",
            );
          },
          readTextFile: async () => {
            throw new Error("unexpected readTextFile");
          },
          writeTextFile: async () => {
            throw new Error("unexpected writeTextFile");
          },
          rename: async () => {
            throw new Error("unexpected rename");
          },
          remove: async () => {},
        },
        queueDir,
        authorizeRequest: async () => null,
        handleRequest: async () => ({
          status: 200,
          body: "ok",
        }),
      });
      await new Promise((resolve) => setTimeout(resolve, 50));
      await worker.stop();
      expect(unhandled).toEqual([]);
    } finally {
      process.off("unhandledRejection", onUnhandledRejection);
    }
  });
  it("serializes remote response writes so stop does not recreate a late orphaned response", async () => {
    const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-response-lock-"));
    cleanupDirs.push(rootDir);
--- a/packages/adapter-utils/src/sandbox-callback-bridge.ts
+++ b/packages/adapter-utils/src/sandbox-callback-bridge.ts
@@ -610,6 +610,8 @@ export async function startSandboxCallbackBridgeWorker(input: {
  });
  const authorizeRequest = input.authorizeRequest ??
    ((request: SandboxCallbackBridgeRequest) => authorizeSandboxCallbackBridgeRequestWithRoutes(request));
  const buildWorkerFailureMessage = (error: unknown) =>
    `Sandbox callback bridge worker failed: ${error instanceof Error ? error.message : String(error)}`;
  const processRequestFile = async (fileName: string) => {
    const requestPath = path.posix.join(directories.requestsDir, fileName);
@@ -725,6 +727,16 @@ export async function startSandboxCallbackBridgeWorker(input: {
          break;
        }
      }
    } catch (error) {
      const message = buildWorkerFailureMessage(error);
      console.warn(`[paperclip] ${message}`);
      try {
        await failPendingRequests(message);
      } catch (failPendingError) {
        console.warn(
          `[paperclip] sandbox callback bridge failed to abort queued requests after worker failure: ${failPendingError instanceof Error ? failPendingError.message : String(failPendingError)}`,
        );
      }
    } finally {
      settled = true;
      if (settleResolve) {
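
The added `catch` branch follows a common pattern for background loops: convert a polling failure into a warning plus an abort of queued work, so the loop's promise never rejects unobserved. A minimal sketch of that pattern, assuming hypothetical `pollOnce`, `failPending`, and `warn` callbacks in place of the bridge internals:

```typescript
// Hypothetical sketch; `pollOnce`, `failPending`, and `warn` stand in for bridge internals.
async function runPollLoop(input: {
  pollOnce: () => Promise<boolean>; // resolves false when the loop should stop
  failPending: (message: string) => Promise<void>;
  warn: (message: string) => void;
}): Promise<void> {
  try {
    while (await input.pollOnce()) {
      // keep polling until told to stop
    }
  } catch (error) {
    const message = `worker failed: ${error instanceof Error ? error.message : String(error)}`;
    input.warn(message);
    try {
      // Abort queued requests so callers are not left waiting on a dead loop.
      await input.failPending(message);
    } catch (failPendingError) {
      // A failure while aborting is logged too, never rethrown.
      input.warn(`failed to abort queued requests: ${String(failPendingError)}`);
    }
  }
}
```

Because every error path ends in `warn`, the promise returned by `runPollLoop` always resolves, which is the property the new SSH polling test checks by asserting that no `unhandledRejection` event fired.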
--- a/packages/adapter-utils/src/server-utils.test.ts
+++ b/packages/adapter-utils/src/server-utils.test.ts
@@ -848,6 +848,26 @@ describe("rewriteWorkspaceCwdEnvVarsForExecution", () => {
      RANDOM_WORKSPACE_CWD_TOKEN: "/host/workspace",
    });
  });
  it("only rewrites matching *_WORKSPACE_CWD string values", () => {
    const env = rewriteWorkspaceCwdEnvVarsForExecution({
      workspaceCwd: "/host/workspace",
      executionCwd: "/remote/workspace",
      executionTargetIsRemote: true,
      env: {
        MATCHING_WORKSPACE_CWD: "/host/workspace/.",
        DIFFERENT_WORKSPACE_CWD: "/host/other-workspace",
        BLANK_WORKSPACE_CWD: "   ",
        NON_STRING_WORKSPACE_CWD: 42,
      },
    });
    expect(env).toEqual({
      MATCHING_WORKSPACE_CWD: "/remote/workspace",
      DIFFERENT_WORKSPACE_CWD: "/host/other-workspace",
      BLANK_WORKSPACE_CWD: "   ",
    });
  });
 });
 describe("refreshPaperclipWorkspaceEnvForExecution", () => {
--- a/packages/adapter-utils/src/server-utils.ts
+++ b/packages/adapter-utils/src/server-utils.ts
@@ -1012,8 +1012,13 @@ export function rewriteWorkspaceCwdEnvVarsForExecution(input: {
  const localWorkspaceCwd = typeof input.workspaceCwd === "string" && input.workspaceCwd.trim().length > 0
    ? path.resolve(input.workspaceCwd)
    : null;
  // executionCwd is a remote path on the target host; we deliberately do not
  // run `path.resolve` against it because that applies host-Node semantics
  // (current working directory, host path separator) to a path that lives on
  // the remote shell. Callers always pass absolute remote paths, so we
  // forward the trimmed value verbatim.
  const remoteWorkspaceCwd = typeof input.executionCwd === "string" && input.executionCwd.trim().length > 0
-    ? path.resolve(input.executionCwd)
+    ? input.executionCwd.trim()
    : null;
  if (!input.executionTargetIsRemote || !localWorkspaceCwd || !remoteWorkspaceCwd) {
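
The comment in this hunk explains why `path.resolve` is no longer applied to the remote path. A small sketch makes the hazard concrete; `normalizeRemoteCwd` is a hypothetical helper name, and `path.win32.resolve` is used here only to simulate what a Windows host would do to a POSIX remote path.

```typescript
import * as path from "node:path";

// Hypothetical helper mirroring the trimmed-verbatim handling of executionCwd.
function normalizeRemoteCwd(value: unknown): string | null {
  return typeof value === "string" && value.trim().length > 0 ? value.trim() : null;
}

// Simulates host-side resolution on Windows: separators are rewritten to backslashes
// (and a drive letter may be prepended), so the result no longer names the remote path.
const resolvedOnWindowsHost = path.win32.resolve("/remote/workspace");

// Forwarding the trimmed value keeps the remote path exactly as the caller wrote it.
const forwarded = normalizeRemoteCwd("  /remote/workspace ");

console.log({ resolvedOnWindowsHost, forwarded });
```

The trade-off is that the helper now relies on callers passing absolute remote paths, as the comment states, since nothing on the host side will absolutize them.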
--- a/packages/adapters/codex-local/src/server/test.remote.test.ts
+++ b/packages/adapters/codex-local/src/server/test.remote.test.ts
@@ -0,0 +1,152 @@
 import fs from "node:fs/promises";
 import os from "node:os";
 import { afterEach, describe, expect, it, vi } from "vitest";
 import type { AdapterExecutionTarget } from "@paperclipai/adapter-utils/execution-target";
 const {
  ensureAdapterExecutionTargetDirectory,
  ensureAdapterExecutionTargetCommandResolvable,
  maybeRunSandboxInstallCommand,
  runAdapterExecutionTargetProcess,
  describeAdapterExecutionTarget,
  resolveAdapterExecutionTargetCwd,
  prepareAdapterExecutionTargetRuntime,
  prepareManagedCodexHome,
  restoreWorkspace,
 } = vi.hoisted(() => {
  const restoreWorkspace = vi.fn(async () => {});
  return {
    ensureAdapterExecutionTargetDirectory: vi.fn(async () => {}),
    ensureAdapterExecutionTargetCommandResolvable: vi.fn(async () => {}),
    maybeRunSandboxInstallCommand: vi.fn(async () => null),
    runAdapterExecutionTargetProcess: vi.fn(async () => ({
      exitCode: 0,
      signal: null,
      timedOut: false,
      stdout: [
        "{\"type\":\"thread.started\",\"thread_id\":\"thread-1\"}",
        "{\"type\":\"item.completed\",\"item\":{\"type\":\"agent_message\",\"text\":\"hello\"}}",
        "{\"type\":\"turn.completed\",\"usage\":{\"input_tokens\":1,\"cached_input_tokens\":0,\"output_tokens\":1}}",
      ].join("\n"),
      stderr: "",
      pid: 123,
      startedAt: new Date().toISOString(),
    })),
    describeAdapterExecutionTarget: vi.fn(() => "QA SSH"),
    resolveAdapterExecutionTargetCwd: vi.fn((target, configuredCwd, fallbackCwd) => {
      if (typeof configuredCwd === "string" && configuredCwd.trim().length > 0) return configuredCwd;
      if (target && typeof target === "object" && "remoteCwd" in target && typeof target.remoteCwd === "string") {
        return target.remoteCwd;
      }
      return fallbackCwd;
    }),
    prepareAdapterExecutionTargetRuntime: vi.fn(async () => ({
      target: null,
      workspaceRemoteDir: "/remote/workspace/.paperclip-runtime/runs/test/workspace",
      runtimeRootDir: "/remote/workspace/.paperclip-runtime/runs/test/workspace/.paperclip-runtime/codex",
      assetDirs: {
        home: "/remote/workspace/.paperclip-runtime/runs/test/workspace/.paperclip-runtime/codex/home",
      },
      restoreWorkspace,
    })),
    prepareManagedCodexHome: vi.fn(async () => "/tmp/paperclip-managed-codex-home"),
    restoreWorkspace,
  };
 });
 vi.mock("@paperclipai/adapter-utils/execution-target", async () => {
  const actual = await vi.importActual<typeof import("@paperclipai/adapter-utils/execution-target")>(
    "@paperclipai/adapter-utils/execution-target",
  );
  return {
    ...actual,
    ensureAdapterExecutionTargetDirectory,
    ensureAdapterExecutionTargetCommandResolvable,
    maybeRunSandboxInstallCommand,
    runAdapterExecutionTargetProcess,
    describeAdapterExecutionTarget,
    resolveAdapterExecutionTargetCwd,
    prepareAdapterExecutionTargetRuntime,
  };
 });
 vi.mock("./codex-home.js", async () => {
  const actual = await vi.importActual<typeof import("./codex-home.js")>("./codex-home.js");
  return {
    ...actual,
    prepareManagedCodexHome,
  };
 });
 import { testEnvironment } from "./test.js";
 describe("codex remote environment diagnostics", () => {
  afterEach(() => {
    vi.clearAllMocks();
  });
  it("stages managed CODEX_HOME in an isolated runtime dir and keeps the probe cwd on the original remote workspace", async () => {
    const remoteTarget: AdapterExecutionTarget = {
      kind: "remote",
      transport: "ssh",
      remoteCwd: "/remote/workspace",
      spec: {
        host: "127.0.0.1",
        port: 22,
        username: "agent",
        privateKey: "PRIVATE KEY",
        knownHosts: "KNOWN HOSTS",
        remoteCwd: "/remote/workspace",
        remoteWorkspacePath: "/remote/workspace",
        strictHostKeyChecking: false,
      },
    };
    const result = await testEnvironment({
      companyId: "company-1",
      adapterType: "codex_local",
      config: {
        command: "codex",
      },
      executionTarget: remoteTarget,
      environmentName: "QA SSH",
    });
    expect(result.status).toBe("pass");
    expect(result.checks.some((check) => check.code === "codex_hello_probe_passed")).toBe(true);
    expect(prepareManagedCodexHome).toHaveBeenCalledTimes(1);
    expect(prepareAdapterExecutionTargetRuntime).toHaveBeenCalledTimes(1);
    const runtimeCalls = prepareAdapterExecutionTargetRuntime.mock.calls as unknown as Array<[
      {
        workspaceLocalDir: string;
        target?: { remoteCwd?: string };
        workspaceRemoteDir?: string;
      },
    ]>;
    const runtimeInput = runtimeCalls[0]?.[0];
    expect(runtimeInput?.workspaceLocalDir).toContain(`${os.tmpdir()}/paperclip-codex-envtest-`);
    expect(runtimeInput?.workspaceLocalDir).not.toBe("/remote/workspace");
    expect(await fs.stat(runtimeInput!.workspaceLocalDir).catch(() => null)).toBeNull();
    expect(runtimeInput?.target?.remoteCwd).toBe("/remote/workspace");
    // `workspaceRemoteDir` is the base path passed to the runtime; the
    // helper's per-run subdirectory is appended internally inside
    // `prepareRemoteManagedRuntime`. Pre-building a per-run prefix here
    // would double-nest the run id in the final path.
    expect(runtimeInput?.workspaceRemoteDir).toBe("/remote/workspace");
    expect(runAdapterExecutionTargetProcess).toHaveBeenCalledTimes(1);
    const probeCall = runAdapterExecutionTargetProcess.mock.calls[0] as unknown as
      | [string, { kind: string; remoteCwd: string }, string, string[], { cwd: string; env: Record<string, string> }]
      | undefined;
    expect(probeCall?.[1]).toMatchObject({
      kind: "remote",
      remoteCwd: "/remote/workspace",
    });
    expect(probeCall?.[4]).toMatchObject({
      cwd: "/remote/workspace",
      env: expect.objectContaining({
        CODEX_HOME: "/remote/workspace/.paperclip-runtime/runs/test/workspace/.paperclip-runtime/codex/home",
      }),
    });
    expect(restoreWorkspace).toHaveBeenCalledTimes(1);
  });
 });
--- a/packages/adapters/codex-local/src/server/test.ts
+++ b/packages/adapters/codex-local/src/server/test.ts
@@ -15,13 +15,16 @@ import {
  runAdapterExecutionTargetProcess,
  describeAdapterExecutionTarget,
  resolveAdapterExecutionTargetCwd,
  prepareAdapterExecutionTargetRuntime,
 } from "@paperclipai/adapter-utils/execution-target";
 import fs from "node:fs/promises";
 import path from "node:path";
 import os from "node:os";
 import { parseCodexJsonl } from "./parse.js";
 import { SANDBOX_INSTALL_COMMAND } from "../index.js";
 import { codexHomeDir, readCodexAuthInfo } from "./quota.js";
 import { buildCodexExecArgs } from "./codex-args.js";
 import { prepareManagedCodexHome } from "./codex-home.js";
 function summarizeStatus(checks: AdapterEnvironmentCheck[]): AdapterEnvironmentTestResult["status"] {
  if (checks.some((check) => check.level === "error")) return "fail";
@@ -58,6 +61,99 @@ function summarizeProbeDetail(stdout: string, stderr: string, parsedError: strin
 const CODEX_AUTH_REQUIRED_RE =
  /(?:not\s+logged\s+in|login\s+required|authentication\s+required|unauthorized|invalid(?:\s+or\s+missing)?\s+api(?:[_\s-]?key)?|openai[_\s-]?api[_\s-]?key|api[_\s-]?key.*required|please\s+run\s+`?codex\s+login`?)/i;
 async function prepareCodexHelloProbe(input: {
  runId: string;
  companyId: string;
  target: AdapterEnvironmentTestContext["executionTarget"] | null;
  targetIsRemote: boolean;
  cwd: string;
  command: string;
  args: string[];
  env: Record<string, string>;
  probeApiKey: string | null;
 }): Promise<{
  command: string;
  args: string[];
  env: Record<string, string>;
  cleanup: () => Promise<void>;
 }> {
  let preparedRuntime: Awaited<ReturnType<typeof prepareAdapterExecutionTargetRuntime>> | null = null;
  let preparedRuntimeWorkspaceLocalDir: string | null = null;
  const cleanup = async () => {
    await preparedRuntime?.restoreWorkspace().catch(() => {});
    if (preparedRuntimeWorkspaceLocalDir) {
      await fs.rm(preparedRuntimeWorkspaceLocalDir, { recursive: true, force: true }).catch(() => {});
    }
  };
  if (input.targetIsRemote && !input.probeApiKey) {
    const managedHome = await prepareManagedCodexHome(process.env, async () => {}, input.companyId, {
      apiKey: null,
    });
    preparedRuntimeWorkspaceLocalDir = await fs.mkdtemp(
      path.join(os.tmpdir(), `paperclip-codex-envtest-${input.runId}-`),
    );
    preparedRuntime = await prepareAdapterExecutionTargetRuntime({
      runId: input.runId,
      target: input.target,
      adapterKey: "codex",
      workspaceLocalDir: preparedRuntimeWorkspaceLocalDir,
      // Pass `input.cwd` as the base (not a pre-built per-run subdir).
      // `prepareRemoteManagedRuntime` itself appends
      // `.paperclip-runtime/runs/<runId>/workspace` to whatever it gets, so
      // pre-building a per-run path here would double-nest the run ID.
      workspaceRemoteDir: input.cwd,
      installCommand: SANDBOX_INSTALL_COMMAND,
      detectCommand: input.command,
      assets: [
        {
          key: "home",
          localDir: managedHome,
          followSymlinks: true,
        },
      ],
    });
    return {
      command: input.command,
      args: input.args,
      env: preparedRuntime.assetDirs.home
        ? { ...input.env, CODEX_HOME: preparedRuntime.assetDirs.home }
        : { ...input.env },
      cleanup,
    };
  }
  if (input.probeApiKey) {
    const probeHome = input.targetIsRemote
      ? `/tmp/paperclip-codex-probe-${input.runId}`
      : path.join(os.tmpdir(), `paperclip-codex-probe-${input.runId}`);
    return {
      command: "sh",
      args: [
        "-c",
        'set -e; mkdir -p "$CODEX_HOME"; umask 077; printf "%s" "$_PAPERCLIP_CODEX_AUTH_JSON" > "$CODEX_HOME/auth.json"; unset _PAPERCLIP_CODEX_AUTH_JSON; trap \'rm -rf "$CODEX_HOME"\' EXIT INT TERM; "$0" "$@"',
        input.command,
        ...input.args,
      ],
      env: {
        ...input.env,
        CODEX_HOME: probeHome,
        _PAPERCLIP_CODEX_AUTH_JSON: JSON.stringify({ OPENAI_API_KEY: input.probeApiKey }),
      },
      cleanup,
    };
  }
  return {
    command: input.command,
    args: input.args,
    env: { ...input.env },
    cleanup,
  };
 }
 export async function testEnvironment(
  ctx: AdapterEnvironmentTestContext,
 ): Promise<AdapterEnvironmentTestResult> {
@@ -196,86 +292,80 @@ export async function testEnvironment(
        : isNonEmpty(hostOpenAiKey)
          ? hostOpenAiKey
          : null;
-      let probeCommand = command;
-      let probeArgs = args;
-      const probeEnv: Record<string, string> = { ...env };
-      if (probeApiKey) {
-        const probeHome = targetIsRemote
-          ? `/tmp/paperclip-codex-probe-${runId}`
-          : path.join(os.tmpdir(), `paperclip-codex-probe-${runId}`);
-        probeEnv.CODEX_HOME = probeHome;
-        probeEnv._PAPERCLIP_CODEX_AUTH_JSON = JSON.stringify({ OPENAI_API_KEY: probeApiKey });
-        probeCommand = "sh";
-        // Trap on EXIT removes the probe home (with the API-key auth.json) on
-        // any exit path; we drop `exec` so the wrapper shell stays alive long
-        // enough for the trap to fire after the child returns.
-        probeArgs = [
-          "-c",
-          'set -e; mkdir -p "$CODEX_HOME"; umask 077; printf "%s" "$_PAPERCLIP_CODEX_AUTH_JSON" > "$CODEX_HOME/auth.json"; unset _PAPERCLIP_CODEX_AUTH_JSON; trap \'rm -rf "$CODEX_HOME"\' EXIT INT TERM; "$0" "$@"',
-          command,
-          ...args,
-        ];
-      }
-      const probe = await runAdapterExecutionTargetProcess(
+      const preparedProbe = await prepareCodexHelloProbe({
         runId,
+        companyId: ctx.companyId,
         target,
-        probeCommand,
-        probeArgs,
-        {
-          cwd,
-          env: probeEnv,
-          timeoutSec: 45,
-          graceSec: 5,
-          stdin: "Respond with hello.",
-          onLog: async () => {},
-        },
-      );
-      const parsed = parseCodexJsonl(probe.stdout);
-      const detail = summarizeProbeDetail(probe.stdout, probe.stderr, parsed.errorMessage);
-      const authEvidence = `${parsed.errorMessage ?? ""}\n${probe.stdout}\n${probe.stderr}`.trim();
+        targetIsRemote,
+        cwd,
+        command,
+        args,
+        env,
+        probeApiKey,
+      });
+      try {
+        const probe = await runAdapterExecutionTargetProcess(
+          runId,
+          target,
+          preparedProbe.command,
+          preparedProbe.args,
+          {
+            cwd,
+            env: preparedProbe.env,
+            timeoutSec: 45,
+            graceSec: 5,
+            stdin: "Respond with hello.",
+            onLog: async () => {},
+          },
+        );
+        const parsed = parseCodexJsonl(probe.stdout);
+        const detail = summarizeProbeDetail(probe.stdout, probe.stderr, parsed.errorMessage);
+        const authEvidence = `${parsed.errorMessage ?? ""}\n${probe.stdout}\n${probe.stderr}`.trim();
-      if (probe.timedOut) {
+        if (probe.timedOut) {
-        checks.push({
+          checks.push({
-          code: "codex_hello_probe_timed_out",
+            code: "codex_hello_probe_timed_out",
-          level: "warn",
+            level: "warn",
-          message: "Codex hello probe timed out.",
+            message: "Codex hello probe timed out.",
-          hint: "Retry the probe. If this persists, verify Codex can run `Respond with hello` from this directory manually.",
+            hint: "Retry the probe. If this persists, verify Codex can run `Respond with hello` from this directory manually.",
-        });
+          });
-      } else if ((probe.exitCode ?? 1) === 0) {
+        } else if ((probe.exitCode ?? 1) === 0) {
-        const summary = parsed.summary.trim();
+          const summary = parsed.summary.trim();
-        const hasHello = /\bhello\b/i.test(summary);
+          const hasHello = /\bhello\b/i.test(summary);
-        checks.push({
+          checks.push({
-          code: hasHello ? "codex_hello_probe_passed" : "codex_hello_probe_unexpected_output",
+            code: hasHello ? "codex_hello_probe_passed" : "codex_hello_probe_unexpected_output",
-          level: hasHello ? "info" : "warn",
+            level: hasHello ? "info" : "warn",
-          message: hasHello
+            message: hasHello
-            ? "Codex hello probe succeeded."
+              ? "Codex hello probe succeeded."
-            : "Codex probe ran but did not return `hello` as expected.",
+              : "Codex probe ran but did not return `hello` as expected.",
-          ...(summary ? { detail: summary.replace(/\s+/g, " ").trim().slice(0, 240) } : {}),
+            ...(summary ? { detail: summary.replace(/\s+/g, " ").trim().slice(0, 240) } : {}),
-          ...(hasHello
+            ...(hasHello
-            ? {}
+              ? {}
-            : {
+              : {
-                hint: "Try the probe manually (`codex exec --json -` then prompt: Respond with hello) to inspect full output.",
+                  hint: "Try the probe manually (`codex exec --json -` then prompt: Respond with hello) to inspect full output.",
-              }),
+                }),
-        });
+          });
-      } else if (CODEX_AUTH_REQUIRED_RE.test(authEvidence)) {
+        } else if (CODEX_AUTH_REQUIRED_RE.test(authEvidence)) {
-        checks.push({
+          checks.push({
-          code: "codex_hello_probe_auth_required",
+            code: "codex_hello_probe_auth_required",
-          level: "warn",
+            level: "warn",
-          message: "Codex CLI is installed, but authentication is not ready.",
+            message: "Codex CLI is installed, but authentication is not ready.",
-          ...(detail ? { detail } : {}),
+            ...(detail ? { detail } : {}),
-          hint: probeApiKey
+            hint: probeApiKey
-            ? "OPENAI_API_KEY was provided but Codex still rejected the request. Verify the key is valid for the OpenAI Responses API (e.g. `curl -H \"Authorization: Bearer $OPENAI_API_KEY\" https://api.openai.com/v1/models`), or run `codex login` and seed `~/.codex/auth.json`."
+              ? "OPENAI_API_KEY was provided but Codex still rejected the request. Verify the key is valid for the OpenAI Responses API (e.g. `curl -H \"Authorization: Bearer $OPENAI_API_KEY\" https://api.openai.com/v1/models`), or run `codex login` and seed `~/.codex/auth.json`."
-            : "Codex CLI does not read OPENAI_API_KEY from the environment; set OPENAI_API_KEY in this adapter's config (so Paperclip writes it to `$CODEX_HOME/auth.json`) or run `codex login` on the host first.",
+              : "Codex CLI does not read OPENAI_API_KEY from the environment; set OPENAI_API_KEY in this adapter's config (so Paperclip writes it to `$CODEX_HOME/auth.json`) or run `codex login` on the host first.",
-        });
+          });
-      } else {
+        } else {
-        checks.push({
+          checks.push({
-          code: "codex_hello_probe_failed",
+            code: "codex_hello_probe_failed",
-          level: "error",
+            level: "error",
-          message: "Codex hello probe failed.",
+            message: "Codex hello probe failed.",
-          ...(detail ? { detail } : {}),
+            ...(detail ? { detail } : {}),
-          hint: "Run `codex exec --json -` manually in this working directory and prompt `Respond with hello` to debug.",
+            hint: "Run `codex exec --json -` manually in this working directory and prompt `Respond with hello` to debug.",
-        });
+          });
        }
      } finally {
        await preparedProbe.cleanup();
      }
    }
  }
--- a/packages/adapters/gemini-local/src/server/execute.ts
+++ b/packages/adapters/gemini-local/src/server/execute.ts
@@ -373,8 +373,8 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
      throw error;
    }
  }
+  const runtimeExecutionTarget = overrideAdapterExecutionTargetRemoteCwd(executionTarget, effectiveExecutionCwd);
  if (executionTargetIsRemote && adapterExecutionTargetUsesPaperclipBridge(executionTarget)) {
-    const runtimeExecutionTarget = overrideAdapterExecutionTargetRemoteCwd(executionTarget, effectiveExecutionCwd);
    paperclipBridge = await startAdapterExecutionTargetPaperclipBridge({
      runId,
      target: runtimeExecutionTarget,
@@ -392,7 +392,6 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
      });
    }
  }
-  const runtimeExecutionTarget = overrideAdapterExecutionTargetRemoteCwd(executionTarget, effectiveExecutionCwd);
  const runtimeSessionParams = parseObject(runtime.sessionParams);
  const runtimeSessionId = asString(runtimeSessionParams.sessionId, runtime.sessionId ?? "");
--- a/server/src/tests/fixtures/plugin-worker-terminated.cjs
+++ b/server/src/tests/fixtures/plugin-worker-terminated.cjs
@@ -0,0 +1,59 @@
+const readline = require("node:readline");
+function send(message) {
+  process.stdout.write(`${JSON.stringify(message)}\n`);
+}
+const rl = readline.createInterface({
+  input: process.stdin,
+  crlfDelay: Infinity,
+});
+rl.on("line", (line) => {
+  if (!line.trim()) return;
+  const message = JSON.parse(line);
+  const method = message && typeof message.method === "string" ? message.method : null;
+  if (method === "initialize") {
+    send({
+      jsonrpc: "2.0",
+      id: message.id,
+      result: {
+        ok: true,
+        supportedMethods: ["environmentExecute"],
+      },
+    });
+    return;
+  }
+  if (method === "environmentExecute") {
+    send({
+      jsonrpc: "2.0",
+      id: message.id,
+      error: {
+        code: -32002,
+        message: "[unknown] terminated",
+      },
+    });
+    return;
+  }
+  if (method === "shutdown") {
+    send({
+      jsonrpc: "2.0",
+      id: message.id,
+      result: {},
+    });
+    setImmediate(() => process.exit(0));
+    return;
+  }
+  send({
+    jsonrpc: "2.0",
+    id: message.id,
+    error: {
+      code: -32601,
+      message: `Unhandled method: ${method}`,
+    },
+  });
+});
--- a/server/src/tests/plugin-worker-manager.test.ts
+++ b/server/src/tests/plugin-worker-manager.test.ts
@@ -1,5 +1,31 @@
-import { describe, expect, it } from "vitest";
+import path from "node:path";
-import { appendStderrExcerpt, formatWorkerFailureMessage } from "../services/plugin-worker-manager.js";
+import { fileURLToPath } from "node:url";
+import { describe, expect, it, vi } from "vitest";
+import type { PaperclipPluginManifestV1 } from "@paperclipai/shared";
+import {
+  JsonRpcCallError,
+  type HostToWorkerMethods,
+} from "@paperclipai/plugin-sdk";
+import {
+  appendStderrExcerpt,
+  createPluginWorkerHandle,
+  formatWorkerFailureMessage,
+} from "../services/plugin-worker-manager.js";
+const FIXTURES_DIR = path.join(path.dirname(fileURLToPath(import.meta.url)), "fixtures");
+const TERMINATED_WORKER_ENTRYPOINT = path.join(FIXTURES_DIR, "plugin-worker-terminated.cjs");
+const TEST_MANIFEST: PaperclipPluginManifestV1 = {
+  id: "test.plugin",
+  apiVersion: 1,
+  version: "1.0.0",
+  displayName: "Test plugin",
+  description: "Test plugin",
+  author: "Paperclip",
+  categories: ["automation"],
+  capabilities: [],
+  entrypoints: { worker: "dist/worker.js" },
+};
 describe("plugin-worker-manager stderr failure context", () => {
  it("appends worker stderr context to failure messages", () => {
@@ -40,4 +66,48 @@ describe("plugin-worker-manager stderr failure context", () => {
    expect(excerpt).not.toContain("second line");
    expect(excerpt.length).toBeLessThanOrEqual(8_000);
  });
+  it("does not emit an unhandled rejection when a plugin responds with terminated before callers attach handlers", async () => {
+    const unhandledRejection = vi.fn();
+    process.on("unhandledRejection", unhandledRejection);
+    const handle = createPluginWorkerHandle("test.plugin", {
+      entrypointPath: TERMINATED_WORKER_ENTRYPOINT,
+      manifest: TEST_MANIFEST,
+      config: {},
+      instanceInfo: {
+        instanceId: "instance-1",
+        hostVersion: "1.0.0",
+      },
+      apiVersion: 1,
+      hostHandlers: {},
+    });
+    try {
+      await handle.start();
+      const pendingCall = handle.call(
+        "environmentExecute" as keyof HostToWorkerMethods,
+        {
+          driverKey: "e2b",
+          companyId: "company-1",
+          environmentId: "environment-1",
+          config: {},
+          lease: { providerLeaseId: "lease-1" },
+          command: "echo",
+        } as HostToWorkerMethods[keyof HostToWorkerMethods][0],
+      );
+      await new Promise((resolve) => setImmediate(resolve));
+      await expect(pendingCall).rejects.toBeInstanceOf(JsonRpcCallError);
+      await expect(pendingCall).rejects.toMatchObject({
+        message: expect.stringContaining("terminated"),
+      });
+      expect(unhandledRejection).not.toHaveBeenCalled();
+    } finally {
+      process.off("unhandledRejection", unhandledRejection);
+      await handle.stop().catch(() => undefined);
+    }
+  });
 });
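
Reviewer note: the regression test above relies on the `void rpcPromise.catch(() => undefined)` change in `plugin-worker-manager.ts`. A minimal standalone sketch of that pattern follows, assuming nothing about the real worker handle; `markHandled` and `demo` are hypothetical names used only for illustration. Attaching a no-op handler marks the original promise as handled, while the original promise is returned unchanged, so a caller that subscribes late still observes the rejection.

```typescript
// Hypothetical helper (not part of this PR): mirrors the
// `void rpcPromise.catch(() => undefined)` pattern in isolation.
function markHandled<T>(promise: Promise<T>): Promise<T> {
  // The no-op handler only affects the derived promise returned by catch();
  // the original promise is returned as-is and still rejects for callers.
  void promise.catch(() => undefined);
  return promise;
}

// Hypothetical demo: the caller attaches its handler only after an async
// boundary, which is the situation the regression test simulates.
async function demo(): Promise<string> {
  const failing: Promise<string> = Promise.reject(new Error("[unknown] terminated"));
  const pending = markHandled(failing);

  // Yield to the event loop before anyone awaits `pending`.
  await new Promise<void>((resolve) => setTimeout(resolve, 0));

  try {
    return await pending;
  } catch (error) {
    // The rejection still reaches the late caller.
    return error instanceof Error ? error.message : String(error);
  }
}
```

The trade-off is that the rejection no longer triggers the runtime's unhandled-rejection reporting, so this only makes sense when the promise is always returned to a caller that is expected to handle the failure.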
--- a/server/src/services/plugin-worker-manager.ts
+++ b/server/src/services/plugin-worker-manager.ts
@@ -1006,7 +1006,7 @@ export function createPluginWorkerHandle(
    params: HostToWorkerMethods[M][0],
    timeoutMs?: number,
  ): Promise<HostToWorkerMethods[M][1]> {
-    return new Promise<HostToWorkerMethods[M][1]>((resolve, reject) => {
+    const rpcPromise = new Promise<HostToWorkerMethods[M][1]>((resolve, reject) => {
      if (!childProcess?.stdin?.writable) {
        reject(
          new Error(
@@ -1076,6 +1076,14 @@ export function createPluginWorkerHandle(
        );
      }
    });
+    // Some call sites hand these promises across async boundaries before
+    // attaching their own handlers. Mark the promise as handled here so a
+    // worker-side JSON-RPC error can fail the caller without killing the host
+    // process via an unhandled rejection.
+    void rpcPromise.catch(() => undefined);
+    return rpcPromise;
  }
  // -----------------------------------------------------------------------