paperclip/packages/adapters/cursor-local/src/server/execute.remote.test.ts

318 lines
10 KiB
TypeScript
Raw Normal View History

Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
import { mkdir, mkdtemp, rm } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it, vi } from "vitest";
const {
runChildProcess,
ensureCommandResolvable,
resolveCommandForLogs,
prepareWorkspaceForSshExecution,
restoreWorkspaceFromSshExecution,
runSshCommand,
syncDirectoryToSsh,
Migrate SSH environment callback to bridge (#5116) > **Stacked PR (part 3 of 7).** Depends on: - PR #5114 - PR #5115 > Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents executing on a remote SSH-backed environment need a way to call back into > the Paperclip control plane (run events, log streaming, signals) > - When the SSH host can't reach the Paperclip host (NAT, firewalls, or simply not > on the same network), the run silently fails or hangs — a recurring class of > failure during SSH testing > - In sandboxed environments we already solved this with a callback bridge that > tunnels back through the existing connection; SSH was the odd one out > - This PR migrates SSH execution to use the same callback bridge, so every > adapter's remote run uses one consistent reverse-channel. Per-adapter SSH glue > is deleted in favour of a shared `CommandManagedRuntimeRunner` built from the > SSH spec > - The benefit is fewer SSH-specific failure modes, a smaller code surface, and > one place to evolve the callback contract going forward ## What Changed - Added `createSshCommandManagedRuntimeRunner` in `packages/adapter-utils/src/ssh.ts` that adapts an SSH spec into a generic command-managed-runtime runner (with cwd, env, and timeout handling) - Removed `paperclipApiUrl` from `SshRemoteExecutionSpec`; the bridge URL now flows through the shared runner - Reworked `execution-target.ts` to use the SSH runner alongside sandbox runners via a unified `CommandManagedRuntimeRunner` interface - Simplified `remote-managed-runtime.ts` and `sandbox-managed-runtime.ts` to consume the shared runner abstraction - Deleted per-adapter SSH callback wiring from claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local execute.ts files - Removed `environment-runtime-driver-contract.test.ts` (the contract is now enforced by `environment-execution-target.test.ts`) - Added/updated `execute.remote.test.ts` cases for each adapter to cover the SSH runner path ## Verification - `pnpm --filter @paperclipai/adapter-utils test` - `pnpm test -- execute.remote` (covers all six local adapters' SSH paths) - Manual QA: ran a claude-local agent against an SSH-backed environment, confirmed the agent successfully called back to `/api/agent-callback/*` endpoints during the run ## Risks - Refactor touches all six local adapters. If any adapter had subtle SSH-specific behaviour that wasn't captured in tests, it could regress. Mitigation: each adapter's `execute.remote.test.ts` was extended. - `paperclipApiUrl` removal from `SshRemoteExecutionSpec` is a breaking type change for any internal consumer. Verified no external plugins consume this type. - The new `CommandManagedRuntimeRunner` shape is a public surface in `@paperclipai/adapter-utils`; downstream plugins implementing custom runners may need updates, but no such plugins exist in this repo. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 12:43:52 -07:00
startAdapterExecutionTargetPaperclipBridge,
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
} = vi.hoisted(() => ({
runChildProcess: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: [
JSON.stringify({ type: "system", session_id: "cursor-session-1" }),
JSON.stringify({ type: "assistant", text: "hello" }),
JSON.stringify({ type: "result", is_error: false, result: "hello", session_id: "cursor-session-1" }),
].join("\n"),
stderr: "",
pid: 123,
startedAt: new Date().toISOString(),
})),
ensureCommandResolvable: vi.fn(async () => undefined),
resolveCommandForLogs: vi.fn(async () => "ssh://fixture@127.0.0.1:2222/remote/workspace :: agent"),
prepareWorkspaceForSshExecution: vi.fn(async () => undefined),
restoreWorkspaceFromSshExecution: vi.fn(async () => undefined),
runSshCommand: vi.fn(async () => ({
stdout: "/home/agent",
stderr: "",
exitCode: 0,
})),
syncDirectoryToSsh: vi.fn(async () => undefined),
Migrate SSH environment callback to bridge (#5116) > **Stacked PR (part 3 of 7).** Depends on: - PR #5114 - PR #5115 > Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents executing on a remote SSH-backed environment need a way to call back into > the Paperclip control plane (run events, log streaming, signals) > - When the SSH host can't reach the Paperclip host (NAT, firewalls, or simply not > on the same network), the run silently fails or hangs — a recurring class of > failure during SSH testing > - In sandboxed environments we already solved this with a callback bridge that > tunnels back through the existing connection; SSH was the odd one out > - This PR migrates SSH execution to use the same callback bridge, so every > adapter's remote run uses one consistent reverse-channel. Per-adapter SSH glue > is deleted in favour of a shared `CommandManagedRuntimeRunner` built from the > SSH spec > - The benefit is fewer SSH-specific failure modes, a smaller code surface, and > one place to evolve the callback contract going forward ## What Changed - Added `createSshCommandManagedRuntimeRunner` in `packages/adapter-utils/src/ssh.ts` that adapts an SSH spec into a generic command-managed-runtime runner (with cwd, env, and timeout handling) - Removed `paperclipApiUrl` from `SshRemoteExecutionSpec`; the bridge URL now flows through the shared runner - Reworked `execution-target.ts` to use the SSH runner alongside sandbox runners via a unified `CommandManagedRuntimeRunner` interface - Simplified `remote-managed-runtime.ts` and `sandbox-managed-runtime.ts` to consume the shared runner abstraction - Deleted per-adapter SSH callback wiring from claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local execute.ts files - Removed `environment-runtime-driver-contract.test.ts` (the contract is now enforced by `environment-execution-target.test.ts`) - Added/updated `execute.remote.test.ts` cases for each adapter to cover the SSH runner path ## Verification - `pnpm --filter @paperclipai/adapter-utils test` - `pnpm test -- execute.remote` (covers all six local adapters' SSH paths) - Manual QA: ran a claude-local agent against an SSH-backed environment, confirmed the agent successfully called back to `/api/agent-callback/*` endpoints during the run ## Risks - Refactor touches all six local adapters. If any adapter had subtle SSH-specific behaviour that wasn't captured in tests, it could regress. Mitigation: each adapter's `execute.remote.test.ts` was extended. - `paperclipApiUrl` removal from `SshRemoteExecutionSpec` is a breaking type change for any internal consumer. Verified no external plugins consume this type. - The new `CommandManagedRuntimeRunner` shape is a public surface in `@paperclipai/adapter-utils`; downstream plugins implementing custom runners may need updates, but no such plugins exist in this repo. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 12:43:52 -07:00
startAdapterExecutionTargetPaperclipBridge: vi.fn(async () => ({
env: {
PAPERCLIP_API_URL: "http://127.0.0.1:4310",
PAPERCLIP_API_KEY: "bridge-token",
PAPERCLIP_API_BRIDGE_MODE: "queue_v1",
},
stop: async () => {},
})),
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
}));
vi.mock("@paperclipai/adapter-utils/server-utils", async () => {
const actual = await vi.importActual<typeof import("@paperclipai/adapter-utils/server-utils")>(
"@paperclipai/adapter-utils/server-utils",
);
return {
...actual,
ensureCommandResolvable,
resolveCommandForLogs,
runChildProcess,
};
});
vi.mock("@paperclipai/adapter-utils/ssh", async () => {
const actual = await vi.importActual<typeof import("@paperclipai/adapter-utils/ssh")>(
"@paperclipai/adapter-utils/ssh",
);
return {
...actual,
prepareWorkspaceForSshExecution,
restoreWorkspaceFromSshExecution,
runSshCommand,
syncDirectoryToSsh,
};
});
Migrate SSH environment callback to bridge (#5116) > **Stacked PR (part 3 of 7).** Depends on: - PR #5114 - PR #5115 > Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents executing on a remote SSH-backed environment need a way to call back into > the Paperclip control plane (run events, log streaming, signals) > - When the SSH host can't reach the Paperclip host (NAT, firewalls, or simply not > on the same network), the run silently fails or hangs — a recurring class of > failure during SSH testing > - In sandboxed environments we already solved this with a callback bridge that > tunnels back through the existing connection; SSH was the odd one out > - This PR migrates SSH execution to use the same callback bridge, so every > adapter's remote run uses one consistent reverse-channel. Per-adapter SSH glue > is deleted in favour of a shared `CommandManagedRuntimeRunner` built from the > SSH spec > - The benefit is fewer SSH-specific failure modes, a smaller code surface, and > one place to evolve the callback contract going forward ## What Changed - Added `createSshCommandManagedRuntimeRunner` in `packages/adapter-utils/src/ssh.ts` that adapts an SSH spec into a generic command-managed-runtime runner (with cwd, env, and timeout handling) - Removed `paperclipApiUrl` from `SshRemoteExecutionSpec`; the bridge URL now flows through the shared runner - Reworked `execution-target.ts` to use the SSH runner alongside sandbox runners via a unified `CommandManagedRuntimeRunner` interface - Simplified `remote-managed-runtime.ts` and `sandbox-managed-runtime.ts` to consume the shared runner abstraction - Deleted per-adapter SSH callback wiring from claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local execute.ts files - Removed `environment-runtime-driver-contract.test.ts` (the contract is now enforced by `environment-execution-target.test.ts`) - Added/updated `execute.remote.test.ts` cases for each adapter to cover the SSH runner path ## Verification - `pnpm --filter @paperclipai/adapter-utils test` - `pnpm test -- execute.remote` (covers all six local adapters' SSH paths) - Manual QA: ran a claude-local agent against an SSH-backed environment, confirmed the agent successfully called back to `/api/agent-callback/*` endpoints during the run ## Risks - Refactor touches all six local adapters. If any adapter had subtle SSH-specific behaviour that wasn't captured in tests, it could regress. Mitigation: each adapter's `execute.remote.test.ts` was extended. - `paperclipApiUrl` removal from `SshRemoteExecutionSpec` is a breaking type change for any internal consumer. Verified no external plugins consume this type. - The new `CommandManagedRuntimeRunner` shape is a public surface in `@paperclipai/adapter-utils`; downstream plugins implementing custom runners may need updates, but no such plugins exist in this repo. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 12:43:52 -07:00
vi.mock("@paperclipai/adapter-utils/execution-target", async () => {
const actual = await vi.importActual<typeof import("@paperclipai/adapter-utils/execution-target")>(
"@paperclipai/adapter-utils/execution-target",
);
return {
...actual,
startAdapterExecutionTargetPaperclipBridge,
};
});
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
import { execute } from "./execute.js";
describe("cursor remote execution", () => {
const cleanupDirs: string[] = [];
afterEach(async () => {
vi.clearAllMocks();
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
it("prepares the workspace, syncs Cursor skills, and restores workspace changes for remote SSH execution", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-cursor-remote-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
Fix remote workspace environment shaping (#5118) > **Stacked PR (part 5 of 7).** Depends on: - PR #5114 - PR #5115 - PR #5116 - PR #5117 > Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents run with a Paperclip-shaped environment (`PAPERCLIP_WORKSPACE_CWD`, > worktree path, `PAPERCLIP_WORKSPACES_JSON` hints) so the CLI can locate the > correct project tree > - SSH testing reproduced a real failure: a Codex SSH run wrote to > `/tmp/paperclip-env-matrix-...` (the *host* path) instead of the realized > remote workspace at `/home/<user>/paperclip-env-matrix-ssh-claude/...` > because the adapter injected `PAPERCLIP_WORKSPACE_CWD=/tmp/...` into the > remote env > - Code review on the initial codex-only fix asked to roll the same approach > into every other SSH-capable adapter (claude, acpx, cursor, opencode, gemini, > pi) via a shared helper rather than duplicating per-adapter > - This PR adds `shapePaperclipWorkspaceEnvForExecution` in adapter-utils that, > when the execution target is remote: replaces local cwd with the realized > execution cwd, nulls out worktree path (which has no remote meaning), and > rewrites/strips `cwd` entries in workspace hints based on what was actually > synced. Every adapter calls it before invoking the remote runner > - The benefit is that remote runs see the realized remote workspace, host-local > paths stop leaking into remote env, and the rule is unit-tested in one place ## What Changed - Added `shapePaperclipWorkspaceEnvForExecution` to `packages/adapter-utils/src/server-utils.ts` with full unit coverage (`server-utils.test.ts`) - Each of acpx-local, claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local now calls the new shaper before issuing the remote command and feeds the shaped values into `applyPaperclipWorkspaceEnv` - Per-adapter `execute.remote.test.ts` files extended to cover the new shaping behaviour: localhost paths replaced with remote cwd, foreign-cwd hints stripped, worktree path nulled out for remote targets - `acpx-local/src/server/execute.test.ts` extended with shaping coverage ## Verification - `pnpm test -- server-utils execute.remote` - `pnpm --filter @paperclipai/adapter-acpx-local test` - Manual QA reproducing the original failure: 1. Provision an E2B sandbox environment for the Paperclip QA company 2. Assign an issue to a remote-targeted claude-local agent and confirm the run starts in the correct remote cwd (no `/Users/...` path leakage in the run logs) 3. Repeat for opencode-local and pi-local ## Risks - Behavioural shift: hints whose `cwd` doesn't match the workspace cwd are now stripped on remote targets. If any adapter relied on a leaked local hint cwd, it will see a missing `cwd` instead. Reviewed all current callers — none do. - Adds a small per-run cost (path resolve + string normalisation) on every remote execution. Negligible. - Worktree path is now nulled out on remote (it has no meaning there). Adapters that previously read the value defensively will continue to work. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 13:17:52 -07:00
const alternateWorkspaceDir = path.join(rootDir, "workspace-other");
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
await mkdir(workspaceDir, { recursive: true });
Fix remote workspace environment shaping (#5118) > **Stacked PR (part 5 of 7).** Depends on: - PR #5114 - PR #5115 - PR #5116 - PR #5117 > Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents run with a Paperclip-shaped environment (`PAPERCLIP_WORKSPACE_CWD`, > worktree path, `PAPERCLIP_WORKSPACES_JSON` hints) so the CLI can locate the > correct project tree > - SSH testing reproduced a real failure: a Codex SSH run wrote to > `/tmp/paperclip-env-matrix-...` (the *host* path) instead of the realized > remote workspace at `/home/<user>/paperclip-env-matrix-ssh-claude/...` > because the adapter injected `PAPERCLIP_WORKSPACE_CWD=/tmp/...` into the > remote env > - Code review on the initial codex-only fix asked to roll the same approach > into every other SSH-capable adapter (claude, acpx, cursor, opencode, gemini, > pi) via a shared helper rather than duplicating per-adapter > - This PR adds `shapePaperclipWorkspaceEnvForExecution` in adapter-utils that, > when the execution target is remote: replaces local cwd with the realized > execution cwd, nulls out worktree path (which has no remote meaning), and > rewrites/strips `cwd` entries in workspace hints based on what was actually > synced. Every adapter calls it before invoking the remote runner > - The benefit is that remote runs see the realized remote workspace, host-local > paths stop leaking into remote env, and the rule is unit-tested in one place ## What Changed - Added `shapePaperclipWorkspaceEnvForExecution` to `packages/adapter-utils/src/server-utils.ts` with full unit coverage (`server-utils.test.ts`) - Each of acpx-local, claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local now calls the new shaper before issuing the remote command and feeds the shaped values into `applyPaperclipWorkspaceEnv` - Per-adapter `execute.remote.test.ts` files extended to cover the new shaping behaviour: localhost paths replaced with remote cwd, foreign-cwd hints stripped, worktree path nulled out for remote targets - `acpx-local/src/server/execute.test.ts` extended with shaping coverage ## Verification - `pnpm test -- server-utils execute.remote` - `pnpm --filter @paperclipai/adapter-acpx-local test` - Manual QA reproducing the original failure: 1. Provision an E2B sandbox environment for the Paperclip QA company 2. Assign an issue to a remote-targeted claude-local agent and confirm the run starts in the correct remote cwd (no `/Users/...` path leakage in the run logs) 3. Repeat for opencode-local and pi-local ## Risks - Behavioural shift: hints whose `cwd` doesn't match the workspace cwd are now stripped on remote targets. If any adapter relied on a leaked local hint cwd, it will see a missing `cwd` instead. Reviewed all current callers — none do. - Adds a small per-run cost (path resolve + string normalisation) on every remote execution. Negligible. - Worktree path is now nulled out on remote (it has no meaning there). Adapters that previously read the value defensively will continue to work. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 13:17:52 -07:00
await mkdir(alternateWorkspaceDir, { recursive: true });
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
const result = await execute({
runId: "run-1",
agent: {
id: "agent-1",
companyId: "company-1",
name: "Cursor Builder",
adapterType: "cursor",
adapterConfig: {},
},
runtime: {
sessionId: null,
sessionParams: null,
sessionDisplayId: null,
taskKey: null,
},
config: {
command: "agent",
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
Fix remote workspace environment shaping (#5118) > **Stacked PR (part 5 of 7).** Depends on: - PR #5114 - PR #5115 - PR #5116 - PR #5117 > Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents run with a Paperclip-shaped environment (`PAPERCLIP_WORKSPACE_CWD`, > worktree path, `PAPERCLIP_WORKSPACES_JSON` hints) so the CLI can locate the > correct project tree > - SSH testing reproduced a real failure: a Codex SSH run wrote to > `/tmp/paperclip-env-matrix-...` (the *host* path) instead of the realized > remote workspace at `/home/<user>/paperclip-env-matrix-ssh-claude/...` > because the adapter injected `PAPERCLIP_WORKSPACE_CWD=/tmp/...` into the > remote env > - Code review on the initial codex-only fix asked to roll the same approach > into every other SSH-capable adapter (claude, acpx, cursor, opencode, gemini, > pi) via a shared helper rather than duplicating per-adapter > - This PR adds `shapePaperclipWorkspaceEnvForExecution` in adapter-utils that, > when the execution target is remote: replaces local cwd with the realized > execution cwd, nulls out worktree path (which has no remote meaning), and > rewrites/strips `cwd` entries in workspace hints based on what was actually > synced. Every adapter calls it before invoking the remote runner > - The benefit is that remote runs see the realized remote workspace, host-local > paths stop leaking into remote env, and the rule is unit-tested in one place ## What Changed - Added `shapePaperclipWorkspaceEnvForExecution` to `packages/adapter-utils/src/server-utils.ts` with full unit coverage (`server-utils.test.ts`) - Each of acpx-local, claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local now calls the new shaper before issuing the remote command and feeds the shaped values into `applyPaperclipWorkspaceEnv` - Per-adapter `execute.remote.test.ts` files extended to cover the new shaping behaviour: localhost paths replaced with remote cwd, foreign-cwd hints stripped, worktree path nulled out for remote targets - `acpx-local/src/server/execute.test.ts` extended with shaping coverage ## Verification - `pnpm test -- server-utils execute.remote` - `pnpm --filter @paperclipai/adapter-acpx-local test` - Manual QA reproducing the original failure: 1. Provision an E2B sandbox environment for the Paperclip QA company 2. Assign an issue to a remote-targeted claude-local agent and confirm the run starts in the correct remote cwd (no `/Users/...` path leakage in the run logs) 3. Repeat for opencode-local and pi-local ## Risks - Behavioural shift: hints whose `cwd` doesn't match the workspace cwd are now stripped on remote targets. If any adapter relied on a leaked local hint cwd, it will see a missing `cwd` instead. Reviewed all current callers — none do. - Adds a small per-run cost (path resolve + string normalisation) on every remote execution. Negligible. - Worktree path is now nulled out on remote (it has no meaning there). Adapters that previously read the value defensively will continue to work. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 13:17:52 -07:00
paperclipWorkspaces: [
{
workspaceId: "workspace-1",
cwd: workspaceDir,
repoUrl: "https://github.com/paperclipai/paperclip.git",
repoRef: "main",
},
{
workspaceId: "workspace-2",
cwd: alternateWorkspaceDir,
repoUrl: "https://github.com/paperclipai/paperclip.git",
repoRef: "feature/other",
},
],
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
},
executionTransport: {
remoteExecution: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
},
},
onLog: async () => {},
});
expect(result.sessionParams).toMatchObject({
sessionId: "cursor-session-1",
cwd: "/remote/workspace",
remoteExecution: {
transport: "ssh",
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteCwd: "/remote/workspace",
},
});
expect(prepareWorkspaceForSshExecution).toHaveBeenCalledTimes(1);
expect(syncDirectoryToSsh).toHaveBeenCalledTimes(1);
expect(syncDirectoryToSsh).toHaveBeenCalledWith(expect.objectContaining({
remoteDir: "/remote/workspace/.paperclip-runtime/cursor/skills",
followSymlinks: true,
}));
expect(runSshCommand).toHaveBeenCalledWith(
expect.anything(),
expect.stringContaining(".cursor/skills"),
expect.anything(),
);
const call = runChildProcess.mock.calls[0] as unknown as
| [string, string, string[], { env: Record<string, string>; remoteExecution?: { remoteCwd: string } | null }]
| undefined;
expect(call?.[2]).toContain("--workspace");
expect(call?.[2]).toContain("/remote/workspace");
Fix remote workspace environment shaping (#5118) > **Stacked PR (part 5 of 7).** Depends on: - PR #5114 - PR #5115 - PR #5116 - PR #5117 > Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents run with a Paperclip-shaped environment (`PAPERCLIP_WORKSPACE_CWD`, > worktree path, `PAPERCLIP_WORKSPACES_JSON` hints) so the CLI can locate the > correct project tree > - SSH testing reproduced a real failure: a Codex SSH run wrote to > `/tmp/paperclip-env-matrix-...` (the *host* path) instead of the realized > remote workspace at `/home/<user>/paperclip-env-matrix-ssh-claude/...` > because the adapter injected `PAPERCLIP_WORKSPACE_CWD=/tmp/...` into the > remote env > - Code review on the initial codex-only fix asked to roll the same approach > into every other SSH-capable adapter (claude, acpx, cursor, opencode, gemini, > pi) via a shared helper rather than duplicating per-adapter > - This PR adds `shapePaperclipWorkspaceEnvForExecution` in adapter-utils that, > when the execution target is remote: replaces local cwd with the realized > execution cwd, nulls out worktree path (which has no remote meaning), and > rewrites/strips `cwd` entries in workspace hints based on what was actually > synced. Every adapter calls it before invoking the remote runner > - The benefit is that remote runs see the realized remote workspace, host-local > paths stop leaking into remote env, and the rule is unit-tested in one place ## What Changed - Added `shapePaperclipWorkspaceEnvForExecution` to `packages/adapter-utils/src/server-utils.ts` with full unit coverage (`server-utils.test.ts`) - Each of acpx-local, claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local now calls the new shaper before issuing the remote command and feeds the shaped values into `applyPaperclipWorkspaceEnv` - Per-adapter `execute.remote.test.ts` files extended to cover the new shaping behaviour: localhost paths replaced with remote cwd, foreign-cwd hints stripped, worktree path nulled out for remote targets - `acpx-local/src/server/execute.test.ts` extended with shaping coverage ## Verification - `pnpm test -- server-utils execute.remote` - `pnpm --filter @paperclipai/adapter-acpx-local test` - Manual QA reproducing the original failure: 1. Provision an E2B sandbox environment for the Paperclip QA company 2. Assign an issue to a remote-targeted claude-local agent and confirm the run starts in the correct remote cwd (no `/Users/...` path leakage in the run logs) 3. Repeat for opencode-local and pi-local ## Risks - Behavioural shift: hints whose `cwd` doesn't match the workspace cwd are now stripped on remote targets. If any adapter relied on a leaked local hint cwd, it will see a missing `cwd` instead. Reviewed all current callers — none do. - Adds a small per-run cost (path resolve + string normalisation) on every remote execution. Negligible. - Worktree path is now nulled out on remote (it has no meaning there). Adapters that previously read the value defensively will continue to work. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 13:17:52 -07:00
expect(call?.[3].env.PAPERCLIP_WORKSPACE_CWD).toBe("/remote/workspace");
expect(JSON.parse(call?.[3].env.PAPERCLIP_WORKSPACES_JSON ?? "[]")).toEqual([
{
workspaceId: "workspace-1",
cwd: "/remote/workspace",
repoUrl: "https://github.com/paperclipai/paperclip.git",
repoRef: "main",
},
{
workspaceId: "workspace-2",
repoUrl: "https://github.com/paperclipai/paperclip.git",
repoRef: "feature/other",
},
]);
Migrate SSH environment callback to bridge (#5116) > **Stacked PR (part 3 of 7).** Depends on: - PR #5114 - PR #5115 > Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents executing on a remote SSH-backed environment need a way to call back into > the Paperclip control plane (run events, log streaming, signals) > - When the SSH host can't reach the Paperclip host (NAT, firewalls, or simply not > on the same network), the run silently fails or hangs — a recurring class of > failure during SSH testing > - In sandboxed environments we already solved this with a callback bridge that > tunnels back through the existing connection; SSH was the odd one out > - This PR migrates SSH execution to use the same callback bridge, so every > adapter's remote run uses one consistent reverse-channel. Per-adapter SSH glue > is deleted in favour of a shared `CommandManagedRuntimeRunner` built from the > SSH spec > - The benefit is fewer SSH-specific failure modes, a smaller code surface, and > one place to evolve the callback contract going forward ## What Changed - Added `createSshCommandManagedRuntimeRunner` in `packages/adapter-utils/src/ssh.ts` that adapts an SSH spec into a generic command-managed-runtime runner (with cwd, env, and timeout handling) - Removed `paperclipApiUrl` from `SshRemoteExecutionSpec`; the bridge URL now flows through the shared runner - Reworked `execution-target.ts` to use the SSH runner alongside sandbox runners via a unified `CommandManagedRuntimeRunner` interface - Simplified `remote-managed-runtime.ts` and `sandbox-managed-runtime.ts` to consume the shared runner abstraction - Deleted per-adapter SSH callback wiring from claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local execute.ts files - Removed `environment-runtime-driver-contract.test.ts` (the contract is now enforced by `environment-execution-target.test.ts`) - Added/updated `execute.remote.test.ts` cases for each adapter to cover the SSH runner path ## Verification - `pnpm --filter @paperclipai/adapter-utils test` - `pnpm test -- execute.remote` (covers all six local adapters' SSH paths) - Manual QA: ran a claude-local agent against an SSH-backed environment, confirmed the agent successfully called back to `/api/agent-callback/*` endpoints during the run ## Risks - Refactor touches all six local adapters. If any adapter had subtle SSH-specific behaviour that wasn't captured in tests, it could regress. Mitigation: each adapter's `execute.remote.test.ts` was extended. - `paperclipApiUrl` removal from `SshRemoteExecutionSpec` is a breaking type change for any internal consumer. Verified no external plugins consume this type. - The new `CommandManagedRuntimeRunner` shape is a public surface in `@paperclipai/adapter-utils`; downstream plugins implementing custom runners may need updates, but no such plugins exist in this repo. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 12:43:52 -07:00
expect(call?.[3].env.PAPERCLIP_API_URL).toBe("http://127.0.0.1:4310");
expect(call?.[3].env.PAPERCLIP_API_BRIDGE_MODE).toBe("queue_v1");
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
expect(call?.[3].remoteExecution?.remoteCwd).toBe("/remote/workspace");
Migrate SSH environment callback to bridge (#5116) > **Stacked PR (part 3 of 7).** Depends on: - PR #5114 - PR #5115 > Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents executing on a remote SSH-backed environment need a way to call back into > the Paperclip control plane (run events, log streaming, signals) > - When the SSH host can't reach the Paperclip host (NAT, firewalls, or simply not > on the same network), the run silently fails or hangs — a recurring class of > failure during SSH testing > - In sandboxed environments we already solved this with a callback bridge that > tunnels back through the existing connection; SSH was the odd one out > - This PR migrates SSH execution to use the same callback bridge, so every > adapter's remote run uses one consistent reverse-channel. Per-adapter SSH glue > is deleted in favour of a shared `CommandManagedRuntimeRunner` built from the > SSH spec > - The benefit is fewer SSH-specific failure modes, a smaller code surface, and > one place to evolve the callback contract going forward ## What Changed - Added `createSshCommandManagedRuntimeRunner` in `packages/adapter-utils/src/ssh.ts` that adapts an SSH spec into a generic command-managed-runtime runner (with cwd, env, and timeout handling) - Removed `paperclipApiUrl` from `SshRemoteExecutionSpec`; the bridge URL now flows through the shared runner - Reworked `execution-target.ts` to use the SSH runner alongside sandbox runners via a unified `CommandManagedRuntimeRunner` interface - Simplified `remote-managed-runtime.ts` and `sandbox-managed-runtime.ts` to consume the shared runner abstraction - Deleted per-adapter SSH callback wiring from claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local execute.ts files - Removed `environment-runtime-driver-contract.test.ts` (the contract is now enforced by `environment-execution-target.test.ts`) - Added/updated `execute.remote.test.ts` cases for each adapter to cover the SSH runner path ## Verification - `pnpm --filter @paperclipai/adapter-utils test` - `pnpm test -- execute.remote` (covers all six local adapters' SSH paths) - Manual QA: ran a claude-local agent against an SSH-backed environment, confirmed the agent successfully called back to `/api/agent-callback/*` endpoints during the run ## Risks - Refactor touches all six local adapters. If any adapter had subtle SSH-specific behaviour that wasn't captured in tests, it could regress. Mitigation: each adapter's `execute.remote.test.ts` was extended. - `paperclipApiUrl` removal from `SshRemoteExecutionSpec` is a breaking type change for any internal consumer. Verified no external plugins consume this type. - The new `CommandManagedRuntimeRunner` shape is a public surface in `@paperclipai/adapter-utils`; downstream plugins implementing custom runners may need updates, but no such plugins exist in this repo. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 12:43:52 -07:00
expect(startAdapterExecutionTargetPaperclipBridge).toHaveBeenCalledTimes(1);
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
expect(restoreWorkspaceFromSshExecution).toHaveBeenCalledTimes(1);
});
it("resumes saved Cursor sessions for remote SSH execution only when the identity matches", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-cursor-remote-resume-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
await mkdir(workspaceDir, { recursive: true });
await execute({
runId: "run-ssh-resume",
agent: {
id: "agent-1",
companyId: "company-1",
name: "Cursor Builder",
adapterType: "cursor",
adapterConfig: {},
},
runtime: {
sessionId: "session-123",
sessionParams: {
sessionId: "session-123",
cwd: "/remote/workspace",
remoteExecution: {
transport: "ssh",
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteCwd: "/remote/workspace",
},
},
sessionDisplayId: "session-123",
taskKey: null,
},
config: {
command: "agent",
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
},
executionTransport: {
remoteExecution: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
},
},
onLog: async () => {},
});
const call = runChildProcess.mock.calls[0] as unknown as [string, string, string[]] | undefined;
expect(call?.[2]).toContain("--resume");
expect(call?.[2]).toContain("session-123");
});
it("restores the remote workspace if skills sync fails after workspace prep", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-cursor-remote-sync-fail-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
await mkdir(workspaceDir, { recursive: true });
syncDirectoryToSsh.mockRejectedValueOnce(new Error("sync failed"));
await expect(execute({
runId: "run-sync-fail",
agent: {
id: "agent-1",
companyId: "company-1",
name: "Cursor Builder",
adapterType: "cursor",
adapterConfig: {},
},
runtime: {
sessionId: null,
sessionParams: null,
sessionDisplayId: null,
taskKey: null,
},
config: {
command: "agent",
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
},
executionTransport: {
remoteExecution: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
},
},
onLog: async () => {},
})).rejects.toThrow("sync failed");
expect(prepareWorkspaceForSshExecution).toHaveBeenCalledTimes(1);
expect(restoreWorkspaceFromSshExecution).toHaveBeenCalledTimes(1);
expect(runChildProcess).not.toHaveBeenCalled();
});
});