paperclip/packages/adapters/cursor-local/src/server/execute.remote.test.ts

320 lines
10 KiB
TypeScript

Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. 
- Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. 
## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
import { mkdir, mkdtemp, rm } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it, vi } from "vitest";
const {
  runChildProcess,
  ensureCommandResolvable,
  resolveCommandForLogs,
  prepareWorkspaceForSshExecution,
  restoreWorkspaceFromSshExecution,
  runSshCommand,
  syncDirectoryToSsh,
Migrate SSH environment callback to bridge (#5116) > **Stacked PR (part 3 of 7).** Depends on: - PR #5114 - PR #5115 > Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents executing on a remote SSH-backed environment need a way to call back into > the Paperclip control plane (run events, log streaming, signals) > - When the SSH host can't reach the Paperclip host (NAT, firewalls, or simply not > on the same network), the run silently fails or hangs — a recurring class of > failure during SSH testing > - In sandboxed environments we already solved this with a callback bridge that > tunnels back through the existing connection; SSH was the odd one out > - This PR migrates SSH execution to use the same callback bridge, so every > adapter's remote run uses one consistent reverse-channel. Per-adapter SSH glue > is deleted in favour of a shared `CommandManagedRuntimeRunner` built from the > SSH spec > - The benefit is fewer SSH-specific failure modes, a smaller code surface, and > one place to evolve the callback contract going forward ## What Changed - Added `createSshCommandManagedRuntimeRunner` in `packages/adapter-utils/src/ssh.ts` that adapts an SSH spec into a generic command-managed-runtime runner (with cwd, env, and timeout handling) - Removed `paperclipApiUrl` from `SshRemoteExecutionSpec`; the bridge URL now flows through the shared runner - Reworked `execution-target.ts` to use the SSH runner alongside sandbox runners via a unified `CommandManagedRuntimeRunner` interface - Simplified `remote-managed-runtime.ts` and `sandbox-managed-runtime.ts` to consume the shared runner abstraction - Deleted per-adapter SSH callback wiring from claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local execute.ts files - Removed `environment-runtime-driver-contract.test.ts` (the contract is now enforced 
by `environment-execution-target.test.ts`) - Added/updated `execute.remote.test.ts` cases for each adapter to cover the SSH runner path ## Verification - `pnpm --filter @paperclipai/adapter-utils test` - `pnpm test -- execute.remote` (covers all six local adapters' SSH paths) - Manual QA: ran a claude-local agent against an SSH-backed environment, confirmed the agent successfully called back to `/api/agent-callback/*` endpoints during the run ## Risks - Refactor touches all six local adapters. If any adapter had subtle SSH-specific behaviour that wasn't captured in tests, it could regress. Mitigation: each adapter's `execute.remote.test.ts` was extended. - `paperclipApiUrl` removal from `SshRemoteExecutionSpec` is a breaking type change for any internal consumer. Verified no external plugins consume this type. - The new `CommandManagedRuntimeRunner` shape is a public surface in `@paperclipai/adapter-utils`; downstream plugins implementing custom runners may need updates, but no such plugins exist in this repo. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 12:43:52 -07:00
  startAdapterExecutionTargetPaperclipBridge,
} = vi.hoisted(() => ({
  runChildProcess: vi.fn(async () => ({
    exitCode: 0,
    signal: null,
    timedOut: false,
    stdout: [
      JSON.stringify({ type: "system", session_id: "cursor-session-1" }),
      JSON.stringify({ type: "assistant", text: "hello" }),
      JSON.stringify({ type: "result", is_error: false, result: "hello", session_id: "cursor-session-1" }),
    ].join("\n"),
    stderr: "",
    pid: 123,
    startedAt: new Date().toISOString(),
  })),
  ensureCommandResolvable: vi.fn(async () => undefined),
  resolveCommandForLogs: vi.fn(async () => "ssh://fixture@127.0.0.1:2222/remote/workspace :: agent"),
Harden remote workspace sync and restore flows (#5444) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - When an agent runs against a remote target, Paperclip syncs the workspace out to the remote at run start and restores changes back to the local workspace at run end > - The previous restore flow naïvely overwrote local files with whatever the remote returned, so files that the remote run never touched but had timestamp/mode drift could be needlessly rewritten — and a single static `refs/paperclip/ssh-sync/imported` ref made concurrent SSH workspace exports race on the same git ref > - This pull request adds a `workspace-restore-merge` module that diffs a pre-run snapshot against the post-run remote state and only writes back files the remote actually changed; SSH workspace exports now use a per-import unique ref so concurrent runs can't trample each other > - Every adapter's execute path threads the snapshot through `prepareAdapterExecutionTargetRuntime` so the merge has the baseline it needs > - The benefit is workspace restores no longer churn untouched files, and concurrent SSH runs no longer collide on the import ref ## What Changed - `packages/adapter-utils/src/workspace-restore-merge.{ts,test.ts}`: new module — directory snapshot (kind/mode/sha256/symlink target) plus snapshot-aware merge that writes only the files the remote changed - `packages/adapter-utils/src/ssh.ts`: SSH workspace export uses a per-import unique ref (`refs/paperclip/ssh-sync/imported/<uuid>`); restore goes through the new merge helper; `ssh-fixture.test.ts` covers the unique-ref + merge paths - `packages/adapter-utils/src/sandbox-managed-runtime.ts` + `remote-managed-runtime.ts`: thread the snapshot/merge through the sandbox and SSH paths - `packages/adapter-utils/src/server-utils.{ts,test.ts}` + `execution-target.ts`: helpers for capturing the pre-run snapshot; `prepareAdapterExecutionTargetRuntime` gains required `runId` and optional 
`workspaceRemoteDir`, and returns the realized `workspaceRemoteDir` - Each adapter's `execute.ts` (acpx, claude, codex, cursor, gemini, opencode, pi) takes the snapshot at run start and passes it through to the runtime restore - Remote execute test mocks updated to match the new `prepareWorkspaceForSshExecution` return shape and the per-run `${managedRemoteWorkspace}` cwd subdirectory ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils --project @paperclipai/adapter-acpx-local --project @paperclipai/adapter-claude-local --project @paperclipai/adapter-codex-local --project @paperclipai/adapter-cursor-local --project @paperclipai/adapter-gemini-local --project @paperclipai/adapter-opencode-local --project @paperclipai/adapter-pi-local` — 196/196 passing - `pnpm typecheck` clean across the workspace ## Risks Medium. The restore path now writes a strict subset of what it previously did — files the remote did not touch are no longer rewritten. If any flow was relying on a touch-without-content-change being copied back (timestamp or permission propagation only), that behavior is now skipped. Snapshot capture adds an O(N-files-in-workspace) hash pass at run start; the cost is bounded by the existing exclude list. The `runId` parameter on `prepareAdapterExecutionTargetRuntime` is now required — every in-tree caller is updated; out-of-tree adapter authors need to pass it. 
## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — new module + every adapter execute path covered - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-07 14:44:45 -07:00
  prepareWorkspaceForSshExecution: vi.fn(async () => ({ gitBacked: false })),
  restoreWorkspaceFromSshExecution: vi.fn(async () => undefined),
  runSshCommand: vi.fn(async () => ({
    stdout: "/home/agent",
    stderr: "",
    exitCode: 0,
  })),
  syncDirectoryToSsh: vi.fn(async () => undefined),
  startAdapterExecutionTargetPaperclipBridge: vi.fn(async () => ({
    env: {
      PAPERCLIP_API_URL: "http://127.0.0.1:4310",
      PAPERCLIP_API_KEY: "bridge-token",
      PAPERCLIP_API_BRIDGE_MODE: "queue_v1",
    },
    stop: async () => {},
  })),
}));
vi.mock("@paperclipai/adapter-utils/server-utils", async () => {
  const actual = await vi.importActual<typeof import("@paperclipai/adapter-utils/server-utils")>(
    "@paperclipai/adapter-utils/server-utils",
  );
  return {
    ...actual,
    ensureCommandResolvable,
    resolveCommandForLogs,
    runChildProcess,
  };
});
vi.mock("@paperclipai/adapter-utils/ssh", async () => {
  const actual = await vi.importActual<typeof import("@paperclipai/adapter-utils/ssh")>(
    "@paperclipai/adapter-utils/ssh",
  );
  return {
    ...actual,
    prepareWorkspaceForSshExecution,
    restoreWorkspaceFromSshExecution,
    runSshCommand,
    syncDirectoryToSsh,
  };
});
Migrate SSH environment callback to bridge (#5116)

> **Stacked PR (part 3 of 7).** Depends on:
> - PR #5114
> - PR #5115
>
> Diff against `master` includes commits from earlier PRs in the stack; the new commit in this PR is the topmost one.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents executing on a remote SSH-backed environment need a way to call back into the Paperclip control plane (run events, log streaming, signals)
> - When the SSH host can't reach the Paperclip host (NAT, firewalls, or simply not on the same network), the run silently fails or hangs, a recurring class of failure during SSH testing
> - In sandboxed environments we already solved this with a callback bridge that tunnels back through the existing connection; SSH was the odd one out
> - This PR migrates SSH execution to use the same callback bridge, so every adapter's remote run uses one consistent reverse-channel. Per-adapter SSH glue is deleted in favour of a shared `CommandManagedRuntimeRunner` built from the SSH spec
> - The benefit is fewer SSH-specific failure modes, a smaller code surface, and one place to evolve the callback contract going forward

## What Changed

- Added `createSshCommandManagedRuntimeRunner` in `packages/adapter-utils/src/ssh.ts` that adapts an SSH spec into a generic command-managed-runtime runner (with cwd, env, and timeout handling)
- Removed `paperclipApiUrl` from `SshRemoteExecutionSpec`; the bridge URL now flows through the shared runner
- Reworked `execution-target.ts` to use the SSH runner alongside sandbox runners via a unified `CommandManagedRuntimeRunner` interface
- Simplified `remote-managed-runtime.ts` and `sandbox-managed-runtime.ts` to consume the shared runner abstraction
- Deleted per-adapter SSH callback wiring from claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local execute.ts files
- Removed `environment-runtime-driver-contract.test.ts` (the contract is now enforced by `environment-execution-target.test.ts`)
- Added/updated `execute.remote.test.ts` cases for each adapter to cover the SSH runner path

## Verification

- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm test -- execute.remote` (covers all six local adapters' SSH paths)
- Manual QA: ran a claude-local agent against an SSH-backed environment, confirmed the agent successfully called back to `/api/agent-callback/*` endpoints during the run

## Risks

- Refactor touches all six local adapters. If any adapter had subtle SSH-specific behaviour that wasn't captured in tests, it could regress. Mitigation: each adapter's `execute.remote.test.ts` was extended.
- `paperclipApiUrl` removal from `SshRemoteExecutionSpec` is a breaking type change for any internal consumer. Verified no external plugins consume this type.
- The new `CommandManagedRuntimeRunner` shape is a public surface in `@paperclipai/adapter-utils`; downstream plugins implementing custom runners may need updates, but no such plugins exist in this repo.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after screenshots (N/A)
- [ ] I have updated relevant documentation to reflect my changes (N/A)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 12:43:52 -07:00
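The commit message above describes adapting an SSH spec into a generic runner with cwd, env, and timeout handling. A minimal sketch of that idea follows, assuming a transport callback that executes one shell string on the remote host. Every name here (`SketchRunner`, `SketchTransport`, `buildRemoteShellCommand`, `createSketchRunner`) is hypothetical and not the actual `CommandManagedRuntimeRunner` API, whose shape is not shown in this file.

```typescript
// Hypothetical shapes for illustration only; the real types in
// @paperclipai/adapter-utils may look quite different.
interface SketchRunOptions {
  cwd?: string;
  env?: Record<string, string>;
  timeoutMs?: number;
}

interface SketchRunResult {
  exitCode: number;
  stdout: string;
  timedOut: boolean;
}

interface SketchRunner {
  run(command: string, args: string[], options?: SketchRunOptions): Promise<SketchRunResult>;
}

// Assumed transport: sends one shell string over an existing SSH connection.
type SketchTransport = (shellCommand: string, timeoutMs?: number) => Promise<SketchRunResult>;

// POSIX single-quote escaping: close the quote, emit an escaped quote, reopen.
function shellQuote(value: string): string {
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

// Folds cwd and env into a single shell string so the transport stays generic.
// Assumes env keys are already valid shell identifiers.
function buildRemoteShellCommand(
  command: string,
  args: string[],
  options: SketchRunOptions = {},
): string {
  const parts: string[] = [];
  if (options.cwd) parts.push(`cd ${shellQuote(options.cwd)} &&`);
  const envEntries = Object.entries(options.env ?? {});
  if (envEntries.length > 0) {
    parts.push("env", ...envEntries.map(([key, value]) => `${key}=${shellQuote(value)}`));
  }
  parts.push(shellQuote(command), ...args.map(shellQuote));
  return parts.join(" ");
}

// The runner only shapes the command; timeout enforcement is left to the transport.
function createSketchRunner(transport: SketchTransport): SketchRunner {
  return {
    run: (command, args, options = {}) =>
      transport(buildRemoteShellCommand(command, args, options), options.timeoutMs),
  };
}
```

Keeping the command shaping in one pure function is what would let SSH and sandbox targets share a runner interface: only the transport differs.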
vi.mock("@paperclipai/adapter-utils/execution-target", async () => {
  const actual = await vi.importActual<typeof import("@paperclipai/adapter-utils/execution-target")>(
    "@paperclipai/adapter-utils/execution-target",
  );
  return {
    ...actual,
    startAdapterExecutionTargetPaperclipBridge,
  };
});
import { execute } from "./execute.js";

describe("cursor remote execution", () => {
  const cleanupDirs: string[] = [];

  afterEach(async () => {
    vi.clearAllMocks();
    while (cleanupDirs.length > 0) {
      const dir = cleanupDirs.pop();
      if (!dir) continue;
      await rm(dir, { recursive: true, force: true }).catch(() => undefined);
    }
  });

  it("prepares the workspace, syncs Cursor skills, and restores workspace changes for remote SSH execution", async () => {
    const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-cursor-remote-"));
    cleanupDirs.push(rootDir);
    const workspaceDir = path.join(rootDir, "workspace");
Fix remote workspace environment shaping (#5118)

> **Stacked PR (part 5 of 7).** Depends on:
> - PR #5114
> - PR #5115
> - PR #5116
> - PR #5117
>
> Diff against `master` includes commits from earlier PRs in the stack; the new commit in this PR is the topmost one.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run with a Paperclip-shaped environment (`PAPERCLIP_WORKSPACE_CWD`, worktree path, `PAPERCLIP_WORKSPACES_JSON` hints) so the CLI can locate the correct project tree
> - SSH testing reproduced a real failure: a Codex SSH run wrote to `/tmp/paperclip-env-matrix-...` (the *host* path) instead of the realized remote workspace at `/home/<user>/paperclip-env-matrix-ssh-claude/...` because the adapter injected `PAPERCLIP_WORKSPACE_CWD=/tmp/...` into the remote env
> - Code review on the initial codex-only fix asked to roll the same approach into every other SSH-capable adapter (claude, acpx, cursor, opencode, gemini, pi) via a shared helper rather than duplicating per-adapter
> - This PR adds `shapePaperclipWorkspaceEnvForExecution` in adapter-utils that, when the execution target is remote: replaces local cwd with the realized execution cwd, nulls out worktree path (which has no remote meaning), and rewrites/strips `cwd` entries in workspace hints based on what was actually synced. Every adapter calls it before invoking the remote runner
> - The benefit is that remote runs see the realized remote workspace, host-local paths stop leaking into remote env, and the rule is unit-tested in one place

## What Changed

- Added `shapePaperclipWorkspaceEnvForExecution` to `packages/adapter-utils/src/server-utils.ts` with full unit coverage (`server-utils.test.ts`)
- Each of acpx-local, claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local now calls the new shaper before issuing the remote command and feeds the shaped values into `applyPaperclipWorkspaceEnv`
- Per-adapter `execute.remote.test.ts` files extended to cover the new shaping behaviour: localhost paths replaced with remote cwd, foreign-cwd hints stripped, worktree path nulled out for remote targets
- `acpx-local/src/server/execute.test.ts` extended with shaping coverage

## Verification

- `pnpm test -- server-utils execute.remote`
- `pnpm --filter @paperclipai/adapter-acpx-local test`
- Manual QA reproducing the original failure:
  1. Provision an E2B sandbox environment for the Paperclip QA company
  2. Assign an issue to a remote-targeted claude-local agent and confirm the run starts in the correct remote cwd (no `/Users/...` path leakage in the run logs)
  3. Repeat for opencode-local and pi-local

## Risks

- Behavioural shift: hints whose `cwd` doesn't match the workspace cwd are now stripped on remote targets. If any adapter relied on a leaked local hint cwd, it will see a missing `cwd` instead. Reviewed all current callers; none do.
- Adds a small per-run cost (path resolve + string normalisation) on every remote execution. Negligible.
- Worktree path is now nulled out on remote (it has no meaning there). Adapters that previously read the value defensively will continue to work.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after screenshots (N/A)
- [ ] I have updated relevant documentation to reflect my changes (N/A)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 13:17:52 -07:00
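The shaping rule described in the commit message above can be sketched as a small pure function. This is an illustration under stated assumptions, not the real `shapePaperclipWorkspaceEnvForExecution`: the input and output shapes, the `shapeWorkspaceEnvSketch` name, and the trailing-slash normalisation (a stand-in for the path resolution the PR mentions) are all hypothetical.

```typescript
// Hypothetical shapes; the real helper in adapter-utils may take different arguments.
interface SketchWorkspaceHint {
  workspaceId: string;
  cwd?: string;
}

interface SketchWorkspaceEnvInput {
  isRemote: boolean;
  localCwd: string;
  executionCwd: string; // realized cwd on the execution target
  worktreePath: string | null;
  hints: SketchWorkspaceHint[];
}

interface SketchShapedWorkspaceEnv {
  cwd: string;
  worktreePath: string | null;
  hints: SketchWorkspaceHint[];
}

// Simplified normalisation: only trims trailing slashes.
function normalizePath(value: string): string {
  return value.replace(/\/+$/, "") || "/";
}

function shapeWorkspaceEnvSketch(input: SketchWorkspaceEnvInput): SketchShapedWorkspaceEnv {
  // Local targets pass through untouched.
  if (!input.isRemote) {
    return { cwd: input.localCwd, worktreePath: input.worktreePath, hints: input.hints };
  }
  const hints = input.hints.map((hint): SketchWorkspaceHint => {
    if (hint.cwd === undefined) return hint;
    // A hint pointing at the synced workspace is rewritten to the remote cwd.
    if (normalizePath(hint.cwd) === normalizePath(input.localCwd)) {
      return { ...hint, cwd: input.executionCwd };
    }
    // Any other host-local cwd was not synced, so it is stripped.
    return { workspaceId: hint.workspaceId };
  });
  // The worktree path has no meaning on the remote host.
  return { cwd: input.executionCwd, worktreePath: null, hints };
}
```

The test in this file sets up both `workspaceDir` and `alternateWorkspaceDir`, which matches the two hint cases in the sketch: one cwd that maps to the remote workspace and one that does not.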
    const alternateWorkspaceDir = path.join(rootDir, "workspace-other");
    await mkdir(workspaceDir, { recursive: true });
    await mkdir(alternateWorkspaceDir, { recursive: true });
Stabilize Cursor sandbox runtime resolution (#5446)

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The Cursor adapter spawns the Cursor CLI against local, SSH, and sandbox execution targets; on a fresh sandbox lease, it has to resolve where Cursor was installed
> - The previous resolver only looked for `~/.local/bin/cursor-agent` even though the official installer (and the adapter's own `SANDBOX_INSTALL_COMMAND`) sometimes lays the binary down as `~/.local/bin/agent`, so a sandbox where the install ran successfully would still fail to find the CLI
> - This pull request lets the resolver accept either basename and lets the caller pass an optional `remoteSystemHomeDirHint` so a probe doesn't pay the cost of a remote `printf $HOME` round-trip when the home directory is already known
> - The benefit is sandboxed Cursor runs find the binary that the install actually produced, and runtime probes are cheaper when the home dir is already resolved

## What Changed

- `packages/adapters/cursor-local/src/server/remote-command.ts`: accept either `agent` or `cursor-agent` as the preferred basename; new optional `remoteSystemHomeDirHint` short-circuits the home-dir probe
- `packages/adapters/cursor-local/src/server/execute.ts`: thread the home-dir hint through, prefer the resolved binary path, and shift the effective execution cwd to the per-run managed subdirectory once the runtime is prepared
- New `remote-command.test.ts` and `execute.test.ts` cover both basenames, the hint short-circuit, and the cwd shift
- `packages/adapters/cursor-local/src/index.ts`: update doc string to reflect the broader resolution
- `execute.remote.test.ts` updated to expect the managed-subdirectory cwd shape introduced by the cwd shift

## Verification

- `pnpm vitest run --no-coverage --project @paperclipai/adapter-cursor-local`: 6/6 passing
- `pnpm typecheck` clean
- Manual: a fresh sandbox lease with `npm install -g …`-installed Cursor (binary lands as `~/.local/bin/agent`) now runs cleanly through the adapter

## Risks

Low. Resolver is strictly broader (matches a superset of paths); existing setups with `~/.local/bin/cursor-agent` continue to work. The home-dir hint is opt-in; callers that don't pass it get the existing probe behavior. Cursor's effective execution cwd now matches the rest of the adapters (per-run managed subdirectory); sessions previously rooted at the workspace root will land in the new subdirectory.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable: new tests cover both basenames + hint short-circuit + cwd shift
- [x] If this change affects the UI, I have included before/after screenshots: N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge

---

> **Stacked PR.** Sits on top of #5445 (which sits on #5444). Cumulative diff against `master` includes both of those PRs' content; the files touched by *this* PR's commit are listed under "What Changed" above. Will rebase onto `master` and force-push once the prerequisite PRs merge.
2026-05-07 15:00:28 -07:00
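The resolution behaviour described in the commit message above (either basename accepted, with an optional home-directory hint that skips the remote probe) might look roughly like the sketch below. The function and callback names (`candidateCursorPaths`, `resolveCursorBinarySketch`, `probeHomeDir`, `fileExists`) are hypothetical, and trying `agent` before `cursor-agent` is an assumption; only `remoteSystemHomeDirHint` and the two basenames come from the message itself.

```typescript
// Basenames named in the commit message; the preference order is assumed.
const SKETCH_BASENAMES = ["agent", "cursor-agent"] as const;

interface SketchResolveOptions {
  // When provided, the home-dir probe is skipped entirely.
  remoteSystemHomeDirHint?: string;
  // Hypothetical callbacks standing in for remote round-trips.
  probeHomeDir: () => Promise<string>;
  fileExists: (remotePath: string) => Promise<boolean>;
}

// Pure helper: lists the candidate install paths under ~/.local/bin.
function candidateCursorPaths(homeDir: string): string[] {
  const base = homeDir.replace(/\/+$/, "");
  return SKETCH_BASENAMES.map((basename) => `${base}/.local/bin/${basename}`);
}

async function resolveCursorBinarySketch(options: SketchResolveOptions): Promise<string | null> {
  // The hint short-circuits the probe; otherwise one remote round-trip is paid.
  const homeDir = options.remoteSystemHomeDirHint ?? (await options.probeHomeDir());
  for (const candidate of candidateCursorPaths(homeDir)) {
    if (await options.fileExists(candidate)) return candidate;
  }
  return null;
}
```

A caller that already knows the remote home directory would pass it as the hint, so `probeHomeDir` is never invoked for that run.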
    const managedRemoteWorkspace = "/remote/workspace/.paperclip-runtime/runs/run-1/workspace";
    const result = await execute({
      runId: "run-1",
      agent: {
        id: "agent-1",
        companyId: "company-1",
        name: "Cursor Builder",
        adapterType: "cursor",
        adapterConfig: {},
      },
      runtime: {
        sessionId: null,
        sessionParams: null,
        sessionDisplayId: null,
        taskKey: null,
      },
      config: {
        command: "agent",
      },
      context: {
        paperclipWorkspace: {
          cwd: workspaceDir,
          source: "project_primary",
        },
Fix remote workspace environment shaping (#5118) > **Stacked PR (part 5 of 7).** Depends on: - PR #5114 - PR #5115 - PR #5116 - PR #5117 > Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents run with a Paperclip-shaped environment (`PAPERCLIP_WORKSPACE_CWD`, > worktree path, `PAPERCLIP_WORKSPACES_JSON` hints) so the CLI can locate the > correct project tree > - SSH testing reproduced a real failure: a Codex SSH run wrote to > `/tmp/paperclip-env-matrix-...` (the *host* path) instead of the realized > remote workspace at `/home/<user>/paperclip-env-matrix-ssh-claude/...` > because the adapter injected `PAPERCLIP_WORKSPACE_CWD=/tmp/...` into the > remote env > - Code review on the initial codex-only fix asked to roll the same approach > into every other SSH-capable adapter (claude, acpx, cursor, opencode, gemini, > pi) via a shared helper rather than duplicating per-adapter > - This PR adds `shapePaperclipWorkspaceEnvForExecution` in adapter-utils that, > when the execution target is remote: replaces local cwd with the realized > execution cwd, nulls out worktree path (which has no remote meaning), and > rewrites/strips `cwd` entries in workspace hints based on what was actually > synced. 
Every adapter calls it before invoking the remote runner > - The benefit is that remote runs see the realized remote workspace, host-local > paths stop leaking into remote env, and the rule is unit-tested in one place ## What Changed - Added `shapePaperclipWorkspaceEnvForExecution` to `packages/adapter-utils/src/server-utils.ts` with full unit coverage (`server-utils.test.ts`) - Each of acpx-local, claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local now calls the new shaper before issuing the remote command and feeds the shaped values into `applyPaperclipWorkspaceEnv` - Per-adapter `execute.remote.test.ts` files extended to cover the new shaping behaviour: localhost paths replaced with remote cwd, foreign-cwd hints stripped, worktree path nulled out for remote targets - `acpx-local/src/server/execute.test.ts` extended with shaping coverage ## Verification - `pnpm test -- server-utils execute.remote` - `pnpm --filter @paperclipai/adapter-acpx-local test` - Manual QA reproducing the original failure: 1. Provision an E2B sandbox environment for the Paperclip QA company 2. Assign an issue to a remote-targeted claude-local agent and confirm the run starts in the correct remote cwd (no `/Users/...` path leakage in the run logs) 3. Repeat for opencode-local and pi-local ## Risks - Behavioural shift: hints whose `cwd` doesn't match the workspace cwd are now stripped on remote targets. If any adapter relied on a leaked local hint cwd, it will see a missing `cwd` instead. Reviewed all current callers — none do. - Adds a small per-run cost (path resolve + string normalisation) on every remote execution. Negligible. - Worktree path is now nulled out on remote (it has no meaning there). Adapters that previously read the value defensively will continue to work. 
## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 13:17:52 -07:00
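The shaping rule described in the PR above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual `shapePaperclipWorkspaceEnvForExecution` implementation: the function name `shapeWorkspaceEnvSketch`, the `ShapeInput`/`ShapedEnv` types, and the use of a `null` remote cwd to signal a local target are all hypothetical. It only mirrors the three behaviours the description names: swap the local cwd for the realized remote cwd, null out the worktree path, and rewrite or strip hint `cwd` entries.

```typescript
// Hypothetical shapes for illustration; the real types in adapter-utils may differ.
interface WorkspaceHint {
  workspaceId: string;
  cwd?: string;
  repoUrl?: string;
  repoRef?: string;
}

interface ShapeInput {
  workspaceCwd: string | null;
  worktreePath: string | null;
  hints: WorkspaceHint[];
  // Realized remote cwd; null is assumed here to mean a local execution target.
  remoteCwd: string | null;
}

interface ShapedEnv {
  workspaceCwd: string | null;
  worktreePath: string | null;
  hints: WorkspaceHint[];
}

function shapeWorkspaceEnvSketch(input: ShapeInput): ShapedEnv {
  const { workspaceCwd, worktreePath, hints, remoteCwd } = input;

  // Local targets keep their host paths untouched.
  if (remoteCwd === null) {
    return { workspaceCwd, worktreePath, hints };
  }

  const shapedHints: WorkspaceHint[] = hints.map((hint) => {
    const { cwd, ...rest } = hint;
    // A hint pointing at the synced workspace is rewritten to the remote cwd.
    if (cwd !== undefined && cwd === workspaceCwd) {
      return { ...rest, cwd: remoteCwd };
    }
    // Any other cwd is a host-local path with no remote meaning, so it is dropped.
    return rest;
  });

  return {
    workspaceCwd: remoteCwd, // host cwd replaced by the realized remote cwd
    worktreePath: null, // worktree path has no meaning on the remote
    hints: shapedHints,
  };
}
```

In terms of the fixture in this test, a hint with `cwd: workspaceDir` would come out with the remote cwd, while the `alternateWorkspaceDir` hint would keep its `workspaceId`, `repoUrl`, and `repoRef` but lose its `cwd`.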
        paperclipWorkspaces: [
          {
            workspaceId: "workspace-1",
            cwd: workspaceDir,
            repoUrl: "https://github.com/paperclipai/paperclip.git",
            repoRef: "main",
          },
          {
            workspaceId: "workspace-2",
            cwd: alternateWorkspaceDir,
            repoUrl: "https://github.com/paperclipai/paperclip.git",
            repoRef: "feature/other",
          },
        ],
      },
      executionTransport: {
        remoteExecution: {
          host: "127.0.0.1",
          port: 2222,
          username: "fixture",
          remoteWorkspacePath: "/remote/workspace",
          remoteCwd: "/remote/workspace",
          privateKey: "PRIVATE KEY",
          knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
          strictHostKeyChecking: true,
        },
      },
      onLog: async () => {},
    });
    expect(result.sessionParams).toMatchObject({
      sessionId: "cursor-session-1",
Stabilize Cursor sandbox runtime resolution (#5446) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The Cursor adapter spawns the Cursor CLI against local, SSH, and sandbox execution targets; on a fresh sandbox lease, it has to resolve where Cursor was installed > - The previous resolver only looked for `~/.local/bin/cursor-agent` even though the official installer (and the adapter's own `SANDBOX_INSTALL_COMMAND`) sometimes lays the binary down as `~/.local/bin/agent`, so a sandbox where the install ran successfully would still fail to find the CLI > - This pull request lets the resolver accept either basename and lets the caller pass an optional `remoteSystemHomeDirHint` so a probe doesn't pay the cost of a remote `printf $HOME` round-trip when the home directory is already known > - The benefit is sandboxed Cursor runs find the binary that the install actually produced, and runtime probes are cheaper when the home dir is already resolved ## What Changed - `packages/adapters/cursor-local/src/server/remote-command.ts`: accept either `agent` or `cursor-agent` as the preferred basename; new optional `remoteSystemHomeDirHint` short-circuits the home-dir probe - `packages/adapters/cursor-local/src/server/execute.ts`: thread the home-dir hint through, prefer the resolved binary path, and shift the effective execution cwd to the per-run managed subdirectory once the runtime is prepared - New `remote-command.test.ts` and `execute.test.ts` cover both basenames, the hint short-circuit, and the cwd shift - `packages/adapters/cursor-local/src/index.ts`: update doc string to reflect the broader resolution - `execute.remote.test.ts` updated to expect the managed-subdirectory cwd shape introduced by the cwd shift ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-cursor-local` — 6/6 passing - `pnpm typecheck` clean - Manual: a fresh sandbox lease with `npm install -g …`-installed Cursor (binary lands as 
`~/.local/bin/agent`) now runs cleanly through the adapter ## Risks Low. Resolver is strictly broader (matches a superset of paths); existing setups with `~/.local/bin/cursor-agent` continue to work. The home-dir hint is opt-in; callers that don't pass it get the existing probe behavior. Cursor's effective execution cwd now matches the rest of the adapters (per-run managed subdirectory) — sessions previously rooted at the workspace root will land in the new subdirectory. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — new tests cover both basenames + hint short-circuit + cwd shift - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --- > **Stacked PR.** Sits on top of #5445 (which sits on #5444). Cumulative diff against `master` includes both of those PRs' content; the files touched by *this* PR's commit are listed under "What Changed" above. Will rebase onto `master` and force-push once the prerequisite PRs merge.
2026-05-07 15:00:28 -07:00
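The resolver change described in the PR above (accept either basename, and skip the home-directory probe when a hint is supplied) might look roughly like this. It is a sketch under assumptions, not the code in `remote-command.ts`: the `RemoteProbe` interface, the function name, and the candidate order (`agent` before `cursor-agent`) are hypothetical, and the probes are synchronous only to keep the example short, whereas a real remote probe would presumably be an asynchronous round-trip.

```typescript
// Hypothetical probe interface; in the adapter these would be remote calls.
interface RemoteProbe {
  resolveHome: () => string; // stands in for the remote `printf $HOME` round-trip
  exists: (path: string) => boolean; // stands in for a remote file check
}

// Both basenames named in the PR description; the order is an assumption.
const CANDIDATE_BASENAMES = ["agent", "cursor-agent"] as const;

function resolveCursorBinarySketch(
  probe: RemoteProbe,
  homeDirHint?: string | null,
): string | null {
  // A non-empty hint short-circuits the home-directory probe.
  const hinted = homeDirHint?.trim();
  const home = hinted ? hinted : probe.resolveHome();
  const base = home.replace(/\/+$/, ""); // drop trailing slashes

  for (const name of CANDIDATE_BASENAMES) {
    const candidate = `${base}/.local/bin/${name}`;
    if (probe.exists(candidate)) {
      return candidate;
    }
  }
  // Neither basename was found under ~/.local/bin.
  return null;
}
```

Injecting the probe keeps the lookup testable without a live SSH or sandbox target, which is the same reason the test in this file mocks `runSshCommand` and `runChildProcess`.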
      cwd: managedRemoteWorkspace,
      remoteExecution: {
        transport: "ssh",
        host: "127.0.0.1",
        port: 2222,
        username: "fixture",
        remoteCwd: managedRemoteWorkspace,
      },
    });
    expect(prepareWorkspaceForSshExecution).toHaveBeenCalledTimes(1);
    expect(syncDirectoryToSsh).toHaveBeenCalledTimes(1);
    expect(syncDirectoryToSsh).toHaveBeenCalledWith(expect.objectContaining({
      remoteDir: `${managedRemoteWorkspace}/.paperclip-runtime/cursor/skills`,
followSymlinks: true,
}));
expect(runSshCommand).toHaveBeenCalledWith(
expect.anything(),
expect.stringContaining(".cursor/skills"),
expect.anything(),
);
const call = runChildProcess.mock.calls[0] as unknown as
| [string, string, string[], { env: Record<string, string>; remoteExecution?: { remoteCwd: string } | null }]
| undefined;
expect(call?.[2]).toContain("--workspace");
Stabilize Cursor sandbox runtime resolution (#5446) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The Cursor adapter spawns the Cursor CLI against local, SSH, and sandbox execution targets; on a fresh sandbox lease, it has to resolve where Cursor was installed > - The previous resolver only looked for `~/.local/bin/cursor-agent` even though the official installer (and the adapter's own `SANDBOX_INSTALL_COMMAND`) sometimes lays the binary down as `~/.local/bin/agent`, so a sandbox where the install ran successfully would still fail to find the CLI > - This pull request lets the resolver accept either basename and lets the caller pass an optional `remoteSystemHomeDirHint` so a probe doesn't pay the cost of a remote `printf $HOME` round-trip when the home directory is already known > - The benefit is sandboxed Cursor runs find the binary that the install actually produced, and runtime probes are cheaper when the home dir is already resolved ## What Changed - `packages/adapters/cursor-local/src/server/remote-command.ts`: accept either `agent` or `cursor-agent` as the preferred basename; new optional `remoteSystemHomeDirHint` short-circuits the home-dir probe - `packages/adapters/cursor-local/src/server/execute.ts`: thread the home-dir hint through, prefer the resolved binary path, and shift the effective execution cwd to the per-run managed subdirectory once the runtime is prepared - New `remote-command.test.ts` and `execute.test.ts` cover both basenames, the hint short-circuit, and the cwd shift - `packages/adapters/cursor-local/src/index.ts`: update doc string to reflect the broader resolution - `execute.remote.test.ts` updated to expect the managed-subdirectory cwd shape introduced by the cwd shift ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-cursor-local` — 6/6 passing - `pnpm typecheck` clean - Manual: a fresh sandbox lease with `npm install -g …`-installed Cursor (binary lands as 
`~/.local/bin/agent`) now runs cleanly through the adapter ## Risks Low. Resolver is strictly broader (matches a superset of paths); existing setups with `~/.local/bin/cursor-agent` continue to work. The home-dir hint is opt-in; callers that don't pass it get the existing probe behavior. Cursor's effective execution cwd now matches the rest of the adapters (per-run managed subdirectory) — sessions previously rooted at the workspace root will land in the new subdirectory. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — new tests cover both basenames + hint short-circuit + cwd shift - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --- > **Stacked PR.** Sits on top of #5445 (which sits on #5444). Cumulative diff against `master` includes both of those PRs' content; the files touched by *this* PR's commit are listed under "What Changed" above. Will rebase onto `master` and force-push once the prerequisite PRs merge.
2026-05-07 15:00:28 -07:00
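The resolution rule described in the commit note above (accept either `agent` or `cursor-agent`, and skip the home-directory probe when a hint is supplied) can be sketched roughly as follows. This is a minimal illustration, not the code in `remote-command.ts`: the function name `resolveCursorBinarySketch`, the `ResolveOptions` shape, the injected `probeHomeDir` and `fileExists` callbacks, and the order in which the two basenames are tried are all assumptions made for the example.

```typescript
// Hypothetical sketch; names and candidate order are assumptions, not the adapter's API.
const CANDIDATE_BASENAMES = ["cursor-agent", "agent"] as const;

interface ResolveOptions {
  // Optional, already-known remote home directory (stands in for remoteSystemHomeDirHint).
  homeDirHint?: string;
  // Stand-in for the remote home-directory probe round-trip.
  probeHomeDir: () => Promise<string>;
  // Stand-in for a remote "does this file exist" check.
  fileExists: (path: string) => Promise<boolean>;
}

async function resolveCursorBinarySketch(options: ResolveOptions): Promise<string | null> {
  // Only pay for the probe when no hint was supplied.
  const home = options.homeDirHint ?? (await options.probeHomeDir());
  for (const basename of CANDIDATE_BASENAMES) {
    const candidate = `${home}/.local/bin/${basename}`;
    if (await options.fileExists(candidate)) {
      return candidate;
    }
  }
  // Neither basename was found under ~/.local/bin.
  return null;
}
```

Injecting the probe and the existence check keeps the sketch testable without a real SSH or sandbox connection; the actual resolver may structure this differently.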
expect(call?.[2]).toContain(managedRemoteWorkspace);
expect(call?.[3].env.PAPERCLIP_WORKSPACE_CWD).toBe(managedRemoteWorkspace);
Fix remote workspace environment shaping (#5118) > **Stacked PR (part 5 of 7).** Depends on: - PR #5114 - PR #5115 - PR #5116 - PR #5117 > Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents run with a Paperclip-shaped environment (`PAPERCLIP_WORKSPACE_CWD`, > worktree path, `PAPERCLIP_WORKSPACES_JSON` hints) so the CLI can locate the > correct project tree > - SSH testing reproduced a real failure: a Codex SSH run wrote to > `/tmp/paperclip-env-matrix-...` (the *host* path) instead of the realized > remote workspace at `/home/<user>/paperclip-env-matrix-ssh-claude/...` > because the adapter injected `PAPERCLIP_WORKSPACE_CWD=/tmp/...` into the > remote env > - Code review on the initial codex-only fix asked to roll the same approach > into every other SSH-capable adapter (claude, acpx, cursor, opencode, gemini, > pi) via a shared helper rather than duplicating per-adapter > - This PR adds `shapePaperclipWorkspaceEnvForExecution` in adapter-utils that, > when the execution target is remote: replaces local cwd with the realized > execution cwd, nulls out worktree path (which has no remote meaning), and > rewrites/strips `cwd` entries in workspace hints based on what was actually > synced. 
Every adapter calls it before invoking the remote runner > - The benefit is that remote runs see the realized remote workspace, host-local > paths stop leaking into remote env, and the rule is unit-tested in one place ## What Changed - Added `shapePaperclipWorkspaceEnvForExecution` to `packages/adapter-utils/src/server-utils.ts` with full unit coverage (`server-utils.test.ts`) - Each of acpx-local, claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local now calls the new shaper before issuing the remote command and feeds the shaped values into `applyPaperclipWorkspaceEnv` - Per-adapter `execute.remote.test.ts` files extended to cover the new shaping behaviour: localhost paths replaced with remote cwd, foreign-cwd hints stripped, worktree path nulled out for remote targets - `acpx-local/src/server/execute.test.ts` extended with shaping coverage ## Verification - `pnpm test -- server-utils execute.remote` - `pnpm --filter @paperclipai/adapter-acpx-local test` - Manual QA reproducing the original failure: 1. Provision an E2B sandbox environment for the Paperclip QA company 2. Assign an issue to a remote-targeted claude-local agent and confirm the run starts in the correct remote cwd (no `/Users/...` path leakage in the run logs) 3. Repeat for opencode-local and pi-local ## Risks - Behavioural shift: hints whose `cwd` doesn't match the workspace cwd are now stripped on remote targets. If any adapter relied on a leaked local hint cwd, it will see a missing `cwd` instead. Reviewed all current callers — none do. - Adds a small per-run cost (path resolve + string normalisation) on every remote execution. Negligible. - Worktree path is now nulled out on remote (it has no meaning there). Adapters that previously read the value defensively will continue to work. 
## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 13:17:52 -07:00
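The shaping rule described in the commit note above, and exercised by the surrounding assertions, can be sketched as follows. This is a hedged illustration rather than the real `shapePaperclipWorkspaceEnvForExecution`: the function name `shapeWorkspaceEnvSketch`, the interfaces, and the exact matching rule (a hint keeps a `cwd` only when its original `cwd` equals the local workspace cwd) are assumptions inferred from the description.

```typescript
// Hypothetical shapes for illustration; not the real adapter-utils types.
interface WorkspaceHint {
  workspaceId: string;
  cwd?: string;
  repoUrl?: string;
  repoRef?: string;
}

interface WorkspaceEnvInput {
  localCwd: string;
  worktreePath: string | null;
  hints: WorkspaceHint[];
}

interface ExecutionTargetSketch {
  remote: boolean;
  realizedCwd: string;
}

interface ShapedWorkspaceEnv {
  cwd: string;
  worktreePath: string | null;
  hints: WorkspaceHint[];
}

function shapeWorkspaceEnvSketch(
  input: WorkspaceEnvInput,
  target: ExecutionTargetSketch,
): ShapedWorkspaceEnv {
  // Local targets keep their values untouched.
  if (!target.remote) {
    return { cwd: input.localCwd, worktreePath: input.worktreePath, hints: input.hints };
  }
  const hints = input.hints.map((hint): WorkspaceHint => {
    const { cwd, ...rest } = hint;
    // Assumed rule: rewrite the hint that pointed at the synced workspace,
    // and strip cwd from hints that pointed anywhere else on the host.
    return cwd === input.localCwd ? { ...rest, cwd: target.realizedCwd } : rest;
  });
  // The worktree path has no meaning on the remote side, so it is nulled out.
  return { cwd: target.realizedCwd, worktreePath: null, hints };
}
```

Under these assumptions, a hint for the synced workspace ends up with the remote cwd while a hint for a different checkout loses its `cwd` entirely, which matches the two-entry `PAPERCLIP_WORKSPACES_JSON` expectation in this test.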
expect(JSON.parse(call?.[3].env.PAPERCLIP_WORKSPACES_JSON ?? "[]")).toEqual([
{
workspaceId: "workspace-1",
cwd: managedRemoteWorkspace,
repoUrl: "https://github.com/paperclipai/paperclip.git",
repoRef: "main",
},
{
workspaceId: "workspace-2",
repoUrl: "https://github.com/paperclipai/paperclip.git",
repoRef: "feature/other",
},
]);
Migrate SSH environment callback to bridge (#5116) > **Stacked PR (part 3 of 7).** Depends on: - PR #5114 - PR #5115 > Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents executing on a remote SSH-backed environment need a way to call back into > the Paperclip control plane (run events, log streaming, signals) > - When the SSH host can't reach the Paperclip host (NAT, firewalls, or simply not > on the same network), the run silently fails or hangs — a recurring class of > failure during SSH testing > - In sandboxed environments we already solved this with a callback bridge that > tunnels back through the existing connection; SSH was the odd one out > - This PR migrates SSH execution to use the same callback bridge, so every > adapter's remote run uses one consistent reverse-channel. Per-adapter SSH glue > is deleted in favour of a shared `CommandManagedRuntimeRunner` built from the > SSH spec > - The benefit is fewer SSH-specific failure modes, a smaller code surface, and > one place to evolve the callback contract going forward ## What Changed - Added `createSshCommandManagedRuntimeRunner` in `packages/adapter-utils/src/ssh.ts` that adapts an SSH spec into a generic command-managed-runtime runner (with cwd, env, and timeout handling) - Removed `paperclipApiUrl` from `SshRemoteExecutionSpec`; the bridge URL now flows through the shared runner - Reworked `execution-target.ts` to use the SSH runner alongside sandbox runners via a unified `CommandManagedRuntimeRunner` interface - Simplified `remote-managed-runtime.ts` and `sandbox-managed-runtime.ts` to consume the shared runner abstraction - Deleted per-adapter SSH callback wiring from claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local execute.ts files - Removed `environment-runtime-driver-contract.test.ts` (the contract is now enforced 
by `environment-execution-target.test.ts`) - Added/updated `execute.remote.test.ts` cases for each adapter to cover the SSH runner path ## Verification - `pnpm --filter @paperclipai/adapter-utils test` - `pnpm test -- execute.remote` (covers all six local adapters' SSH paths) - Manual QA: ran a claude-local agent against an SSH-backed environment, confirmed the agent successfully called back to `/api/agent-callback/*` endpoints during the run ## Risks - Refactor touches all six local adapters. If any adapter had subtle SSH-specific behaviour that wasn't captured in tests, it could regress. Mitigation: each adapter's `execute.remote.test.ts` was extended. - `paperclipApiUrl` removal from `SshRemoteExecutionSpec` is a breaking type change for any internal consumer. Verified no external plugins consume this type. - The new `CommandManagedRuntimeRunner` shape is a public surface in `@paperclipai/adapter-utils`; downstream plugins implementing custom runners may need updates, but no such plugins exist in this repo. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 12:43:52 -07:00
expect(call?.[3].env.PAPERCLIP_API_URL).toBe("http://127.0.0.1:4310");
expect(call?.[3].env.PAPERCLIP_API_BRIDGE_MODE).toBe("queue_v1");
expect(call?.[3].remoteExecution?.remoteCwd).toBe(managedRemoteWorkspace);
expect(startAdapterExecutionTargetPaperclipBridge).toHaveBeenCalledTimes(1);
    expect(restoreWorkspaceFromSshExecution).toHaveBeenCalledTimes(1);
  });

  it("resumes saved Cursor sessions for remote SSH execution only when the identity matches", async () => {
    const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-cursor-remote-resume-"));
    cleanupDirs.push(rootDir);
    const workspaceDir = path.join(rootDir, "workspace");
    await mkdir(workspaceDir, { recursive: true });
Stabilize Cursor sandbox runtime resolution (#5446) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The Cursor adapter spawns the Cursor CLI against local, SSH, and sandbox execution targets; on a fresh sandbox lease, it has to resolve where Cursor was installed > - The previous resolver only looked for `~/.local/bin/cursor-agent` even though the official installer (and the adapter's own `SANDBOX_INSTALL_COMMAND`) sometimes lays the binary down as `~/.local/bin/agent`, so a sandbox where the install ran successfully would still fail to find the CLI > - This pull request lets the resolver accept either basename and lets the caller pass an optional `remoteSystemHomeDirHint` so a probe doesn't pay the cost of a remote `printf $HOME` round-trip when the home directory is already known > - The benefit is sandboxed Cursor runs find the binary that the install actually produced, and runtime probes are cheaper when the home dir is already resolved ## What Changed - `packages/adapters/cursor-local/src/server/remote-command.ts`: accept either `agent` or `cursor-agent` as the preferred basename; new optional `remoteSystemHomeDirHint` short-circuits the home-dir probe - `packages/adapters/cursor-local/src/server/execute.ts`: thread the home-dir hint through, prefer the resolved binary path, and shift the effective execution cwd to the per-run managed subdirectory once the runtime is prepared - New `remote-command.test.ts` and `execute.test.ts` cover both basenames, the hint short-circuit, and the cwd shift - `packages/adapters/cursor-local/src/index.ts`: update doc string to reflect the broader resolution - `execute.remote.test.ts` updated to expect the managed-subdirectory cwd shape introduced by the cwd shift ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-cursor-local` — 6/6 passing - `pnpm typecheck` clean - Manual: a fresh sandbox lease with `npm install -g …`-installed Cursor (binary lands as 
`~/.local/bin/agent`) now runs cleanly through the adapter ## Risks Low. Resolver is strictly broader (matches a superset of paths); existing setups with `~/.local/bin/cursor-agent` continue to work. The home-dir hint is opt-in; callers that don't pass it get the existing probe behavior. Cursor's effective execution cwd now matches the rest of the adapters (per-run managed subdirectory) — sessions previously rooted at the workspace root will land in the new subdirectory. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — new tests cover both basenames + hint short-circuit + cwd shift - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --- > **Stacked PR.** Sits on top of #5445 (which sits on #5444). Cumulative diff against `master` includes both of those PRs' content; the files touched by *this* PR's commit are listed under "What Changed" above. Will rebase onto `master` and force-push once the prerequisite PRs merge.
2026-05-07 15:00:28 -07:00
    const managedRemoteWorkspace = "/remote/workspace/.paperclip-runtime/runs/run-ssh-resume/workspace";
    await execute({
      runId: "run-ssh-resume",
      agent: {
        id: "agent-1",
        companyId: "company-1",
        name: "Cursor Builder",
        adapterType: "cursor",
        adapterConfig: {},
      },
      runtime: {
        sessionId: "session-123",
        sessionParams: {
          sessionId: "session-123",
          cwd: managedRemoteWorkspace,
          remoteExecution: {
            transport: "ssh",
            host: "127.0.0.1",
            port: 2222,
            username: "fixture",
            remoteCwd: managedRemoteWorkspace,
          },
        },
        sessionDisplayId: "session-123",
        taskKey: null,
      },
      config: {
        command: "agent",
      },
      context: {
        paperclipWorkspace: {
          cwd: workspaceDir,
          source: "project_primary",
        },
      },
      executionTransport: {
        remoteExecution: {
          host: "127.0.0.1",
          port: 2222,
          username: "fixture",
          remoteWorkspacePath: "/remote/workspace",
          remoteCwd: "/remote/workspace",
          privateKey: "PRIVATE KEY",
          knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
          strictHostKeyChecking: true,
        },
      },
      onLog: async () => {},
    });

    const call = runChildProcess.mock.calls[0] as unknown as [string, string, string[]] | undefined;
    expect(call?.[2]).toContain("--resume");
    expect(call?.[2]).toContain("session-123");
  });
  it("restores the remote workspace if skills sync fails after workspace prep", async () => {
    const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-cursor-remote-sync-fail-"));
    cleanupDirs.push(rootDir);
    const workspaceDir = path.join(rootDir, "workspace");
    await mkdir(workspaceDir, { recursive: true });
    syncDirectoryToSsh.mockRejectedValueOnce(new Error("sync failed"));

    await expect(execute({
      runId: "run-sync-fail",
      agent: {
        id: "agent-1",
        companyId: "company-1",
        name: "Cursor Builder",
        adapterType: "cursor",
        adapterConfig: {},
      },
      runtime: {
        sessionId: null,
        sessionParams: null,
        sessionDisplayId: null,
        taskKey: null,
      },
      config: {
        command: "agent",
      },
      context: {
        paperclipWorkspace: {
          cwd: workspaceDir,
          source: "project_primary",
        },
      },
      executionTransport: {
        remoteExecution: {
          host: "127.0.0.1",
          port: 2222,
          username: "fixture",
          remoteWorkspacePath: "/remote/workspace",
          remoteCwd: "/remote/workspace",
          privateKey: "PRIVATE KEY",
          knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
          strictHostKeyChecking: true,
        },
      },
      onLog: async () => {},
    })).rejects.toThrow("sync failed");

    expect(prepareWorkspaceForSshExecution).toHaveBeenCalledTimes(1);
    expect(restoreWorkspaceFromSshExecution).toHaveBeenCalledTimes(1);
    expect(runChildProcess).not.toHaveBeenCalled();
  });
});