paperclip/packages/adapters/cursor-local/src/server/execute.ts

720 lines
27 KiB
TypeScript
Raw Normal View History

import fs from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { fileURLToPath } from "node:url";
import { inferOpenAiCompatibleBiller, type AdapterExecutionContext, type AdapterExecutionResult } from "@paperclipai/adapter-utils";
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
import {
adapterExecutionTargetIsRemote,
adapterExecutionTargetRemoteCwd,
adapterExecutionTargetSessionIdentity,
adapterExecutionTargetSessionMatches,
adapterExecutionTargetUsesManagedHome,
Add sandbox callback bridge for remote environment API access (#4801) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, which are isolated from the host network > - Sandboxed agents need to call back to the Paperclip API to report progress, post comments, and update issue status > - But sandbox environments cannot reach the Paperclip server directly because they run in isolated network namespaces > - This PR adds a callback bridge that proxies API requests from the sandbox to the Paperclip server, running as a local HTTP server on the host that forwards authenticated requests > - The bridge is started automatically when an adapter launches a sandbox execution, and torn down when the run completes > - The benefit is sandboxed agents can interact with the Paperclip API without requiring network-level access to the host, enabling E2B and similar providers to work end-to-end ## What Changed - Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a lightweight HTTP bridge server that accepts requests from sandbox environments and proxies them to the Paperclip API with authentication - Added request validation and security policy: the bridge only forwards requests to the configured API URL, validates content types, enforces size limits, and rejects non-API paths - Wired the bridge into all remote adapter execute paths (claude, codex, cursor, gemini, pi) — the bridge starts before the agent process and the bridge URL is passed via environment variables - Updated `environment-execution-target.ts` to prefer the explicit API URL from environment lease metadata for sandbox callback routing - Fixed Claude sandbox runtime setup to work with the bridge configuration - Added comprehensive test coverage for bridge request handling, policy enforcement, and sandbox execution integration - Fixed browser bundling — the bridge module is excluded from the frontend bundle via the adapter-utils index export ## Verification - `pnpm test` — all existing and new tests pass, including bridge unit tests and sandbox execution integration tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run an agent task, verify the agent can post comments and update issue status through the bridge ## Risks - Medium. This is a new network-facing component (HTTP server on localhost). The security policy restricts forwarding to the configured API URL only and validates all requests, but any proxy introduces attack surface. The bridge binds to localhost only and is scoped to the lifetime of a single agent run. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:37:34 -07:00
adapterExecutionTargetUsesPaperclipBridge,
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
describeAdapterExecutionTarget,
ensureAdapterExecutionTargetCommandResolvable,
Let adapters declare runtime command spec for remote provisioning (#5141) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies, running adapter > commands like `claude`, `codex`, `pi` either locally or on remote runtimes > (SSH hosts, sandboxes, etc.) > - On a fresh remote runtime — particularly an ephemeral sandbox — the > adapter's CLI may not be installed yet. Today operators handle this via > external configuration (e.g. a project-level `provisionCommand` shell > script) that has to know about every adapter the operator might want to use > - This means every adapter has its own well-known npm package, but operators > end up writing duplicate provision shell scripts that paste together > `npm install -g @anthropic-ai/claude-code`, `npm install -g @openai/codex`, > etc. — knowledge the adapter itself already has > - This PR moves that knowledge into the adapter modules: each adapter declares > how its runtime command should be detected and (if applicable) installed > via `getRuntimeCommandSpec(config)`. The execution path runs the adapter's > own install command on remote sandbox targets before launching, so a fresh > sandbox bootstraps itself instead of requiring a hand-written provision script > - The benefit is fewer footguns for operators provisioning remote runtimes, > and a clean place for new adapters to plug in their install recipe ## What Changed - New types in `packages/adapter-utils/src/types.ts`: - `AdapterRuntimeCommandSpec` describing `command`, optional `detectCommand`, and optional `installCommand` - Optional `getRuntimeCommandSpec(config)` on `ServerAdapterModule` - Optional `runtimeCommandSpec` on `AdapterExecutionContext` so adapters receive the resolved spec at execute time - New helper `ensureAdapterExecutionTargetRuntimeCommandInstalled(...)` in `packages/adapter-utils/src/execution-target.ts` that runs the install command on remote targets when `transport === "sandbox"`. SSH and local targets are no-ops. Throws on timeout or non-zero exit so failures surface early. - Each of `claude-local`, `codex-local`, `cursor-local`, `gemini-local`, `opencode-local`, `pi-local`'s `execute.ts` now reads `ctx.runtimeCommandSpec?.installCommand` and calls the helper before launching the adapter command. - `server/src/adapters/registry.ts` declares `getRuntimeCommandSpec` for each adapter: - claude/codex/gemini/opencode/pi-local: `npm install -g <package>` recipe via a shared `buildNpmRuntimeCommandSpec` helper, with a defensive guard that only auto-installs when the configured `command` matches the well-known fallback (custom binaries are left alone). - cursor-local: declares `command` only; no auto-install (no public npm package), preserving the existing manual setup. - `server/src/services/heartbeat.ts` resolves the spec via `adapter.getRuntimeCommandSpec?.(runtimeConfig)` and passes it through to `AdapterExecutionContext`. - Tests added in `execution-target.test.ts` (~75 lines), e2b `plugin.test.ts` (~32 lines), and `environment-run-orchestrator.test.ts` (~76 lines). ## Verification - `pnpm --filter @paperclipai/adapter-utils test` - `pnpm --filter @paperclipai/server test -- environment-run-orchestrator` - `pnpm --filter @paperclipai/sandbox-providers-e2b test` - Manual QA: run an adapter (claude/codex/etc.) against a fresh sandbox-backed environment that does NOT have the adapter CLI pre-installed. Confirm the install runs once at the start of the agent run and the adapter then launches successfully. Re-run on the same sandbox; confirm the install command is idempotent and the second run starts faster. - Confirm SSH and local execution paths are unaffected (gated by `transport === "sandbox"`). ## Risks - Behavioural shift on sandbox runs: a new install step now runs at the start of every sandbox agent run for adapters with `installCommand` set. The install commands are idempotent (`if ! command -v X >/dev/null 2>&1; then npm install -g <pkg>; fi`), so this is fast on warm sandboxes. On a cold sandbox, the first run takes longer. - Operators who used the legacy project-level `provisionCommand` to install adapter CLIs can drop that part of their script; the adapter handles it now. Existing scripts continue to work — installs are idempotent. - The cursor-local adapter has no auto-install (no public npm package). Behaviour for cursor-local on sandboxes is unchanged. - New optional surface on `ServerAdapterModule`. Plugins that don't implement `getRuntimeCommandSpec` retain previous behaviour (no auto-install). ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 18:35:36 -07:00
ensureAdapterExecutionTargetRuntimeCommandInstalled,
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
prepareAdapterExecutionTargetRuntime,
readAdapterExecutionTarget,
readAdapterExecutionTargetHomeDir,
resolveAdapterExecutionTargetCommandForLogs,
runAdapterExecutionTargetProcess,
runAdapterExecutionTargetShellCommand,
Add sandbox callback bridge for remote environment API access (#4801) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, which are isolated from the host network > - Sandboxed agents need to call back to the Paperclip API to report progress, post comments, and update issue status > - But sandbox environments cannot reach the Paperclip server directly because they run in isolated network namespaces > - This PR adds a callback bridge that proxies API requests from the sandbox to the Paperclip server, running as a local HTTP server on the host that forwards authenticated requests > - The bridge is started automatically when an adapter launches a sandbox execution, and torn down when the run completes > - The benefit is sandboxed agents can interact with the Paperclip API without requiring network-level access to the host, enabling E2B and similar providers to work end-to-end ## What Changed - Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a lightweight HTTP bridge server that accepts requests from sandbox environments and proxies them to the Paperclip API with authentication - Added request validation and security policy: the bridge only forwards requests to the configured API URL, validates content types, enforces size limits, and rejects non-API paths - Wired the bridge into all remote adapter execute paths (claude, codex, cursor, gemini, pi) — the bridge starts before the agent process and the bridge URL is passed via environment variables - Updated `environment-execution-target.ts` to prefer the explicit API URL from environment lease metadata for sandbox callback routing - Fixed Claude sandbox runtime setup to work with the bridge configuration - Added comprehensive test coverage for bridge request handling, policy enforcement, and sandbox execution integration - Fixed browser bundling — the bridge module is excluded from the frontend bundle via the adapter-utils index export ## Verification - `pnpm test` — all existing and new tests pass, including bridge unit tests and sandbox execution integration tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run an agent task, verify the agent can post comments and update issue status through the bridge ## Risks - Medium. This is a new network-facing component (HTTP server on localhost). The security policy restricts forwarding to the configured API URL only and validates all requests, but any proxy introduces attack surface. The bridge binds to localhost only and is scoped to the lifetime of a single agent run. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:37:34 -07:00
startAdapterExecutionTargetPaperclipBridge,
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
} from "@paperclipai/adapter-utils/execution-target";
import {
asString,
asNumber,
asStringArray,
parseObject,
Fix CEO AGENT_HOME paths and centralize workspace env propagation (#4551) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The local adapter layer is responsible for turning Paperclip runtime context into the environment seen by the child agent process. > - The CEO onboarding bundle tells the agent where to read and write its persistent memory and fact files. > - That bundle was using `./memory/...` and `./life/...`, which only works when the process cwd happens to equal the agent home directory. > - At the same time, six local adapters each duplicated the same workspace-env propagation logic, including `AGENT_HOME`, which makes this contract easy to drift. > - This pull request fixes the CEO instructions to use `$AGENT_HOME/...` and centralizes workspace-env propagation in one shared helper with shared tests. > - The benefit is a real bug fix for agent memory paths plus a single tested contract that makes future built-in adapter work less likely to forget `AGENT_HOME`. ## What Changed - Updated `server/src/onboarding-assets/ceo/HEARTBEAT.md` to use `$AGENT_HOME/memory/...` and `$AGENT_HOME/life/...` instead of cwd-relative `./memory/...` and `./life/...`. - Added `applyPaperclipWorkspaceEnv(...)` in `packages/adapter-utils/src/server-utils.ts` to centralize `PAPERCLIP_WORKSPACE_*` and `AGENT_HOME` propagation. - Added shared helper coverage in `packages/adapter-utils/src/server-utils.test.ts` for both populated and skip-empty cases. - Switched the built-in local adapters (`claude_local`, `codex_local`, `cursor_local`, `gemini_local`, `opencode_local`, `pi_local`) over to the shared helper instead of inline env assignment blocks. ## Verification - `pnpm install` - `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/codex-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - Result: 7 test files passed, 31 tests passed, 0 failures. ## Risks - Low risk. - The only behavioral surface is the shared env propagation refactor across six adapters; if the helper diverged from prior semantics, an adapter could miss a workspace env var. - The shared helper test plus the affected adapter execute tests reduce that risk, and the helper preserves the prior "set only non-empty strings" behavior. ## Model Used - OpenAI Codex via Paperclip `codex_local` agent runtime; tool-assisted coding workflow with shell execution, file patching, git operations, and API interaction. The exact backend model identifier and context window are not surfaced by this local runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-26 13:57:35 -07:00
applyPaperclipWorkspaceEnv,
buildPaperclipEnv,
buildInvocationEnvForLogs,
ensureAbsoluteDirectory,
ensurePaperclipSkillSymlink,
ensurePathInEnv,
readPaperclipRuntimeSkillEntries,
Add planning mode for issue work (#5353) ## Thinking Path > - Paperclip is a control plane for autonomous AI companies. > - Issues are the core unit of work, and issue comments are how board users and agents coordinate execution. > - Some issue conversations need to produce plans and approvals instead of immediate implementation work. > - The existing issue contract did not distinguish standard execution comments from planning-oriented issue work. > - This pull request adds an issue work-mode contract and board UI affordances for standard vs planning mode. > - The benefit is that planning-mode issues can be created, displayed, discussed, and carried through agent heartbeat context without losing the normal issue workflow. ## What Changed - Added `standard` / `planning` issue work-mode contracts across DB, shared validators/types, server issue flows, plugin protocol, and adapter heartbeat payloads. - Added an idempotent `0081_optimal_dormammu` migration for `issues.work_mode`, ordered after current `public-gh/master` migrations. - Updated heartbeat/context summaries and issue-thread interaction behavior so planning work mode is preserved when creating suggested follow-up issues. - Added UI support for planning-mode issue creation, issue rows, detail composer styling, and composer work-mode toggles. - Added focused server/shared/UI tests plus a Playwright visual verification spec for planning-mode surfaces. - Rebased the branch onto current `public-gh/master` and added durable planning-mode screenshots under `doc/assets/pap-3368/`. ## Verification - `pnpm --filter @paperclipai/db run check:migrations` - `pnpm exec vitest run --project @paperclipai/shared packages/shared/src/validators/issue.test.ts` - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/heartbeat-context-summary.test.ts server/src/__tests__/issue-thread-interactions-service.test.ts server/src/__tests__/issues-goal-context-routes.test.ts --pool=forks --poolOptions.forks.isolate=true` - `pnpm exec vitest run --project @paperclipai/ui ui/src/components/IssueChatThread.test.tsx ui/src/components/NewIssueDialog.test.tsx ui/src/components/IssueRow.test.tsx ui/src/pages/IssueDetail.test.tsx` - `pnpm exec vitest run --project @paperclipai/adapter-utils packages/adapter-utils/src/server-utils.test.ts` - `PAPERCLIP_E2E_SKIP_LLM=true npx playwright test --config tests/e2e/playwright.config.ts tests/e2e/planning-mode-visual-verification.spec.ts` ## Screenshots Desktop planning detail: ![Desktop planning detail](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/desktop-planning-detail.png) Desktop planning row: ![Desktop planning row](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/desktop-planning-row.png) Desktop staged standard toggle: ![Desktop staged standard toggle](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/desktop-standard-toggle.png) Mobile planning detail: ![Mobile planning detail](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/mobile-planning-detail.png) Mobile planning row: ![Mobile planning row](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/mobile-planning-row.png) ## Risks - Medium migration risk: this adds a non-null issue column. The migration uses `ADD COLUMN IF NOT EXISTS` so installations that applied an older branch-local migration number can still apply the final numbered migration safely. - Medium contract risk: issue payloads, plugin payloads, and adapter heartbeat payloads now include work mode; compatibility is handled by defaulting missing values to `standard`. - UI risk is moderate because composer controls changed; focused component tests and visual e2e coverage exercise standard vs planning display and toggle behavior. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent in a local Paperclip worktree, with shell/tool use. Exact context-window size is not exposed in this runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-06 07:01:28 -05:00
readPaperclipIssueWorkModeFromContext,
resolvePaperclipDesiredSkillNames,
removeMaintainerOnlySkillSymlinks,
renderTemplate,
renderPaperclipWakePrompt,
Fix remote workspace environment shaping (#5118) > **Stacked PR (part 5 of 7).** Depends on: - PR #5114 - PR #5115 - PR #5116 - PR #5117 > Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents run with a Paperclip-shaped environment (`PAPERCLIP_WORKSPACE_CWD`, > worktree path, `PAPERCLIP_WORKSPACES_JSON` hints) so the CLI can locate the > correct project tree > - SSH testing reproduced a real failure: a Codex SSH run wrote to > `/tmp/paperclip-env-matrix-...` (the *host* path) instead of the realized > remote workspace at `/home/<user>/paperclip-env-matrix-ssh-claude/...` > because the adapter injected `PAPERCLIP_WORKSPACE_CWD=/tmp/...` into the > remote env > - Code review on the initial codex-only fix asked to roll the same approach > into every other SSH-capable adapter (claude, acpx, cursor, opencode, gemini, > pi) via a shared helper rather than duplicating per-adapter > - This PR adds `shapePaperclipWorkspaceEnvForExecution` in adapter-utils that, > when the execution target is remote: replaces local cwd with the realized > execution cwd, nulls out worktree path (which has no remote meaning), and > rewrites/strips `cwd` entries in workspace hints based on what was actually > synced. Every adapter calls it before invoking the remote runner > - The benefit is that remote runs see the realized remote workspace, host-local > paths stop leaking into remote env, and the rule is unit-tested in one place ## What Changed - Added `shapePaperclipWorkspaceEnvForExecution` to `packages/adapter-utils/src/server-utils.ts` with full unit coverage (`server-utils.test.ts`) - Each of acpx-local, claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local now calls the new shaper before issuing the remote command and feeds the shaped values into `applyPaperclipWorkspaceEnv` - Per-adapter `execute.remote.test.ts` files extended to cover the new shaping behaviour: localhost paths replaced with remote cwd, foreign-cwd hints stripped, worktree path nulled out for remote targets - `acpx-local/src/server/execute.test.ts` extended with shaping coverage ## Verification - `pnpm test -- server-utils execute.remote` - `pnpm --filter @paperclipai/adapter-acpx-local test` - Manual QA reproducing the original failure: 1. Provision an E2B sandbox environment for the Paperclip QA company 2. Assign an issue to a remote-targeted claude-local agent and confirm the run starts in the correct remote cwd (no `/Users/...` path leakage in the run logs) 3. Repeat for opencode-local and pi-local ## Risks - Behavioural shift: hints whose `cwd` doesn't match the workspace cwd are now stripped on remote targets. If any adapter relied on a leaked local hint cwd, it will see a missing `cwd` instead. Reviewed all current callers — none do. - Adds a small per-run cost (path resolve + string normalisation) on every remote execution. Negligible. - Worktree path is now nulled out on remote (it has no meaning there). Adapters that previously read the value defensively will continue to work. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 13:17:52 -07:00
shapePaperclipWorkspaceEnvForExecution,
stringifyPaperclipWakePayload,
[codex] Add run liveness continuations (#4083) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Heartbeat runs are the control-plane record of each agent execution window. > - Long-running local agents can exhaust context or stop while still holding useful next-step state. > - Operators need that stop reason, next action, and continuation path to be durable and visible. > - This pull request adds run liveness metadata, continuation summaries, and UI surfaces for issue run ledgers. > - The benefit is that interrupted or long-running work can resume with clearer context instead of losing the agent's last useful handoff. ## What Changed - Added heartbeat-run liveness fields, continuation attempt tracking, and an idempotent `0058` migration. - Added server services and tests for run liveness, continuation summaries, stop metadata, and activity backfill. - Wired local and HTTP adapters to surface continuation/liveness context through shared adapter utilities. - Added shared constants, validators, and heartbeat types for liveness continuation state. - Added issue-detail UI surfaces for continuation handoffs and the run ledger, with component tests. - Updated agent runtime docs, heartbeat protocol docs, prompt guidance, onboarding assets, and skills instructions to explain continuation behavior. - Addressed Greptile feedback by scoping document evidence by run, excluding system continuation-summary documents from liveness evidence, importing shared liveness types, surfacing hidden ledger run counts, documenting bounded retry behavior, and moving run-ledger liveness backfill off the request path. ## Verification - `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts server/src/__tests__/run-continuations.test.ts server/src/__tests__/run-liveness.test.ts server/src/__tests__/activity-service.test.ts server/src/__tests__/documents-service.test.ts server/src/__tests__/issue-continuation-summary.test.ts server/src/services/heartbeat-stop-metadata.test.ts ui/src/components/IssueRunLedger.test.tsx ui/src/components/IssueContinuationHandoff.test.tsx ui/src/components/IssueDocumentsSection.test.tsx` - `pnpm --filter @paperclipai/db build` - `pnpm exec vitest run server/src/__tests__/activity-service.test.ts ui/src/components/IssueRunLedger.test.tsx` - `pnpm --filter @paperclipai/ui typecheck` - `pnpm --filter @paperclipai/server typecheck` - `pnpm exec vitest run server/src/__tests__/activity-service.test.ts server/src/__tests__/run-continuations.test.ts ui/src/components/IssueRunLedger.test.tsx` - `pnpm exec vitest run server/src/__tests__/heartbeat-process-recovery.test.ts -t "treats a plan document update"` - `pnpm exec vitest run server/src/__tests__/activity-service.test.ts server/src/__tests__/heartbeat-process-recovery.test.ts -t "activity service|treats a plan document update"` - Remote PR checks on head `e53b1a1d`: `verify`, `e2e`, `policy`, and Snyk all passed. - Confirmed `public-gh/master` is an ancestor of this branch after fetching `public-gh master`. - Confirmed `pnpm-lock.yaml` is not included in the branch diff. - Confirmed migration `0058_wealthy_starbolt.sql` is ordered after `0057` and uses `IF NOT EXISTS` guards for repeat application. - Greptile inline review threads are resolved. ## Risks - Medium risk: this touches heartbeat execution, liveness recovery, activity rendering, issue routes, shared contracts, docs, and UI. - Migration risk is mitigated by additive columns/indexes and idempotent guards. - Run-ledger liveness backfill is now asynchronous, so the first ledger response can briefly show historical missing liveness until the background backfill completes. - UI screenshot coverage is not included in this packaging pass; validation is currently through focused component tests. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5.4, local tool-use coding agent with terminal, git, GitHub connector, GitHub CLI, and Paperclip API access. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Screenshot note: no before/after screenshots were captured in this PR packaging pass; the UI changes are covered by focused component tests listed above. --------- Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 06:01:49 -05:00
DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE,
joinPromptSections,
} from "@paperclipai/adapter-utils/server-utils";
Wire per-adapter sandbox install commands through test and execute paths (#5280) > **Stacked PR.** Sits on top of the e2b sandbox chain — #5278 (stdin staging) and #5279 (honest-resolvability + login-profiles). The cumulative diff against `master` includes both of those PRs' content; the files touched by *this* PR's commit are the new `maybeRunSandboxInstallCommand` helper in `packages/adapter-utils/src/execution-target.ts` and the per-adapter `index.ts`/`server/test.ts`/`server/execute.ts` wiring under `packages/adapters/{claude,codex,cursor,gemini,opencode,pi}-local/`. The honest resolvability check from #5279 is what gives this PR's install command a meaningful "did it actually land on PATH" follow-up. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Sandbox execution targets are ephemeral — each fresh lease starts from a template image that may or may not have the agent CLIs preinstalled > - When a CLI isn't preinstalled, the resolvability probe fails at `command -v` and the hello probe never runs > - There's no shared mechanism for "before you probe or provision, install the CLI on this sandbox" > - This pull request adds a `SANDBOX_INSTALL_COMMAND` constant per adapter and a `maybeRunSandboxInstallCommand` helper that runs it via the existing sandbox login shell, captures structured output, and never throws (so the resolvability + hello probe still run after); each adapter's `test()` and `execute()` share the constant so the two callsites can't drift > - The benefit is a fresh sandbox lease without a preinstalled CLI now installs it once via `sh -lc` before the resolvability probe and before managed-runtime provisioning, with a uniform `<adapter>_install_command_run` check on the test report ## What Changed - `packages/adapter-utils/src/execution-target.ts`: add `AdapterSandboxInstallCommandCheck` and `maybeRunSandboxInstallCommand` (runs the install via existing sandbox shell, captures exit/stdout/stderr, returns a structured info/warn check, never throws) - Add `SANDBOX_INSTALL_COMMAND` to each adapter's `index.ts` so `test()` and `execute()` share a single source of truth - Wire each of the 6 affected adapter `testEnvironment()`s to call `maybeRunSandboxInstallCommand` before `ensureAdapterExecutionTargetCommandResolvable` - Pass `installCommand: SANDBOX_INSTALL_COMMAND` through `prepareAdapterExecutionTargetRuntime` in each adapter's `execute()` - Per-adapter install commands use npm globals where possible so binaries land on a PATH segment the template already exports: - claude → `npm install -g @anthropic-ai/claude-code` - codex → `npm install -g @openai/codex` - cursor → `curl https://cursor.com/install -fsS | bash` - gemini → `npm install -g @google/gemini-cli` - opencode → `npm install -g opencode-ai` - pi → `npm install -g @mariozechner/pi-coding-agent` SSH and local targets ignore `installCommand` (SSH runtime takes no such param; local short-circuits before runtime prep), so this is a no-op for non-sandbox environments. ## Verification - `pnpm typecheck` clean - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils` and per-adapter projects pass - Manual sandbox matrix (claude, codex, cursor, gemini, opencode, pi) — each goes `install_command_run → resolvable → hello_probe_passed` (Codex and Pi land on `hello_probe_auth_required`, which is the configured-credentials problem, not an install issue) - SSH no-regression: SSH Claude still passes; the helper short-circuits on non-sandbox targets ## Risks Medium — adds a network/CPU cost (npm install / curl) on every fresh sandbox lease. Cost is bounded (one-time per lease, typically tens of seconds for npm globals), and the helper never throws so a failing install still lets the report run resolvability and hello probes. If a sandbox image already has the CLI, the install is an idempotent reinstall. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-05 08:29:28 -07:00
import { DEFAULT_CURSOR_LOCAL_MODEL, SANDBOX_INSTALL_COMMAND } from "../index.js";
import { parseCursorJsonl, isCursorUnknownSessionError } from "./parse.js";
Add cursor sandbox support and fix SSH workspace sync (#4803) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, or on remote hosts via SSH > - The cursor adapter needs to resolve `cursor-agent` inside sandbox environments where it's installed in `~/.local/bin` > - But when using the default `agent` command on a sandbox target, the adapter didn't know to look in `~/.local/bin/cursor-agent`, causing "command not found" failures > - Additionally, repeated SSH runs failed because `git checkout` during workspace sync conflicted with leftover `.paperclip-runtime` files from previous runs > - This PR adds sandbox-aware command resolution for cursor and fixes the SSH workspace sync conflict > - The benefit is cursor works in E2B sandboxes out of the box, and repeated SSH runs don't fail on workspace sync ## What Changed - `cursor-local`: Added `prepareCursorSandboxCommand` — on sandbox targets, reads the remote `$HOME`, prepends `~/.local/bin` to PATH, and prefers `~/.local/bin/cursor-agent` when the default command is requested; tightened the sandbox command probe to validate the binary exists before launching; preserves explicit custom command overrides - `adapter-utils/ssh.ts`: Added `--force` to git checkout in SSH workspace sync to handle `.paperclip-runtime` untracked file conflicts from previous runs ## Verification - `pnpm test` — all existing and new tests pass, including cursor sandbox probe, sandbox execution, and custom command override tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run a cursor-local task, verify it resolves cursor-agent from the sandbox install path ## Risks - Low-medium. The `--force` flag on git checkout could discard uncommitted changes in the remote workspace, but the workspace is managed by Paperclip and should not contain user edits. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:12:06 -07:00
import { prepareCursorSandboxCommand } from "./remote-command.js";
import { normalizeCursorStreamLine } from "../shared/stream.js";
import { hasCursorTrustBypassArg } from "../shared/trust.js";
const __moduleDir = path.dirname(fileURLToPath(import.meta.url));
function firstNonEmptyLine(text: string): string {
return (
text
.split(/\r?\n/)
.map((line) => line.trim())
.find(Boolean) ?? ""
);
}
function hasNonEmptyEnvValue(env: Record<string, string>, key: string): boolean {
const raw = env[key];
return typeof raw === "string" && raw.trim().length > 0;
}
function resolveCursorBillingType(env: Record<string, string>): "api" | "subscription" {
return hasNonEmptyEnvValue(env, "CURSOR_API_KEY") || hasNonEmptyEnvValue(env, "OPENAI_API_KEY")
? "api"
: "subscription";
}
function resolveCursorBiller(
env: Record<string, string>,
billingType: "api" | "subscription",
provider: string | null,
): string {
const openAiCompatibleBiller = inferOpenAiCompatibleBiller(env, null);
if (openAiCompatibleBiller === "openrouter") return "openrouter";
if (billingType === "subscription") return "cursor";
return provider ?? "cursor";
}
function resolveProviderFromModel(model: string): string | null {
const trimmed = model.trim().toLowerCase();
if (!trimmed) return null;
const slash = trimmed.indexOf("/");
if (slash > 0) return trimmed.slice(0, slash);
if (trimmed.includes("sonnet") || trimmed.includes("claude")) return "anthropic";
if (trimmed.startsWith("gpt") || trimmed.startsWith("o")) return "openai";
return null;
}
function normalizeMode(rawMode: string): "plan" | "ask" | null {
const mode = rawMode.trim().toLowerCase();
if (mode === "plan" || mode === "ask") return mode;
return null;
}
function renderPaperclipEnvNote(env: Record<string, string>): string {
const paperclipKeys = Object.keys(env)
.filter((key) => key.startsWith("PAPERCLIP_"))
.sort();
if (paperclipKeys.length === 0) return "";
return [
"Paperclip runtime note:",
`The following PAPERCLIP_* environment variables are available in this run: ${paperclipKeys.join(", ")}`,
"Do not assume these variables are missing without checking your shell environment.",
"",
"",
].join("\n");
}
function cursorSkillsHome(): string {
return path.join(os.homedir(), ".cursor", "skills");
}
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
async function buildCursorSkillsDir(config: Record<string, unknown>): Promise<string> {
const tmp = await fs.mkdtemp(path.join(os.tmpdir(), "paperclip-cursor-skills-"));
const target = path.join(tmp, "skills");
await fs.mkdir(target, { recursive: true });
const availableEntries = await readPaperclipRuntimeSkillEntries(config, __moduleDir);
const desiredNames = new Set(resolvePaperclipDesiredSkillNames(config, availableEntries));
for (const entry of availableEntries) {
if (!desiredNames.has(entry.key)) continue;
await fs.symlink(entry.source, path.join(target, entry.runtimeName));
}
return target;
}
type EnsureCursorSkillsInjectedOptions = {
skillsDir?: string | null;
skillsEntries?: Array<{ key: string; runtimeName: string; source: string }>;
skillsHome?: string;
linkSkill?: (source: string, target: string) => Promise<void>;
};
export async function ensureCursorSkillsInjected(
onLog: AdapterExecutionContext["onLog"],
options: EnsureCursorSkillsInjectedOptions = {},
) {
const skillsEntries = options.skillsEntries
?? (options.skillsDir
? (await fs.readdir(options.skillsDir, { withFileTypes: true }))
.filter((entry) => entry.isDirectory())
.map((entry) => ({
key: entry.name,
runtimeName: entry.name,
source: path.join(options.skillsDir!, entry.name),
}))
: await readPaperclipRuntimeSkillEntries({}, __moduleDir));
if (skillsEntries.length === 0) return;
const skillsHome = options.skillsHome ?? cursorSkillsHome();
try {
await fs.mkdir(skillsHome, { recursive: true });
} catch (err) {
await onLog(
"stderr",
`[paperclip] Failed to prepare Cursor skills directory ${skillsHome}: ${err instanceof Error ? err.message : String(err)}\n`,
);
return;
}
const removedSkills = await removeMaintainerOnlySkillSymlinks(
skillsHome,
skillsEntries.map((entry) => entry.runtimeName),
);
for (const skillName of removedSkills) {
await onLog(
"stderr",
`[paperclip] Removed maintainer-only Cursor skill "${skillName}" from ${skillsHome}\n`,
);
}
const linkSkill = options.linkSkill ?? ((source: string, target: string) => fs.symlink(source, target));
for (const entry of skillsEntries) {
const target = path.join(skillsHome, entry.runtimeName);
try {
const result = await ensurePaperclipSkillSymlink(entry.source, target, linkSkill);
if (result === "skipped") continue;
await onLog(
"stderr",
`[paperclip] ${result === "repaired" ? "Repaired" : "Injected"} Cursor skill "${entry.key}" into ${skillsHome}\n`,
);
} catch (err) {
await onLog(
"stderr",
`[paperclip] Failed to inject Cursor skill "${entry.key}" into ${skillsHome}: ${err instanceof Error ? err.message : String(err)}\n`,
);
}
}
}
export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExecutionResult> {
const { runId, agent, runtime, config, context, onLog, onMeta, onSpawn, authToken } = ctx;
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
const executionTarget = readAdapterExecutionTarget({
executionTarget: ctx.executionTarget,
legacyRemoteExecution: ctx.executionTransport?.remoteExecution,
});
const executionTargetIsRemote = adapterExecutionTargetIsRemote(executionTarget);
const promptTemplate = asString(
config.promptTemplate,
[codex] Add run liveness continuations (#4083) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Heartbeat runs are the control-plane record of each agent execution window. > - Long-running local agents can exhaust context or stop while still holding useful next-step state. > - Operators need that stop reason, next action, and continuation path to be durable and visible. > - This pull request adds run liveness metadata, continuation summaries, and UI surfaces for issue run ledgers. > - The benefit is that interrupted or long-running work can resume with clearer context instead of losing the agent's last useful handoff. ## What Changed - Added heartbeat-run liveness fields, continuation attempt tracking, and an idempotent `0058` migration. - Added server services and tests for run liveness, continuation summaries, stop metadata, and activity backfill. - Wired local and HTTP adapters to surface continuation/liveness context through shared adapter utilities. - Added shared constants, validators, and heartbeat types for liveness continuation state. - Added issue-detail UI surfaces for continuation handoffs and the run ledger, with component tests. - Updated agent runtime docs, heartbeat protocol docs, prompt guidance, onboarding assets, and skills instructions to explain continuation behavior. - Addressed Greptile feedback by scoping document evidence by run, excluding system continuation-summary documents from liveness evidence, importing shared liveness types, surfacing hidden ledger run counts, documenting bounded retry behavior, and moving run-ledger liveness backfill off the request path. ## Verification - `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts server/src/__tests__/run-continuations.test.ts server/src/__tests__/run-liveness.test.ts server/src/__tests__/activity-service.test.ts server/src/__tests__/documents-service.test.ts server/src/__tests__/issue-continuation-summary.test.ts server/src/services/heartbeat-stop-metadata.test.ts ui/src/components/IssueRunLedger.test.tsx ui/src/components/IssueContinuationHandoff.test.tsx ui/src/components/IssueDocumentsSection.test.tsx` - `pnpm --filter @paperclipai/db build` - `pnpm exec vitest run server/src/__tests__/activity-service.test.ts ui/src/components/IssueRunLedger.test.tsx` - `pnpm --filter @paperclipai/ui typecheck` - `pnpm --filter @paperclipai/server typecheck` - `pnpm exec vitest run server/src/__tests__/activity-service.test.ts server/src/__tests__/run-continuations.test.ts ui/src/components/IssueRunLedger.test.tsx` - `pnpm exec vitest run server/src/__tests__/heartbeat-process-recovery.test.ts -t "treats a plan document update"` - `pnpm exec vitest run server/src/__tests__/activity-service.test.ts server/src/__tests__/heartbeat-process-recovery.test.ts -t "activity service|treats a plan document update"` - Remote PR checks on head `e53b1a1d`: `verify`, `e2e`, `policy`, and Snyk all passed. - Confirmed `public-gh/master` is an ancestor of this branch after fetching `public-gh master`. - Confirmed `pnpm-lock.yaml` is not included in the branch diff. - Confirmed migration `0058_wealthy_starbolt.sql` is ordered after `0057` and uses `IF NOT EXISTS` guards for repeat application. - Greptile inline review threads are resolved. ## Risks - Medium risk: this touches heartbeat execution, liveness recovery, activity rendering, issue routes, shared contracts, docs, and UI. - Migration risk is mitigated by additive columns/indexes and idempotent guards. - Run-ledger liveness backfill is now asynchronous, so the first ledger response can briefly show historical missing liveness until the background backfill completes. - UI screenshot coverage is not included in this packaging pass; validation is currently through focused component tests. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5.4, local tool-use coding agent with terminal, git, GitHub connector, GitHub CLI, and Paperclip API access. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Screenshot note: no before/after screenshots were captured in this PR packaging pass; the UI changes are covered by focused component tests listed above. --------- Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 06:01:49 -05:00
DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE,
);
Add cursor sandbox support and fix SSH workspace sync (#4803) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, or on remote hosts via SSH > - The cursor adapter needs to resolve `cursor-agent` inside sandbox environments where it's installed in `~/.local/bin` > - But when using the default `agent` command on a sandbox target, the adapter didn't know to look in `~/.local/bin/cursor-agent`, causing "command not found" failures > - Additionally, repeated SSH runs failed because `git checkout` during workspace sync conflicted with leftover `.paperclip-runtime` files from previous runs > - This PR adds sandbox-aware command resolution for cursor and fixes the SSH workspace sync conflict > - The benefit is cursor works in E2B sandboxes out of the box, and repeated SSH runs don't fail on workspace sync ## What Changed - `cursor-local`: Added `prepareCursorSandboxCommand` — on sandbox targets, reads the remote `$HOME`, prepends `~/.local/bin` to PATH, and prefers `~/.local/bin/cursor-agent` when the default command is requested; tightened the sandbox command probe to validate the binary exists before launching; preserves explicit custom command overrides - `adapter-utils/ssh.ts`: Added `--force` to git checkout in SSH workspace sync to handle `.paperclip-runtime` untracked file conflicts from previous runs ## Verification - `pnpm test` — all existing and new tests pass, including cursor sandbox probe, sandbox execution, and custom command override tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run a cursor-local task, verify it resolves cursor-agent from the sandbox install path ## Risks - Low-medium. The `--force` flag on git checkout could discard uncommitted changes in the remote workspace, but the workspace is managed by Paperclip and should not contain user edits. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:12:06 -07:00
let command = asString(config.command, "agent");
const model = asString(config.model, DEFAULT_CURSOR_LOCAL_MODEL).trim();
const mode = normalizeMode(asString(config.mode, ""));
const workspaceContext = parseObject(context.paperclipWorkspace);
const workspaceCwd = asString(workspaceContext.cwd, "");
const workspaceSource = asString(workspaceContext.source, "");
const workspaceId = asString(workspaceContext.workspaceId, "");
const workspaceRepoUrl = asString(workspaceContext.repoUrl, "");
const workspaceRepoRef = asString(workspaceContext.repoRef, "");
const agentHome = asString(workspaceContext.agentHome, "");
const workspaceHints = Array.isArray(context.paperclipWorkspaces)
? context.paperclipWorkspaces.filter(
(value): value is Record<string, unknown> => typeof value === "object" && value !== null,
)
: [];
const configuredCwd = asString(config.cwd, "");
const useConfiguredInsteadOfAgentHome = workspaceSource === "agent_home" && configuredCwd.length > 0;
const effectiveWorkspaceCwd = useConfiguredInsteadOfAgentHome ? "" : workspaceCwd;
const cwd = effectiveWorkspaceCwd || configuredCwd || process.cwd();
Fix remote workspace environment shaping (#5118) > **Stacked PR (part 5 of 7).** Depends on: - PR #5114 - PR #5115 - PR #5116 - PR #5117 > Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents run with a Paperclip-shaped environment (`PAPERCLIP_WORKSPACE_CWD`, > worktree path, `PAPERCLIP_WORKSPACES_JSON` hints) so the CLI can locate the > correct project tree > - SSH testing reproduced a real failure: a Codex SSH run wrote to > `/tmp/paperclip-env-matrix-...` (the *host* path) instead of the realized > remote workspace at `/home/<user>/paperclip-env-matrix-ssh-claude/...` > because the adapter injected `PAPERCLIP_WORKSPACE_CWD=/tmp/...` into the > remote env > - Code review on the initial codex-only fix asked to roll the same approach > into every other SSH-capable adapter (claude, acpx, cursor, opencode, gemini, > pi) via a shared helper rather than duplicating per-adapter > - This PR adds `shapePaperclipWorkspaceEnvForExecution` in adapter-utils that, > when the execution target is remote: replaces local cwd with the realized > execution cwd, nulls out worktree path (which has no remote meaning), and > rewrites/strips `cwd` entries in workspace hints based on what was actually > synced. Every adapter calls it before invoking the remote runner > - The benefit is that remote runs see the realized remote workspace, host-local > paths stop leaking into remote env, and the rule is unit-tested in one place ## What Changed - Added `shapePaperclipWorkspaceEnvForExecution` to `packages/adapter-utils/src/server-utils.ts` with full unit coverage (`server-utils.test.ts`) - Each of acpx-local, claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local now calls the new shaper before issuing the remote command and feeds the shaped values into `applyPaperclipWorkspaceEnv` - Per-adapter `execute.remote.test.ts` files extended to cover the new shaping behaviour: localhost paths replaced with remote cwd, foreign-cwd hints stripped, worktree path nulled out for remote targets - `acpx-local/src/server/execute.test.ts` extended with shaping coverage ## Verification - `pnpm test -- server-utils execute.remote` - `pnpm --filter @paperclipai/adapter-acpx-local test` - Manual QA reproducing the original failure: 1. Provision an E2B sandbox environment for the Paperclip QA company 2. Assign an issue to a remote-targeted claude-local agent and confirm the run starts in the correct remote cwd (no `/Users/...` path leakage in the run logs) 3. Repeat for opencode-local and pi-local ## Risks - Behavioural shift: hints whose `cwd` doesn't match the workspace cwd are now stripped on remote targets. If any adapter relied on a leaked local hint cwd, it will see a missing `cwd` instead. Reviewed all current callers — none do. - Adds a small per-run cost (path resolve + string normalisation) on every remote execution. Negligible. - Worktree path is now nulled out on remote (it has no meaning there). Adapters that previously read the value defensively will continue to work. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 13:17:52 -07:00
const effectiveExecutionCwd = adapterExecutionTargetRemoteCwd(executionTarget, cwd);
const shapedWorkspaceEnv = shapePaperclipWorkspaceEnvForExecution({
workspaceCwd: effectiveWorkspaceCwd,
workspaceHints,
executionTargetIsRemote,
executionCwd: effectiveExecutionCwd,
});
await ensureAbsoluteDirectory(cwd, { createIfMissing: true });
const cursorSkillEntries = await readPaperclipRuntimeSkillEntries(config, __moduleDir);
const desiredCursorSkillNames = resolvePaperclipDesiredSkillNames(config, cursorSkillEntries);
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
if (!executionTargetIsRemote) {
await ensureCursorSkillsInjected(onLog, {
skillsEntries: cursorSkillEntries.filter((entry) => desiredCursorSkillNames.includes(entry.key)),
});
}
const envConfig = parseObject(config.env);
const hasExplicitApiKey =
typeof envConfig.PAPERCLIP_API_KEY === "string" && envConfig.PAPERCLIP_API_KEY.trim().length > 0;
Add cursor sandbox support and fix SSH workspace sync (#4803) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, or on remote hosts via SSH > - The cursor adapter needs to resolve `cursor-agent` inside sandbox environments where it's installed in `~/.local/bin` > - But when using the default `agent` command on a sandbox target, the adapter didn't know to look in `~/.local/bin/cursor-agent`, causing "command not found" failures > - Additionally, repeated SSH runs failed because `git checkout` during workspace sync conflicted with leftover `.paperclip-runtime` files from previous runs > - This PR adds sandbox-aware command resolution for cursor and fixes the SSH workspace sync conflict > - The benefit is cursor works in E2B sandboxes out of the box, and repeated SSH runs don't fail on workspace sync ## What Changed - `cursor-local`: Added `prepareCursorSandboxCommand` — on sandbox targets, reads the remote `$HOME`, prepends `~/.local/bin` to PATH, and prefers `~/.local/bin/cursor-agent` when the default command is requested; tightened the sandbox command probe to validate the binary exists before launching; preserves explicit custom command overrides - `adapter-utils/ssh.ts`: Added `--force` to git checkout in SSH workspace sync to handle `.paperclip-runtime` untracked file conflicts from previous runs ## Verification - `pnpm test` — all existing and new tests pass, including cursor sandbox probe, sandbox execution, and custom command override tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run a cursor-local task, verify it resolves cursor-agent from the sandbox install path ## Risks - Low-medium. The `--force` flag on git checkout could discard uncommitted changes in the remote workspace, but the workspace is managed by Paperclip and should not contain user edits. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:12:06 -07:00
let env: Record<string, string> = { ...buildPaperclipEnv(agent) };
env.PAPERCLIP_RUN_ID = runId;
const wakeTaskId =
(typeof context.taskId === "string" && context.taskId.trim().length > 0 && context.taskId.trim()) ||
(typeof context.issueId === "string" && context.issueId.trim().length > 0 && context.issueId.trim()) ||
null;
const wakeReason =
typeof context.wakeReason === "string" && context.wakeReason.trim().length > 0
? context.wakeReason.trim()
: null;
const wakeCommentId =
(typeof context.wakeCommentId === "string" && context.wakeCommentId.trim().length > 0 && context.wakeCommentId.trim()) ||
(typeof context.commentId === "string" && context.commentId.trim().length > 0 && context.commentId.trim()) ||
null;
const approvalId =
typeof context.approvalId === "string" && context.approvalId.trim().length > 0
? context.approvalId.trim()
: null;
const approvalStatus =
typeof context.approvalStatus === "string" && context.approvalStatus.trim().length > 0
? context.approvalStatus.trim()
: null;
const linkedIssueIds = Array.isArray(context.issueIds)
? context.issueIds.filter((value): value is string => typeof value === "string" && value.trim().length > 0)
: [];
const wakePayloadJson = stringifyPaperclipWakePayload(context.paperclipWake);
Add planning mode for issue work (#5353) ## Thinking Path > - Paperclip is a control plane for autonomous AI companies. > - Issues are the core unit of work, and issue comments are how board users and agents coordinate execution. > - Some issue conversations need to produce plans and approvals instead of immediate implementation work. > - The existing issue contract did not distinguish standard execution comments from planning-oriented issue work. > - This pull request adds an issue work-mode contract and board UI affordances for standard vs planning mode. > - The benefit is that planning-mode issues can be created, displayed, discussed, and carried through agent heartbeat context without losing the normal issue workflow. ## What Changed - Added `standard` / `planning` issue work-mode contracts across DB, shared validators/types, server issue flows, plugin protocol, and adapter heartbeat payloads. - Added an idempotent `0081_optimal_dormammu` migration for `issues.work_mode`, ordered after current `public-gh/master` migrations. - Updated heartbeat/context summaries and issue-thread interaction behavior so planning work mode is preserved when creating suggested follow-up issues. - Added UI support for planning-mode issue creation, issue rows, detail composer styling, and composer work-mode toggles. - Added focused server/shared/UI tests plus a Playwright visual verification spec for planning-mode surfaces. - Rebased the branch onto current `public-gh/master` and added durable planning-mode screenshots under `doc/assets/pap-3368/`. ## Verification - `pnpm --filter @paperclipai/db run check:migrations` - `pnpm exec vitest run --project @paperclipai/shared packages/shared/src/validators/issue.test.ts` - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/heartbeat-context-summary.test.ts server/src/__tests__/issue-thread-interactions-service.test.ts server/src/__tests__/issues-goal-context-routes.test.ts --pool=forks --poolOptions.forks.isolate=true` - `pnpm exec vitest run --project @paperclipai/ui ui/src/components/IssueChatThread.test.tsx ui/src/components/NewIssueDialog.test.tsx ui/src/components/IssueRow.test.tsx ui/src/pages/IssueDetail.test.tsx` - `pnpm exec vitest run --project @paperclipai/adapter-utils packages/adapter-utils/src/server-utils.test.ts` - `PAPERCLIP_E2E_SKIP_LLM=true npx playwright test --config tests/e2e/playwright.config.ts tests/e2e/planning-mode-visual-verification.spec.ts` ## Screenshots Desktop planning detail: ![Desktop planning detail](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/desktop-planning-detail.png) Desktop planning row: ![Desktop planning row](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/desktop-planning-row.png) Desktop staged standard toggle: ![Desktop staged standard toggle](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/desktop-standard-toggle.png) Mobile planning detail: ![Mobile planning detail](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/mobile-planning-detail.png) Mobile planning row: ![Mobile planning row](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/mobile-planning-row.png) ## Risks - Medium migration risk: this adds a non-null issue column. The migration uses `ADD COLUMN IF NOT EXISTS` so installations that applied an older branch-local migration number can still apply the final numbered migration safely. - Medium contract risk: issue payloads, plugin payloads, and adapter heartbeat payloads now include work mode; compatibility is handled by defaulting missing values to `standard`. - UI risk is moderate because composer controls changed; focused component tests and visual e2e coverage exercise standard vs planning display and toggle behavior. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent in a local Paperclip worktree, with shell/tool use. Exact context-window size is not exposed in this runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-06 07:01:28 -05:00
const issueWorkMode = readPaperclipIssueWorkModeFromContext(context);
if (wakeTaskId) {
env.PAPERCLIP_TASK_ID = wakeTaskId;
}
Add planning mode for issue work (#5353) ## Thinking Path > - Paperclip is a control plane for autonomous AI companies. > - Issues are the core unit of work, and issue comments are how board users and agents coordinate execution. > - Some issue conversations need to produce plans and approvals instead of immediate implementation work. > - The existing issue contract did not distinguish standard execution comments from planning-oriented issue work. > - This pull request adds an issue work-mode contract and board UI affordances for standard vs planning mode. > - The benefit is that planning-mode issues can be created, displayed, discussed, and carried through agent heartbeat context without losing the normal issue workflow. ## What Changed - Added `standard` / `planning` issue work-mode contracts across DB, shared validators/types, server issue flows, plugin protocol, and adapter heartbeat payloads. - Added an idempotent `0081_optimal_dormammu` migration for `issues.work_mode`, ordered after current `public-gh/master` migrations. - Updated heartbeat/context summaries and issue-thread interaction behavior so planning work mode is preserved when creating suggested follow-up issues. - Added UI support for planning-mode issue creation, issue rows, detail composer styling, and composer work-mode toggles. - Added focused server/shared/UI tests plus a Playwright visual verification spec for planning-mode surfaces. - Rebased the branch onto current `public-gh/master` and added durable planning-mode screenshots under `doc/assets/pap-3368/`. ## Verification - `pnpm --filter @paperclipai/db run check:migrations` - `pnpm exec vitest run --project @paperclipai/shared packages/shared/src/validators/issue.test.ts` - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/heartbeat-context-summary.test.ts server/src/__tests__/issue-thread-interactions-service.test.ts server/src/__tests__/issues-goal-context-routes.test.ts --pool=forks --poolOptions.forks.isolate=true` - `pnpm exec vitest run --project @paperclipai/ui ui/src/components/IssueChatThread.test.tsx ui/src/components/NewIssueDialog.test.tsx ui/src/components/IssueRow.test.tsx ui/src/pages/IssueDetail.test.tsx` - `pnpm exec vitest run --project @paperclipai/adapter-utils packages/adapter-utils/src/server-utils.test.ts` - `PAPERCLIP_E2E_SKIP_LLM=true npx playwright test --config tests/e2e/playwright.config.ts tests/e2e/planning-mode-visual-verification.spec.ts` ## Screenshots Desktop planning detail: ![Desktop planning detail](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/desktop-planning-detail.png) Desktop planning row: ![Desktop planning row](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/desktop-planning-row.png) Desktop staged standard toggle: ![Desktop staged standard toggle](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/desktop-standard-toggle.png) Mobile planning detail: ![Mobile planning detail](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/mobile-planning-detail.png) Mobile planning row: ![Mobile planning row](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/mobile-planning-row.png) ## Risks - Medium migration risk: this adds a non-null issue column. The migration uses `ADD COLUMN IF NOT EXISTS` so installations that applied an older branch-local migration number can still apply the final numbered migration safely. - Medium contract risk: issue payloads, plugin payloads, and adapter heartbeat payloads now include work mode; compatibility is handled by defaulting missing values to `standard`. - UI risk is moderate because composer controls changed; focused component tests and visual e2e coverage exercise standard vs planning display and toggle behavior. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent in a local Paperclip worktree, with shell/tool use. Exact context-window size is not exposed in this runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-06 07:01:28 -05:00
if (issueWorkMode) {
env.PAPERCLIP_ISSUE_WORK_MODE = issueWorkMode;
}
if (wakeReason) {
env.PAPERCLIP_WAKE_REASON = wakeReason;
}
if (wakeCommentId) {
env.PAPERCLIP_WAKE_COMMENT_ID = wakeCommentId;
}
if (approvalId) {
env.PAPERCLIP_APPROVAL_ID = approvalId;
}
if (approvalStatus) {
env.PAPERCLIP_APPROVAL_STATUS = approvalStatus;
}
if (linkedIssueIds.length > 0) {
env.PAPERCLIP_LINKED_ISSUE_IDS = linkedIssueIds.join(",");
}
if (wakePayloadJson) {
env.PAPERCLIP_WAKE_PAYLOAD_JSON = wakePayloadJson;
}
Fix CEO AGENT_HOME paths and centralize workspace env propagation (#4551) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The local adapter layer is responsible for turning Paperclip runtime context into the environment seen by the child agent process. > - The CEO onboarding bundle tells the agent where to read and write its persistent memory and fact files. > - That bundle was using `./memory/...` and `./life/...`, which only works when the process cwd happens to equal the agent home directory. > - At the same time, six local adapters each duplicated the same workspace-env propagation logic, including `AGENT_HOME`, which makes this contract easy to drift. > - This pull request fixes the CEO instructions to use `$AGENT_HOME/...` and centralizes workspace-env propagation in one shared helper with shared tests. > - The benefit is a real bug fix for agent memory paths plus a single tested contract that makes future built-in adapter work less likely to forget `AGENT_HOME`. ## What Changed - Updated `server/src/onboarding-assets/ceo/HEARTBEAT.md` to use `$AGENT_HOME/memory/...` and `$AGENT_HOME/life/...` instead of cwd-relative `./memory/...` and `./life/...`. - Added `applyPaperclipWorkspaceEnv(...)` in `packages/adapter-utils/src/server-utils.ts` to centralize `PAPERCLIP_WORKSPACE_*` and `AGENT_HOME` propagation. - Added shared helper coverage in `packages/adapter-utils/src/server-utils.test.ts` for both populated and skip-empty cases. - Switched the built-in local adapters (`claude_local`, `codex_local`, `cursor_local`, `gemini_local`, `opencode_local`, `pi_local`) over to the shared helper instead of inline env assignment blocks. ## Verification - `pnpm install` - `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/codex-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - Result: 7 test files passed, 31 tests passed, 0 failures. ## Risks - Low risk. - The only behavioral surface is the shared env propagation refactor across six adapters; if the helper diverged from prior semantics, an adapter could miss a workspace env var. - The shared helper test plus the affected adapter execute tests reduce that risk, and the helper preserves the prior "set only non-empty strings" behavior. ## Model Used - OpenAI Codex via Paperclip `codex_local` agent runtime; tool-assisted coding workflow with shell execution, file patching, git operations, and API interaction. The exact backend model identifier and context window are not surfaced by this local runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-26 13:57:35 -07:00
applyPaperclipWorkspaceEnv(env, {
Fix remote workspace environment shaping (#5118) > **Stacked PR (part 5 of 7).** Depends on: - PR #5114 - PR #5115 - PR #5116 - PR #5117 > Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents run with a Paperclip-shaped environment (`PAPERCLIP_WORKSPACE_CWD`, > worktree path, `PAPERCLIP_WORKSPACES_JSON` hints) so the CLI can locate the > correct project tree > - SSH testing reproduced a real failure: a Codex SSH run wrote to > `/tmp/paperclip-env-matrix-...` (the *host* path) instead of the realized > remote workspace at `/home/<user>/paperclip-env-matrix-ssh-claude/...` > because the adapter injected `PAPERCLIP_WORKSPACE_CWD=/tmp/...` into the > remote env > - Code review on the initial codex-only fix asked to roll the same approach > into every other SSH-capable adapter (claude, acpx, cursor, opencode, gemini, > pi) via a shared helper rather than duplicating per-adapter > - This PR adds `shapePaperclipWorkspaceEnvForExecution` in adapter-utils that, > when the execution target is remote: replaces local cwd with the realized > execution cwd, nulls out worktree path (which has no remote meaning), and > rewrites/strips `cwd` entries in workspace hints based on what was actually > synced. Every adapter calls it before invoking the remote runner > - The benefit is that remote runs see the realized remote workspace, host-local > paths stop leaking into remote env, and the rule is unit-tested in one place ## What Changed - Added `shapePaperclipWorkspaceEnvForExecution` to `packages/adapter-utils/src/server-utils.ts` with full unit coverage (`server-utils.test.ts`) - Each of acpx-local, claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local now calls the new shaper before issuing the remote command and feeds the shaped values into `applyPaperclipWorkspaceEnv` - Per-adapter `execute.remote.test.ts` files extended to cover the new shaping behaviour: localhost paths replaced with remote cwd, foreign-cwd hints stripped, worktree path nulled out for remote targets - `acpx-local/src/server/execute.test.ts` extended with shaping coverage ## Verification - `pnpm test -- server-utils execute.remote` - `pnpm --filter @paperclipai/adapter-acpx-local test` - Manual QA reproducing the original failure: 1. Provision an E2B sandbox environment for the Paperclip QA company 2. Assign an issue to a remote-targeted claude-local agent and confirm the run starts in the correct remote cwd (no `/Users/...` path leakage in the run logs) 3. Repeat for opencode-local and pi-local ## Risks - Behavioural shift: hints whose `cwd` doesn't match the workspace cwd are now stripped on remote targets. If any adapter relied on a leaked local hint cwd, it will see a missing `cwd` instead. Reviewed all current callers — none do. - Adds a small per-run cost (path resolve + string normalisation) on every remote execution. Negligible. - Worktree path is now nulled out on remote (it has no meaning there). Adapters that previously read the value defensively will continue to work. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 13:17:52 -07:00
workspaceCwd: shapedWorkspaceEnv.workspaceCwd,
Fix CEO AGENT_HOME paths and centralize workspace env propagation (#4551) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The local adapter layer is responsible for turning Paperclip runtime context into the environment seen by the child agent process. > - The CEO onboarding bundle tells the agent where to read and write its persistent memory and fact files. > - That bundle was using `./memory/...` and `./life/...`, which only works when the process cwd happens to equal the agent home directory. > - At the same time, six local adapters each duplicated the same workspace-env propagation logic, including `AGENT_HOME`, which makes this contract easy to drift. > - This pull request fixes the CEO instructions to use `$AGENT_HOME/...` and centralizes workspace-env propagation in one shared helper with shared tests. > - The benefit is a real bug fix for agent memory paths plus a single tested contract that makes future built-in adapter work less likely to forget `AGENT_HOME`. ## What Changed - Updated `server/src/onboarding-assets/ceo/HEARTBEAT.md` to use `$AGENT_HOME/memory/...` and `$AGENT_HOME/life/...` instead of cwd-relative `./memory/...` and `./life/...`. - Added `applyPaperclipWorkspaceEnv(...)` in `packages/adapter-utils/src/server-utils.ts` to centralize `PAPERCLIP_WORKSPACE_*` and `AGENT_HOME` propagation. - Added shared helper coverage in `packages/adapter-utils/src/server-utils.test.ts` for both populated and skip-empty cases. - Switched the built-in local adapters (`claude_local`, `codex_local`, `cursor_local`, `gemini_local`, `opencode_local`, `pi_local`) over to the shared helper instead of inline env assignment blocks. ## Verification - `pnpm install` - `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/codex-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - Result: 7 test files passed, 31 tests passed, 0 failures. ## Risks - Low risk. - The only behavioral surface is the shared env propagation refactor across six adapters; if the helper diverged from prior semantics, an adapter could miss a workspace env var. - The shared helper test plus the affected adapter execute tests reduce that risk, and the helper preserves the prior "set only non-empty strings" behavior. ## Model Used - OpenAI Codex via Paperclip `codex_local` agent runtime; tool-assisted coding workflow with shell execution, file patching, git operations, and API interaction. The exact backend model identifier and context window are not surfaced by this local runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-26 13:57:35 -07:00
workspaceSource,
workspaceId,
workspaceRepoUrl,
workspaceRepoRef,
agentHome,
});
Fix remote workspace environment shaping (#5118) > **Stacked PR (part 5 of 7).** Depends on: - PR #5114 - PR #5115 - PR #5116 - PR #5117 > Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents run with a Paperclip-shaped environment (`PAPERCLIP_WORKSPACE_CWD`, > worktree path, `PAPERCLIP_WORKSPACES_JSON` hints) so the CLI can locate the > correct project tree > - SSH testing reproduced a real failure: a Codex SSH run wrote to > `/tmp/paperclip-env-matrix-...` (the *host* path) instead of the realized > remote workspace at `/home/<user>/paperclip-env-matrix-ssh-claude/...` > because the adapter injected `PAPERCLIP_WORKSPACE_CWD=/tmp/...` into the > remote env > - Code review on the initial codex-only fix asked to roll the same approach > into every other SSH-capable adapter (claude, acpx, cursor, opencode, gemini, > pi) via a shared helper rather than duplicating per-adapter > - This PR adds `shapePaperclipWorkspaceEnvForExecution` in adapter-utils that, > when the execution target is remote: replaces local cwd with the realized > execution cwd, nulls out worktree path (which has no remote meaning), and > rewrites/strips `cwd` entries in workspace hints based on what was actually > synced. Every adapter calls it before invoking the remote runner > - The benefit is that remote runs see the realized remote workspace, host-local > paths stop leaking into remote env, and the rule is unit-tested in one place ## What Changed - Added `shapePaperclipWorkspaceEnvForExecution` to `packages/adapter-utils/src/server-utils.ts` with full unit coverage (`server-utils.test.ts`) - Each of acpx-local, claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local now calls the new shaper before issuing the remote command and feeds the shaped values into `applyPaperclipWorkspaceEnv` - Per-adapter `execute.remote.test.ts` files extended to cover the new shaping behaviour: localhost paths replaced with remote cwd, foreign-cwd hints stripped, worktree path nulled out for remote targets - `acpx-local/src/server/execute.test.ts` extended with shaping coverage ## Verification - `pnpm test -- server-utils execute.remote` - `pnpm --filter @paperclipai/adapter-acpx-local test` - Manual QA reproducing the original failure: 1. Provision an E2B sandbox environment for the Paperclip QA company 2. Assign an issue to a remote-targeted claude-local agent and confirm the run starts in the correct remote cwd (no `/Users/...` path leakage in the run logs) 3. Repeat for opencode-local and pi-local ## Risks - Behavioural shift: hints whose `cwd` doesn't match the workspace cwd are now stripped on remote targets. If any adapter relied on a leaked local hint cwd, it will see a missing `cwd` instead. Reviewed all current callers — none do. - Adds a small per-run cost (path resolve + string normalisation) on every remote execution. Negligible. - Worktree path is now nulled out on remote (it has no meaning there). Adapters that previously read the value defensively will continue to work. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 13:17:52 -07:00
if (shapedWorkspaceEnv.workspaceHints.length > 0) {
env.PAPERCLIP_WORKSPACES_JSON = JSON.stringify(shapedWorkspaceEnv.workspaceHints);
}
for (const [k, v] of Object.entries(envConfig)) {
if (typeof v === "string") env[k] = v;
}
if (!hasExplicitApiKey && authToken) {
env.PAPERCLIP_API_KEY = authToken;
}
Add cursor sandbox support and fix SSH workspace sync (#4803) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, or on remote hosts via SSH > - The cursor adapter needs to resolve `cursor-agent` inside sandbox environments where it's installed in `~/.local/bin` > - But when using the default `agent` command on a sandbox target, the adapter didn't know to look in `~/.local/bin/cursor-agent`, causing "command not found" failures > - Additionally, repeated SSH runs failed because `git checkout` during workspace sync conflicted with leftover `.paperclip-runtime` files from previous runs > - This PR adds sandbox-aware command resolution for cursor and fixes the SSH workspace sync conflict > - The benefit is cursor works in E2B sandboxes out of the box, and repeated SSH runs don't fail on workspace sync ## What Changed - `cursor-local`: Added `prepareCursorSandboxCommand` — on sandbox targets, reads the remote `$HOME`, prepends `~/.local/bin` to PATH, and prefers `~/.local/bin/cursor-agent` when the default command is requested; tightened the sandbox command probe to validate the binary exists before launching; preserves explicit custom command overrides - `adapter-utils/ssh.ts`: Added `--force` to git checkout in SSH workspace sync to handle `.paperclip-runtime` untracked file conflicts from previous runs ## Verification - `pnpm test` — all existing and new tests pass, including cursor sandbox probe, sandbox execution, and custom command override tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run a cursor-local task, verify it resolves cursor-agent from the sandbox install path ## Risks - Low-medium. The `--force` flag on git checkout could discard uncommitted changes in the remote workspace, but the workspace is managed by Paperclip and should not contain user edits. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:12:06 -07:00
const timeoutSec = asNumber(config.timeoutSec, 0);
const graceSec = asNumber(config.graceSec, 20);
Let adapters declare runtime command spec for remote provisioning (#5141) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies, running adapter > commands like `claude`, `codex`, `pi` either locally or on remote runtimes > (SSH hosts, sandboxes, etc.) > - On a fresh remote runtime — particularly an ephemeral sandbox — the > adapter's CLI may not be installed yet. Today operators handle this via > external configuration (e.g. a project-level `provisionCommand` shell > script) that has to know about every adapter the operator might want to use > - This means every adapter has its own well-known npm package, but operators > end up writing duplicate provision shell scripts that paste together > `npm install -g @anthropic-ai/claude-code`, `npm install -g @openai/codex`, > etc. — knowledge the adapter itself already has > - This PR moves that knowledge into the adapter modules: each adapter declares > how its runtime command should be detected and (if applicable) installed > via `getRuntimeCommandSpec(config)`. The execution path runs the adapter's > own install command on remote sandbox targets before launching, so a fresh > sandbox bootstraps itself instead of requiring a hand-written provision script > - The benefit is fewer footguns for operators provisioning remote runtimes, > and a clean place for new adapters to plug in their install recipe ## What Changed - New types in `packages/adapter-utils/src/types.ts`: - `AdapterRuntimeCommandSpec` describing `command`, optional `detectCommand`, and optional `installCommand` - Optional `getRuntimeCommandSpec(config)` on `ServerAdapterModule` - Optional `runtimeCommandSpec` on `AdapterExecutionContext` so adapters receive the resolved spec at execute time - New helper `ensureAdapterExecutionTargetRuntimeCommandInstalled(...)` in `packages/adapter-utils/src/execution-target.ts` that runs the install command on remote targets when `transport === "sandbox"`. SSH and local targets are no-ops. Throws on timeout or non-zero exit so failures surface early. - Each of `claude-local`, `codex-local`, `cursor-local`, `gemini-local`, `opencode-local`, `pi-local`'s `execute.ts` now reads `ctx.runtimeCommandSpec?.installCommand` and calls the helper before launching the adapter command. - `server/src/adapters/registry.ts` declares `getRuntimeCommandSpec` for each adapter: - claude/codex/gemini/opencode/pi-local: `npm install -g <package>` recipe via a shared `buildNpmRuntimeCommandSpec` helper, with a defensive guard that only auto-installs when the configured `command` matches the well-known fallback (custom binaries are left alone). - cursor-local: declares `command` only; no auto-install (no public npm package), preserving the existing manual setup. - `server/src/services/heartbeat.ts` resolves the spec via `adapter.getRuntimeCommandSpec?.(runtimeConfig)` and passes it through to `AdapterExecutionContext`. - Tests added in `execution-target.test.ts` (~75 lines), e2b `plugin.test.ts` (~32 lines), and `environment-run-orchestrator.test.ts` (~76 lines). ## Verification - `pnpm --filter @paperclipai/adapter-utils test` - `pnpm --filter @paperclipai/server test -- environment-run-orchestrator` - `pnpm --filter @paperclipai/sandbox-providers-e2b test` - Manual QA: run an adapter (claude/codex/etc.) against a fresh sandbox-backed environment that does NOT have the adapter CLI pre-installed. Confirm the install runs once at the start of the agent run and the adapter then launches successfully. Re-run on the same sandbox; confirm the install command is idempotent and the second run starts faster. - Confirm SSH and local execution paths are unaffected (gated by `transport === "sandbox"`). ## Risks - Behavioural shift on sandbox runs: a new install step now runs at the start of every sandbox agent run for adapters with `installCommand` set. The install commands are idempotent (`if ! command -v X >/dev/null 2>&1; then npm install -g <pkg>; fi`), so this is fast on warm sandboxes. On a cold sandbox, the first run takes longer. - Operators who used the legacy project-level `provisionCommand` to install adapter CLIs can drop that part of their script; the adapter handles it now. Existing scripts continue to work — installs are idempotent. - The cursor-local adapter has no auto-install (no public npm package). Behaviour for cursor-local on sandboxes is unchanged. - New optional surface on `ServerAdapterModule`. Plugins that don't implement `getRuntimeCommandSpec` retain previous behaviour (no auto-install). ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 18:35:36 -07:00
await ensureAdapterExecutionTargetRuntimeCommandInstalled({
runId,
target: executionTarget,
installCommand: ctx.runtimeCommandSpec?.installCommand,
detectCommand: ctx.runtimeCommandSpec?.detectCommand,
cwd,
env,
timeoutSec,
graceSec,
onLog,
});
Add cursor sandbox support and fix SSH workspace sync (#4803) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, or on remote hosts via SSH > - The cursor adapter needs to resolve `cursor-agent` inside sandbox environments where it's installed in `~/.local/bin` > - But when using the default `agent` command on a sandbox target, the adapter didn't know to look in `~/.local/bin/cursor-agent`, causing "command not found" failures > - Additionally, repeated SSH runs failed because `git checkout` during workspace sync conflicted with leftover `.paperclip-runtime` files from previous runs > - This PR adds sandbox-aware command resolution for cursor and fixes the SSH workspace sync conflict > - The benefit is cursor works in E2B sandboxes out of the box, and repeated SSH runs don't fail on workspace sync ## What Changed - `cursor-local`: Added `prepareCursorSandboxCommand` — on sandbox targets, reads the remote `$HOME`, prepends `~/.local/bin` to PATH, and prefers `~/.local/bin/cursor-agent` when the default command is requested; tightened the sandbox command probe to validate the binary exists before launching; preserves explicit custom command overrides - `adapter-utils/ssh.ts`: Added `--force` to git checkout in SSH workspace sync to handle `.paperclip-runtime` untracked file conflicts from previous runs ## Verification - `pnpm test` — all existing and new tests pass, including cursor sandbox probe, sandbox execution, and custom command override tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run a cursor-local task, verify it resolves cursor-agent from the sandbox install path ## Risks - Low-medium. The `--force` flag on git checkout could discard uncommitted changes in the remote workspace, but the workspace is managed by Paperclip and should not contain user edits. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:12:06 -07:00
// Probe the sandbox before the managed-home override so we discover
// cursor-agent from the real system HOME (e.g. ~/.local/bin/cursor-agent).
// The managed HOME set later is for runtime isolation, not for finding the CLI.
const sandboxCommand = await prepareCursorSandboxCommand({
runId,
target: executionTarget,
command,
cwd,
env,
timeoutSec,
graceSec,
});
command = sandboxCommand.command;
env = sandboxCommand.env;
const effectiveEnv = Object.fromEntries(
Object.entries({ ...process.env, ...env }).filter(
(entry): entry is [string, string] => typeof entry[1] === "string",
),
);
const billingType = resolveCursorBillingType(effectiveEnv);
const runtimeEnv = ensurePathInEnv(effectiveEnv);
Wire per-adapter sandbox install commands through test and execute paths (#5280) > **Stacked PR.** Sits on top of the e2b sandbox chain — #5278 (stdin staging) and #5279 (honest-resolvability + login-profiles). The cumulative diff against `master` includes both of those PRs' content; the files touched by *this* PR's commit are the new `maybeRunSandboxInstallCommand` helper in `packages/adapter-utils/src/execution-target.ts` and the per-adapter `index.ts`/`server/test.ts`/`server/execute.ts` wiring under `packages/adapters/{claude,codex,cursor,gemini,opencode,pi}-local/`. The honest resolvability check from #5279 is what gives this PR's install command a meaningful "did it actually land on PATH" follow-up. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Sandbox execution targets are ephemeral — each fresh lease starts from a template image that may or may not have the agent CLIs preinstalled > - When a CLI isn't preinstalled, the resolvability probe fails at `command -v` and the hello probe never runs > - There's no shared mechanism for "before you probe or provision, install the CLI on this sandbox" > - This pull request adds a `SANDBOX_INSTALL_COMMAND` constant per adapter and a `maybeRunSandboxInstallCommand` helper that runs it via the existing sandbox login shell, captures structured output, and never throws (so the resolvability + hello probe still run after); each adapter's `test()` and `execute()` share the constant so the two callsites can't drift > - The benefit is a fresh sandbox lease without a preinstalled CLI now installs it once via `sh -lc` before the resolvability probe and before managed-runtime provisioning, with a uniform `<adapter>_install_command_run` check on the test report ## What Changed - `packages/adapter-utils/src/execution-target.ts`: add `AdapterSandboxInstallCommandCheck` and `maybeRunSandboxInstallCommand` (runs the install via existing sandbox shell, captures exit/stdout/stderr, returns a structured info/warn check, never throws) - Add `SANDBOX_INSTALL_COMMAND` to each adapter's `index.ts` so `test()` and `execute()` share a single source of truth - Wire each of the 6 affected adapter `testEnvironment()`s to call `maybeRunSandboxInstallCommand` before `ensureAdapterExecutionTargetCommandResolvable` - Pass `installCommand: SANDBOX_INSTALL_COMMAND` through `prepareAdapterExecutionTargetRuntime` in each adapter's `execute()` - Per-adapter install commands use npm globals where possible so binaries land on a PATH segment the template already exports: - claude → `npm install -g @anthropic-ai/claude-code` - codex → `npm install -g @openai/codex` - cursor → `curl https://cursor.com/install -fsS | bash` - gemini → `npm install -g @google/gemini-cli` - opencode → `npm install -g opencode-ai` - pi → `npm install -g @mariozechner/pi-coding-agent` SSH and local targets ignore `installCommand` (SSH runtime takes no such param; local short-circuits before runtime prep), so this is a no-op for non-sandbox environments. ## Verification - `pnpm typecheck` clean - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils` and per-adapter projects pass - Manual sandbox matrix (claude, codex, cursor, gemini, opencode, pi) — each goes `install_command_run → resolvable → hello_probe_passed` (Codex and Pi land on `hello_probe_auth_required`, which is the configured-credentials problem, not an install issue) - SSH no-regression: SSH Claude still passes; the helper short-circuits on non-sandbox targets ## Risks Medium — adds a network/CPU cost (npm install / curl) on every fresh sandbox lease. Cost is bounded (one-time per lease, typically tens of seconds for npm globals), and the helper never throws so a failing install still lets the report run resolvability and hello probes. If a sandbox image already has the CLI, the install is an idempotent reinstall. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-05 08:29:28 -07:00
await ensureAdapterExecutionTargetCommandResolvable(command, executionTarget, cwd, runtimeEnv, { installCommand: SANDBOX_INSTALL_COMMAND });
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
const resolvedCommand = await resolveAdapterExecutionTargetCommandForLogs(command, executionTarget, cwd, runtimeEnv);
Add sandbox callback bridge for remote environment API access (#4801) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, which are isolated from the host network > - Sandboxed agents need to call back to the Paperclip API to report progress, post comments, and update issue status > - But sandbox environments cannot reach the Paperclip server directly because they run in isolated network namespaces > - This PR adds a callback bridge that proxies API requests from the sandbox to the Paperclip server, running as a local HTTP server on the host that forwards authenticated requests > - The bridge is started automatically when an adapter launches a sandbox execution, and torn down when the run completes > - The benefit is sandboxed agents can interact with the Paperclip API without requiring network-level access to the host, enabling E2B and similar providers to work end-to-end ## What Changed - Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a lightweight HTTP bridge server that accepts requests from sandbox environments and proxies them to the Paperclip API with authentication - Added request validation and security policy: the bridge only forwards requests to the configured API URL, validates content types, enforces size limits, and rejects non-API paths - Wired the bridge into all remote adapter execute paths (claude, codex, cursor, gemini, pi) — the bridge starts before the agent process and the bridge URL is passed via environment variables - Updated `environment-execution-target.ts` to prefer the explicit API URL from environment lease metadata for sandbox callback routing - Fixed Claude sandbox runtime setup to work with the bridge configuration - Added comprehensive test coverage for bridge request handling, policy enforcement, and sandbox execution integration - Fixed browser bundling — the bridge module is excluded from the frontend bundle via the adapter-utils index export ## Verification - `pnpm test` — all existing and new tests pass, including bridge unit tests and sandbox execution integration tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run an agent task, verify the agent can post comments and update issue status through the bridge ## Risks - Medium. This is a new network-facing component (HTTP server on localhost). The security policy restricts forwarding to the configured API URL only and validates all requests, but any proxy introduces attack surface. The bridge binds to localhost only and is scoped to the lifetime of a single agent run. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:37:34 -07:00
let loggedEnv = buildInvocationEnvForLogs(env, {
runtimeEnv,
includeRuntimeKeys: ["HOME"],
resolvedCommand,
});
const extraArgs = (() => {
const fromExtraArgs = asStringArray(config.extraArgs);
if (fromExtraArgs.length > 0) return fromExtraArgs;
return asStringArray(config.args);
})();
const autoTrustEnabled = !hasCursorTrustBypassArg(extraArgs);
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
let restoreRemoteWorkspace: (() => Promise<void>) | null = null;
let localSkillsDir: string | null = null;
Add sandbox callback bridge for remote environment API access (#4801) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, which are isolated from the host network > - Sandboxed agents need to call back to the Paperclip API to report progress, post comments, and update issue status > - But sandbox environments cannot reach the Paperclip server directly because they run in isolated network namespaces > - This PR adds a callback bridge that proxies API requests from the sandbox to the Paperclip server, running as a local HTTP server on the host that forwards authenticated requests > - The bridge is started automatically when an adapter launches a sandbox execution, and torn down when the run completes > - The benefit is sandboxed agents can interact with the Paperclip API without requiring network-level access to the host, enabling E2B and similar providers to work end-to-end ## What Changed - Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a lightweight HTTP bridge server that accepts requests from sandbox environments and proxies them to the Paperclip API with authentication - Added request validation and security policy: the bridge only forwards requests to the configured API URL, validates content types, enforces size limits, and rejects non-API paths - Wired the bridge into all remote adapter execute paths (claude, codex, cursor, gemini, pi) — the bridge starts before the agent process and the bridge URL is passed via environment variables - Updated `environment-execution-target.ts` to prefer the explicit API URL from environment lease metadata for sandbox callback routing - Fixed Claude sandbox runtime setup to work with the bridge configuration - Added comprehensive test coverage for bridge request handling, policy enforcement, and sandbox execution integration - Fixed browser bundling — the bridge module is excluded from the frontend bundle via the adapter-utils index export ## Verification - `pnpm test` — all existing and new tests pass, including bridge unit tests and sandbox execution integration tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run an agent task, verify the agent can post comments and update issue status through the bridge ## Risks - Medium. This is a new network-facing component (HTTP server on localhost). The security policy restricts forwarding to the configured API URL only and validates all requests, but any proxy introduces attack surface. The bridge binds to localhost only and is scoped to the lifetime of a single agent run. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:37:34 -07:00
let remoteRuntimeRootDir: string | null = null;
let paperclipBridge: Awaited<ReturnType<typeof startAdapterExecutionTargetPaperclipBridge>> = null;
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
if (executionTargetIsRemote) {
try {
localSkillsDir = await buildCursorSkillsDir(config);
await onLog(
"stdout",
`[paperclip] Syncing workspace and Cursor runtime assets to ${describeAdapterExecutionTarget(executionTarget)}.\n`,
);
const preparedExecutionTargetRuntime = await prepareAdapterExecutionTargetRuntime({
Harden remote workspace sync and restore flows (#5444) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - When an agent runs against a remote target, Paperclip syncs the workspace out to the remote at run start and restores changes back to the local workspace at run end > - The previous restore flow naïvely overwrote local files with whatever the remote returned, so files that the remote run never touched but had timestamp/mode drift could be needlessly rewritten — and a single static `refs/paperclip/ssh-sync/imported` ref made concurrent SSH workspace exports race on the same git ref > - This pull request adds a `workspace-restore-merge` module that diffs a pre-run snapshot against the post-run remote state and only writes back files the remote actually changed; SSH workspace exports now use a per-import unique ref so concurrent runs can't trample each other > - Every adapter's execute path threads the snapshot through `prepareAdapterExecutionTargetRuntime` so the merge has the baseline it needs > - The benefit is workspace restores no longer churn untouched files, and concurrent SSH runs no longer collide on the import ref ## What Changed - `packages/adapter-utils/src/workspace-restore-merge.{ts,test.ts}`: new module — directory snapshot (kind/mode/sha256/symlink target) plus snapshot-aware merge that writes only the files the remote changed - `packages/adapter-utils/src/ssh.ts`: SSH workspace export uses a per-import unique ref (`refs/paperclip/ssh-sync/imported/<uuid>`); restore goes through the new merge helper; `ssh-fixture.test.ts` covers the unique-ref + merge paths - `packages/adapter-utils/src/sandbox-managed-runtime.ts` + `remote-managed-runtime.ts`: thread the snapshot/merge through the sandbox and SSH paths - `packages/adapter-utils/src/server-utils.{ts,test.ts}` + `execution-target.ts`: helpers for capturing the pre-run snapshot; `prepareAdapterExecutionTargetRuntime` gains required `runId` and optional `workspaceRemoteDir`, and returns the realized `workspaceRemoteDir` - Each adapter's `execute.ts` (acpx, claude, codex, cursor, gemini, opencode, pi) takes the snapshot at run start and passes it through to the runtime restore - Remote execute test mocks updated to match the new `prepareWorkspaceForSshExecution` return shape and the per-run `${managedRemoteWorkspace}` cwd subdirectory ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils --project @paperclipai/adapter-acpx-local --project @paperclipai/adapter-claude-local --project @paperclipai/adapter-codex-local --project @paperclipai/adapter-cursor-local --project @paperclipai/adapter-gemini-local --project @paperclipai/adapter-opencode-local --project @paperclipai/adapter-pi-local` — 196/196 passing - `pnpm typecheck` clean across the workspace ## Risks Medium. The restore path now writes a strict subset of what it previously did — files the remote did not touch are no longer rewritten. If any flow was relying on a touch-without-content-change being copied back (timestamp or permission propagation only), that behavior is now skipped. Snapshot capture adds an O(N-files-in-workspace) hash pass at run start; the cost is bounded by the existing exclude list. The `runId` parameter on `prepareAdapterExecutionTargetRuntime` is now required — every in-tree caller is updated; out-of-tree adapter authors need to pass it. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — new module + every adapter execute path covered - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-07 14:44:45 -07:00
runId,
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
target: executionTarget,
adapterKey: "cursor",
workspaceLocalDir: cwd,
Wire per-adapter sandbox install commands through test and execute paths (#5280) > **Stacked PR.** Sits on top of the e2b sandbox chain — #5278 (stdin staging) and #5279 (honest-resolvability + login-profiles). The cumulative diff against `master` includes both of those PRs' content; the files touched by *this* PR's commit are the new `maybeRunSandboxInstallCommand` helper in `packages/adapter-utils/src/execution-target.ts` and the per-adapter `index.ts`/`server/test.ts`/`server/execute.ts` wiring under `packages/adapters/{claude,codex,cursor,gemini,opencode,pi}-local/`. The honest resolvability check from #5279 is what gives this PR's install command a meaningful "did it actually land on PATH" follow-up. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Sandbox execution targets are ephemeral — each fresh lease starts from a template image that may or may not have the agent CLIs preinstalled > - When a CLI isn't preinstalled, the resolvability probe fails at `command -v` and the hello probe never runs > - There's no shared mechanism for "before you probe or provision, install the CLI on this sandbox" > - This pull request adds a `SANDBOX_INSTALL_COMMAND` constant per adapter and a `maybeRunSandboxInstallCommand` helper that runs it via the existing sandbox login shell, captures structured output, and never throws (so the resolvability + hello probe still run after); each adapter's `test()` and `execute()` share the constant so the two callsites can't drift > - The benefit is a fresh sandbox lease without a preinstalled CLI now installs it once via `sh -lc` before the resolvability probe and before managed-runtime provisioning, with a uniform `<adapter>_install_command_run` check on the test report ## What Changed - `packages/adapter-utils/src/execution-target.ts`: add `AdapterSandboxInstallCommandCheck` and `maybeRunSandboxInstallCommand` (runs the install via existing sandbox shell, captures exit/stdout/stderr, returns a structured info/warn check, never throws) - Add `SANDBOX_INSTALL_COMMAND` to each adapter's `index.ts` so `test()` and `execute()` share a single source of truth - Wire each of the 6 affected adapter `testEnvironment()`s to call `maybeRunSandboxInstallCommand` before `ensureAdapterExecutionTargetCommandResolvable` - Pass `installCommand: SANDBOX_INSTALL_COMMAND` through `prepareAdapterExecutionTargetRuntime` in each adapter's `execute()` - Per-adapter install commands use npm globals where possible so binaries land on a PATH segment the template already exports: - claude → `npm install -g @anthropic-ai/claude-code` - codex → `npm install -g @openai/codex` - cursor → `curl https://cursor.com/install -fsS | bash` - gemini → `npm install -g @google/gemini-cli` - opencode → `npm install -g opencode-ai` - pi → `npm install -g @mariozechner/pi-coding-agent` SSH and local targets ignore `installCommand` (SSH runtime takes no such param; local short-circuits before runtime prep), so this is a no-op for non-sandbox environments. ## Verification - `pnpm typecheck` clean - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils` and per-adapter projects pass - Manual sandbox matrix (claude, codex, cursor, gemini, opencode, pi) — each goes `install_command_run → resolvable → hello_probe_passed` (Codex and Pi land on `hello_probe_auth_required`, which is the configured-credentials problem, not an install issue) - SSH no-regression: SSH Claude still passes; the helper short-circuits on non-sandbox targets ## Risks Medium — adds a network/CPU cost (npm install / curl) on every fresh sandbox lease. Cost is bounded (one-time per lease, typically tens of seconds for npm globals), and the helper never throws so a failing install still lets the report run resolvability and hello probes. If a sandbox image already has the CLI, the install is an idempotent reinstall. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-05 08:29:28 -07:00
installCommand: SANDBOX_INSTALL_COMMAND,
detectCommand: command,
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
assets: [{
key: "skills",
localDir: localSkillsDir,
followSymlinks: true,
}],
});
restoreRemoteWorkspace = () => preparedExecutionTargetRuntime.restoreWorkspace();
Add sandbox callback bridge for remote environment API access (#4801) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, which are isolated from the host network > - Sandboxed agents need to call back to the Paperclip API to report progress, post comments, and update issue status > - But sandbox environments cannot reach the Paperclip server directly because they run in isolated network namespaces > - This PR adds a callback bridge that proxies API requests from the sandbox to the Paperclip server, running as a local HTTP server on the host that forwards authenticated requests > - The bridge is started automatically when an adapter launches a sandbox execution, and torn down when the run completes > - The benefit is sandboxed agents can interact with the Paperclip API without requiring network-level access to the host, enabling E2B and similar providers to work end-to-end ## What Changed - Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a lightweight HTTP bridge server that accepts requests from sandbox environments and proxies them to the Paperclip API with authentication - Added request validation and security policy: the bridge only forwards requests to the configured API URL, validates content types, enforces size limits, and rejects non-API paths - Wired the bridge into all remote adapter execute paths (claude, codex, cursor, gemini, pi) — the bridge starts before the agent process and the bridge URL is passed via environment variables - Updated `environment-execution-target.ts` to prefer the explicit API URL from environment lease metadata for sandbox callback routing - Fixed Claude sandbox runtime setup to work with the bridge configuration - Added comprehensive test coverage for bridge request handling, policy enforcement, and sandbox execution integration - Fixed browser bundling — the bridge module is excluded from the frontend bundle via the adapter-utils index export ## Verification - `pnpm test` — all existing and new tests pass, including bridge unit tests and sandbox execution integration tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run an agent task, verify the agent can post comments and update issue status through the bridge ## Risks - Medium. This is a new network-facing component (HTTP server on localhost). The security policy restricts forwarding to the configured API URL only and validates all requests, but any proxy introduces attack surface. The bridge binds to localhost only and is scoped to the lifetime of a single agent run. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:37:34 -07:00
remoteRuntimeRootDir = preparedExecutionTargetRuntime.runtimeRootDir;
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
const managedHome = adapterExecutionTargetUsesManagedHome(executionTarget);
if (managedHome && preparedExecutionTargetRuntime.runtimeRootDir) {
env.HOME = preparedExecutionTargetRuntime.runtimeRootDir;
}
const remoteHomeDir = managedHome && preparedExecutionTargetRuntime.runtimeRootDir
? preparedExecutionTargetRuntime.runtimeRootDir
: await readAdapterExecutionTargetHomeDir(runId, executionTarget, {
cwd,
env,
timeoutSec,
graceSec,
onLog,
});
if (remoteHomeDir && preparedExecutionTargetRuntime.assetDirs.skills) {
const remoteSkillsDir = path.posix.join(remoteHomeDir, ".cursor", "skills");
await runAdapterExecutionTargetShellCommand(
runId,
executionTarget,
`mkdir -p ${JSON.stringify(path.posix.dirname(remoteSkillsDir))} && rm -rf ${JSON.stringify(remoteSkillsDir)} && cp -a ${JSON.stringify(preparedExecutionTargetRuntime.assetDirs.skills)} ${JSON.stringify(remoteSkillsDir)}`,
{ cwd, env, timeoutSec, graceSec, onLog },
);
}
} catch (error) {
await Promise.allSettled([
restoreRemoteWorkspace?.(),
localSkillsDir ? fs.rm(localSkillsDir, { recursive: true, force: true }).catch(() => undefined) : Promise.resolve(),
]);
throw error;
}
}
Add sandbox callback bridge for remote environment API access (#4801) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, which are isolated from the host network > - Sandboxed agents need to call back to the Paperclip API to report progress, post comments, and update issue status > - But sandbox environments cannot reach the Paperclip server directly because they run in isolated network namespaces > - This PR adds a callback bridge that proxies API requests from the sandbox to the Paperclip server, running as a local HTTP server on the host that forwards authenticated requests > - The bridge is started automatically when an adapter launches a sandbox execution, and torn down when the run completes > - The benefit is sandboxed agents can interact with the Paperclip API without requiring network-level access to the host, enabling E2B and similar providers to work end-to-end ## What Changed - Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a lightweight HTTP bridge server that accepts requests from sandbox environments and proxies them to the Paperclip API with authentication - Added request validation and security policy: the bridge only forwards requests to the configured API URL, validates content types, enforces size limits, and rejects non-API paths - Wired the bridge into all remote adapter execute paths (claude, codex, cursor, gemini, pi) — the bridge starts before the agent process and the bridge URL is passed via environment variables - Updated `environment-execution-target.ts` to prefer the explicit API URL from environment lease metadata for sandbox callback routing - Fixed Claude sandbox runtime setup to work with the bridge configuration - Added comprehensive test coverage for bridge request handling, policy enforcement, and sandbox execution integration - Fixed browser bundling — the bridge module is excluded from the frontend bundle via the adapter-utils index export ## Verification - `pnpm test` — all existing and new tests pass, including bridge unit tests and sandbox execution integration tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run an agent task, verify the agent can post comments and update issue status through the bridge ## Risks - Medium. This is a new network-facing component (HTTP server on localhost). The security policy restricts forwarding to the configured API URL only and validates all requests, but any proxy introduces attack surface. The bridge binds to localhost only and is scoped to the lifetime of a single agent run. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:37:34 -07:00
if (executionTargetIsRemote && adapterExecutionTargetUsesPaperclipBridge(executionTarget)) {
paperclipBridge = await startAdapterExecutionTargetPaperclipBridge({
runId,
target: executionTarget,
runtimeRootDir: remoteRuntimeRootDir,
adapterKey: "cursor",
hostApiToken: env.PAPERCLIP_API_KEY,
onLog,
});
if (paperclipBridge) {
Object.assign(env, paperclipBridge.env);
loggedEnv = buildInvocationEnvForLogs(env, {
runtimeEnv: ensurePathInEnv({ ...process.env, ...env }),
includeRuntimeKeys: ["HOME"],
resolvedCommand,
});
}
}
const runtimeSessionParams = parseObject(runtime.sessionParams);
const runtimeSessionId = asString(runtimeSessionParams.sessionId, runtime.sessionId ?? "");
const runtimeSessionCwd = asString(runtimeSessionParams.cwd, "");
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
const runtimeRemoteExecution = parseObject(runtimeSessionParams.remoteExecution);
const canResumeSession =
runtimeSessionId.length > 0 &&
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
(runtimeSessionCwd.length === 0 || path.resolve(runtimeSessionCwd) === path.resolve(effectiveExecutionCwd)) &&
adapterExecutionTargetSessionMatches(runtimeRemoteExecution, executionTarget);
const sessionId = canResumeSession ? runtimeSessionId : null;
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
if (executionTargetIsRemote && runtimeSessionId && !canResumeSession) {
await onLog(
"stdout",
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
`[paperclip] Cursor session "${runtimeSessionId}" does not match the current remote execution identity and will not be resumed in "${effectiveExecutionCwd}". Starting a fresh remote session.\n`,
);
} else if (runtimeSessionId && !canResumeSession) {
await onLog(
"stdout",
`[paperclip] Cursor session "${runtimeSessionId}" was saved for cwd "${runtimeSessionCwd}" and will not be resumed in "${effectiveExecutionCwd}".\n`,
);
}
const instructionsFilePath = asString(config.instructionsFilePath, "").trim();
const instructionsDir = instructionsFilePath ? `${path.dirname(instructionsFilePath)}/` : "";
let instructionsPrefix = "";
let instructionsChars = 0;
if (instructionsFilePath) {
try {
const instructionsContents = await fs.readFile(instructionsFilePath, "utf8");
instructionsPrefix =
`${instructionsContents}\n\n` +
`The above agent instructions were loaded from ${instructionsFilePath}. ` +
`Resolve any relative file references from ${instructionsDir}.\n\n`;
instructionsChars = instructionsPrefix.length;
} catch (err) {
const reason = err instanceof Error ? err.message : String(err);
await onLog(
"stdout",
`[paperclip] Warning: could not read agent instructions file "${instructionsFilePath}": ${reason}\n`,
);
}
}
const commandNotes = (() => {
const notes: string[] = [];
if (autoTrustEnabled) {
notes.push("Auto-added --yolo to bypass interactive prompts.");
}
notes.push("Prompt is piped to Cursor via stdin.");
Add cursor sandbox support and fix SSH workspace sync (#4803) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, or on remote hosts via SSH > - The cursor adapter needs to resolve `cursor-agent` inside sandbox environments where it's installed in `~/.local/bin` > - But when using the default `agent` command on a sandbox target, the adapter didn't know to look in `~/.local/bin/cursor-agent`, causing "command not found" failures > - Additionally, repeated SSH runs failed because `git checkout` during workspace sync conflicted with leftover `.paperclip-runtime` files from previous runs > - This PR adds sandbox-aware command resolution for cursor and fixes the SSH workspace sync conflict > - The benefit is cursor works in E2B sandboxes out of the box, and repeated SSH runs don't fail on workspace sync ## What Changed - `cursor-local`: Added `prepareCursorSandboxCommand` — on sandbox targets, reads the remote `$HOME`, prepends `~/.local/bin` to PATH, and prefers `~/.local/bin/cursor-agent` when the default command is requested; tightened the sandbox command probe to validate the binary exists before launching; preserves explicit custom command overrides - `adapter-utils/ssh.ts`: Added `--force` to git checkout in SSH workspace sync to handle `.paperclip-runtime` untracked file conflicts from previous runs ## Verification - `pnpm test` — all existing and new tests pass, including cursor sandbox probe, sandbox execution, and custom command override tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run a cursor-local task, verify it resolves cursor-agent from the sandbox install path ## Risks - Low-medium. The `--force` flag on git checkout could discard uncommitted changes in the remote workspace, but the workspace is managed by Paperclip and should not contain user edits. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:12:06 -07:00
if (sandboxCommand.addedPathEntry) {
notes.push(`Remote sandbox runs prepend ${sandboxCommand.addedPathEntry} to PATH.`);
}
if (sandboxCommand.preferredCommandPath) {
notes.push(`Remote sandbox runs prefer ${sandboxCommand.preferredCommandPath} when using the default Cursor entrypoint.`);
}
if (!instructionsFilePath) return notes;
if (instructionsPrefix.length > 0) {
notes.push(
`Loaded agent instructions from ${instructionsFilePath}`,
`Prepended instructions + path directive to prompt (relative references from ${instructionsDir}).`,
);
return notes;
}
notes.push(
`Configured instructionsFilePath ${instructionsFilePath}, but file could not be read; continuing without injected instructions.`,
);
return notes;
})();
const bootstrapPromptTemplate = asString(config.bootstrapPromptTemplate, "");
const templateData = {
agentId: agent.id,
companyId: agent.companyId,
runId,
company: { id: agent.companyId },
agent,
run: { id: runId, source: "on_demand" },
context,
};
const renderedBootstrapPrompt =
!sessionId && bootstrapPromptTemplate.trim().length > 0
? renderTemplate(bootstrapPromptTemplate, templateData).trim()
: "";
2026-03-28 10:33:40 -05:00
const wakePrompt = renderPaperclipWakePrompt(context.paperclipWake, { resumedSession: Boolean(sessionId) });
const shouldUseResumeDeltaPrompt = Boolean(sessionId) && wakePrompt.length > 0;
const renderedPrompt = shouldUseResumeDeltaPrompt ? "" : renderTemplate(promptTemplate, templateData);
const sessionHandoffNote = asString(context.paperclipSessionHandoffMarkdown, "").trim();
const paperclipEnvNote = renderPaperclipEnvNote(env);
const prompt = joinPromptSections([
instructionsPrefix,
renderedBootstrapPrompt,
wakePrompt,
sessionHandoffNote,
paperclipEnvNote,
renderedPrompt,
]);
const promptMetrics = {
promptChars: prompt.length,
instructionsChars,
bootstrapPromptChars: renderedBootstrapPrompt.length,
wakePromptChars: wakePrompt.length,
sessionHandoffChars: sessionHandoffNote.length,
runtimeNoteChars: paperclipEnvNote.length,
heartbeatPromptChars: renderedPrompt.length,
};
const buildArgs = (resumeSessionId: string | null) => {
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
const args = ["-p", "--output-format", "stream-json", "--workspace", effectiveExecutionCwd];
if (resumeSessionId) args.push("--resume", resumeSessionId);
if (model) args.push("--model", model);
if (mode) args.push("--mode", mode);
if (autoTrustEnabled) args.push("--yolo");
if (extraArgs.length > 0) args.push(...extraArgs);
return args;
};
const runAttempt = async (resumeSessionId: string | null) => {
const args = buildArgs(resumeSessionId);
if (onMeta) {
await onMeta({
adapterType: "cursor",
command: resolvedCommand,
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
cwd: effectiveExecutionCwd,
commandNotes,
commandArgs: args,
env: loggedEnv,
prompt,
promptMetrics,
context,
});
}
let stdoutLineBuffer = "";
const emitNormalizedStdoutLine = async (rawLine: string) => {
const normalized = normalizeCursorStreamLine(rawLine);
if (!normalized.line) return;
await onLog(normalized.stream ?? "stdout", `${normalized.line}\n`);
};
const flushStdoutChunk = async (chunk: string, finalize = false) => {
const combined = `${stdoutLineBuffer}${chunk}`;
const lines = combined.split(/\r?\n/);
stdoutLineBuffer = lines.pop() ?? "";
for (const line of lines) {
await emitNormalizedStdoutLine(line);
}
if (finalize) {
const trailing = stdoutLineBuffer.trim();
stdoutLineBuffer = "";
if (trailing) {
await emitNormalizedStdoutLine(trailing);
}
}
};
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
const proc = await runAdapterExecutionTargetProcess(runId, executionTarget, command, args, {
cwd,
env,
timeoutSec,
graceSec,
stdin: prompt,
onSpawn,
onLog: async (stream, chunk) => {
if (stream !== "stdout") {
await onLog(stream, chunk);
return;
}
await flushStdoutChunk(chunk);
},
});
await flushStdoutChunk("", true);
return {
proc,
parsed: parseCursorJsonl(proc.stdout),
};
};
const providerFromModel = resolveProviderFromModel(model);
const toResult = (
attempt: {
proc: {
exitCode: number | null;
signal: string | null;
timedOut: boolean;
stdout: string;
stderr: string;
};
parsed: ReturnType<typeof parseCursorJsonl>;
},
clearSessionOnMissingSession = false,
): AdapterExecutionResult => {
if (attempt.proc.timedOut) {
return {
exitCode: attempt.proc.exitCode,
signal: attempt.proc.signal,
timedOut: true,
errorMessage: `Timed out after ${timeoutSec}s`,
clearSession: clearSessionOnMissingSession,
};
}
const resolvedSessionId = attempt.parsed.sessionId ?? runtimeSessionId ?? runtime.sessionId ?? null;
const resolvedSessionParams = resolvedSessionId
? ({
sessionId: resolvedSessionId,
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
cwd: effectiveExecutionCwd,
...(workspaceId ? { workspaceId } : {}),
...(workspaceRepoUrl ? { repoUrl: workspaceRepoUrl } : {}),
...(workspaceRepoRef ? { repoRef: workspaceRepoRef } : {}),
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
...(executionTargetIsRemote
? {
remoteExecution: adapterExecutionTargetSessionIdentity(executionTarget),
}
: {}),
} as Record<string, unknown>)
: null;
const parsedError = typeof attempt.parsed.errorMessage === "string" ? attempt.parsed.errorMessage.trim() : "";
const stderrLine = firstNonEmptyLine(attempt.proc.stderr);
const fallbackErrorMessage =
parsedError ||
stderrLine ||
`Cursor exited with code ${attempt.proc.exitCode ?? -1}`;
return {
exitCode: attempt.proc.exitCode,
signal: attempt.proc.signal,
timedOut: false,
errorMessage:
(attempt.proc.exitCode ?? 0) === 0
? null
: fallbackErrorMessage,
usage: attempt.parsed.usage,
sessionId: resolvedSessionId,
sessionParams: resolvedSessionParams,
sessionDisplayId: resolvedSessionId,
provider: providerFromModel,
biller: resolveCursorBiller(effectiveEnv, billingType, providerFromModel),
model,
billingType,
costUsd: attempt.parsed.costUsd,
resultJson: {
stdout: attempt.proc.stdout,
stderr: attempt.proc.stderr,
},
summary: attempt.parsed.summary,
clearSession: Boolean(clearSessionOnMissingSession && !resolvedSessionId),
};
};
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
try {
const initial = await runAttempt(sessionId);
if (
sessionId &&
!initial.proc.timedOut &&
(initial.proc.exitCode ?? 0) !== 0 &&
isCursorUnknownSessionError(initial.proc.stdout, initial.proc.stderr)
) {
await onLog(
"stdout",
`[paperclip] Cursor resume session "${sessionId}" is unavailable; retrying with a fresh session.\n`,
);
const retry = await runAttempt(null);
return toResult(retry, true);
}
return toResult(initial);
} finally {
Add sandbox callback bridge for remote environment API access (#4801) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, which are isolated from the host network > - Sandboxed agents need to call back to the Paperclip API to report progress, post comments, and update issue status > - But sandbox environments cannot reach the Paperclip server directly because they run in isolated network namespaces > - This PR adds a callback bridge that proxies API requests from the sandbox to the Paperclip server, running as a local HTTP server on the host that forwards authenticated requests > - The bridge is started automatically when an adapter launches a sandbox execution, and torn down when the run completes > - The benefit is sandboxed agents can interact with the Paperclip API without requiring network-level access to the host, enabling E2B and similar providers to work end-to-end ## What Changed - Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a lightweight HTTP bridge server that accepts requests from sandbox environments and proxies them to the Paperclip API with authentication - Added request validation and security policy: the bridge only forwards requests to the configured API URL, validates content types, enforces size limits, and rejects non-API paths - Wired the bridge into all remote adapter execute paths (claude, codex, cursor, gemini, pi) — the bridge starts before the agent process and the bridge URL is passed via environment variables - Updated `environment-execution-target.ts` to prefer the explicit API URL from environment lease metadata for sandbox callback routing - Fixed Claude sandbox runtime setup to work with the bridge configuration - Added comprehensive test coverage for bridge request handling, policy enforcement, and sandbox execution integration - Fixed browser bundling — the bridge module is excluded from the frontend bundle via the adapter-utils index export ## Verification - `pnpm test` — all existing and new tests pass, including bridge unit tests and sandbox execution integration tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run an agent task, verify the agent can post comments and update issue status through the bridge ## Risks - Medium. This is a new network-facing component (HTTP server on localhost). The security policy restricts forwarding to the configured API URL only and validates all requests, but any proxy introduces attack surface. The bridge binds to localhost only and is scoped to the lifetime of a single agent run. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:37:34 -07:00
if (paperclipBridge) {
await paperclipBridge.stop();
}
Add SSH environment support (#4358) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-23 19:15:22 -07:00
if (restoreRemoteWorkspace) {
await onLog(
"stdout",
`[paperclip] Restoring workspace changes from ${describeAdapterExecutionTarget(executionTarget)}.\n`,
);
await restoreRemoteWorkspace();
}
if (localSkillsDir) {
await fs.rm(localSkillsDir, { recursive: true, force: true }).catch(() => undefined);
}
}
}