Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
import path from "node:path";
|
|
|
|
|
import type { SshRemoteExecutionSpec } from "./ssh.js";
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
import {
|
|
|
|
|
prepareCommandManagedRuntime,
|
|
|
|
|
type CommandManagedRuntimeRunner,
|
|
|
|
|
} from "./command-managed-runtime.js";
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
import {
|
|
|
|
|
buildRemoteExecutionSessionIdentity,
|
|
|
|
|
prepareRemoteManagedRuntime,
|
|
|
|
|
remoteExecutionSessionMatches,
|
|
|
|
|
type RemoteManagedRuntimeAsset,
|
|
|
|
|
} from "./remote-managed-runtime.js";
|
Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end
## What Changed
- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export
## Verification
- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge
## Risks
- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.
## Model Used
Codex GPT 5.4 high via Paperclip.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
|
|
|
import {
|
|
|
|
|
createCommandManagedSandboxCallbackBridgeQueueClient,
|
|
|
|
|
createSandboxCallbackBridgeAsset,
|
|
|
|
|
createSandboxCallbackBridgeToken,
|
|
|
|
|
DEFAULT_SANDBOX_CALLBACK_BRIDGE_MAX_BODY_BYTES,
|
|
|
|
|
startSandboxCallbackBridgeServer,
|
|
|
|
|
startSandboxCallbackBridgeWorker,
|
|
|
|
|
} from "./sandbox-callback-bridge.js";
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
import { parseSshRemoteExecutionSpec, runSshCommand, shellQuote } from "./ssh.js";
|
|
|
|
|
import {
|
|
|
|
|
ensureCommandResolvable,
|
|
|
|
|
resolveCommandForLogs,
|
|
|
|
|
runChildProcess,
|
|
|
|
|
type RunProcessResult,
|
|
|
|
|
type TerminalResultCleanupOptions,
|
|
|
|
|
} from "./server-utils.js";
|
Let sandbox providers declare shell defaults (#5114)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents execute in sandboxed remote environments served by pluggable
sandbox
> providers (E2B today, more later)
> - Today every sandbox command runs under `sh -lc` regardless of what
the
> provider's container actually ships
> - That misses bash-only shell init on E2B (which ships bash) and
prevents
> future providers from declaring a different default — there's no way
for a
> provider to say "I have bash, use it"
> - This PR adds a `shellCommand` field to sandbox execution targets so
providers
> can declare their preferred shell ("bash" for E2B), threads it through
the
> sandbox-managed-runtime client, callback bridge, and execution-target
shell
> helper, and validates the value at the lease-metadata boundary
> - The benefit is that sandbox commands run under the right shell on
the right
> provider, and adding new sandbox providers only needs to declare a
shell
> preference
## What Changed
- Added `packages/adapter-utils/src/sandbox-shell.ts` exporting
`preferredShellForSandbox(shellCommand)` (returns `"bash"` if input is
`"bash"`,
else `"sh"`)
- Added `shellCommand?: "bash" | "sh" | null` to
`AdapterSandboxExecutionTarget`
and `CommandManagedRuntimeSpec`; threaded it through
`runAdapterExecutionTargetShellCommand`,
`prepareAdapterExecutionTargetRuntime`,
and `startAdapterExecutionTargetPaperclipBridge`
- `createCommandManagedRuntimeClient`, `prepareCommandManagedRuntime`,
and
`createCommandManagedSandboxCallbackBridgeQueueClient` now take an
optional
`shellCommand` and use `preferredShellForSandbox` to pick the shell
- `startSandboxCallbackBridgeServer` accepts a `shellCommand` for its
server
startup, readiness probe, and stop hook
- E2B sandbox plugin declares `shellCommand: "bash"` in `leaseMetadata`
- `resolveEnvironmentExecutionTarget` reads `shellCommand` from lease
metadata
(validating against `"bash" | "sh" | null`)
- `environment-runtime.ts` adds `"shellCommand"` to
`INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS`
so the field round-trips through internal plugin config without leaking
to
external plugin metadata
- Updated tests in `command-managed-runtime.test.ts`,
`execution-target-sandbox.test.ts`, `sandbox-callback-bridge.test.ts`,
`environment-execution-target.test.ts`
## Verification
- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test --
environment-execution-target`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: boot a Paperclip instance, create an E2B-backed
environment, run a
claude_local agent against it, and confirm the run completes (verifies
bash
shell semantics flow through the callback bridge end-to-end)
## Risks
- E2B sandbox commands now run under `bash -lc` instead of `sh -lc`.
Bash is a
strict superset for the commands we issue (no busybox-only flags in our
shell
scripts), so risk is low. The shellCommand field is opt-in via lease
metadata —
providers that don't declare it stay on `sh`.
- New optional field on `CommandManagedRuntimeSpec` and
`AdapterSandboxExecutionTarget`.
Consumers ignoring the field retain previous behaviour (sh).
- Lease metadata now carries an additional field. Existing leases
without
`shellCommand` resolve to `null` and fall back to sh — backwards
compatible.
## Model Used
- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (no UI changes)
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 12:19:35 -07:00
|
|
|
import { preferredShellForSandbox } from "./sandbox-shell.js";
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
|
|
|
|
|
export interface AdapterLocalExecutionTarget {
|
|
|
|
|
kind: "local";
|
|
|
|
|
environmentId?: string | null;
|
|
|
|
|
leaseId?: string | null;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
export interface AdapterSshExecutionTarget {
|
|
|
|
|
kind: "remote";
|
|
|
|
|
transport: "ssh";
|
|
|
|
|
environmentId?: string | null;
|
|
|
|
|
leaseId?: string | null;
|
|
|
|
|
remoteCwd: string;
|
|
|
|
|
paperclipApiUrl?: string | null;
|
|
|
|
|
spec: SshRemoteExecutionSpec;
|
|
|
|
|
}
|
|
|
|
|
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
export interface AdapterSandboxExecutionTarget {
|
|
|
|
|
kind: "remote";
|
|
|
|
|
transport: "sandbox";
|
|
|
|
|
providerKey?: string | null;
|
Let sandbox providers declare shell defaults (#5114)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents execute in sandboxed remote environments served by pluggable
sandbox
> providers (E2B today, more later)
> - Today every sandbox command runs under `sh -lc` regardless of what
the
> provider's container actually ships
> - That misses bash-only shell init on E2B (which ships bash) and
prevents
> future providers from declaring a different default — there's no way
for a
> provider to say "I have bash, use it"
> - This PR adds a `shellCommand` field to sandbox execution targets so
providers
> can declare their preferred shell ("bash" for E2B), threads it through
the
> sandbox-managed-runtime client, callback bridge, and execution-target
shell
> helper, and validates the value at the lease-metadata boundary
> - The benefit is that sandbox commands run under the right shell on
the right
> provider, and adding new sandbox providers only needs to declare a
shell
> preference
## What Changed
- Added `packages/adapter-utils/src/sandbox-shell.ts` exporting
`preferredShellForSandbox(shellCommand)` (returns `"bash"` if input is
`"bash"`,
else `"sh"`)
- Added `shellCommand?: "bash" | "sh" | null` to
`AdapterSandboxExecutionTarget`
and `CommandManagedRuntimeSpec`; threaded it through
`runAdapterExecutionTargetShellCommand`,
`prepareAdapterExecutionTargetRuntime`,
and `startAdapterExecutionTargetPaperclipBridge`
- `createCommandManagedRuntimeClient`, `prepareCommandManagedRuntime`,
and
`createCommandManagedSandboxCallbackBridgeQueueClient` now take an
optional
`shellCommand` and use `preferredShellForSandbox` to pick the shell
- `startSandboxCallbackBridgeServer` accepts a `shellCommand` for its
server
startup, readiness probe, and stop hook
- E2B sandbox plugin declares `shellCommand: "bash"` in `leaseMetadata`
- `resolveEnvironmentExecutionTarget` reads `shellCommand` from lease
metadata
(validating against `"bash" | "sh" | null`)
- `environment-runtime.ts` adds `"shellCommand"` to
`INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS`
so the field round-trips through internal plugin config without leaking
to
external plugin metadata
- Updated tests in `command-managed-runtime.test.ts`,
`execution-target-sandbox.test.ts`, `sandbox-callback-bridge.test.ts`,
`environment-execution-target.test.ts`
## Verification
- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test --
environment-execution-target`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: boot a Paperclip instance, create an E2B-backed
environment, run a
claude_local agent against it, and confirm the run completes (verifies
bash
shell semantics flow through the callback bridge end-to-end)
## Risks
- E2B sandbox commands now run under `bash -lc` instead of `sh -lc`.
Bash is a
strict superset for the commands we issue (no busybox-only flags in our
shell
scripts), so risk is low. The shellCommand field is opt-in via lease
metadata —
providers that don't declare it stay on `sh`.
- New optional field on `CommandManagedRuntimeSpec` and
`AdapterSandboxExecutionTarget`.
Consumers ignoring the field retain previous behaviour (sh).
- Lease metadata now carries an additional field. Existing leases
without
`shellCommand` resolve to `null` and fall back to sh — backwards
compatible.
## Model Used
- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (no UI changes)
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 12:19:35 -07:00
|
|
|
shellCommand?: "bash" | "sh" | null;
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
environmentId?: string | null;
|
|
|
|
|
leaseId?: string | null;
|
|
|
|
|
remoteCwd: string;
|
|
|
|
|
paperclipApiUrl?: string | null;
|
Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end
## What Changed
- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export
## Verification
- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge
## Risks
- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.
## Model Used
Codex GPT 5.4 high via Paperclip.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
|
|
|
paperclipTransport?: "direct" | "bridge";
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
timeoutMs?: number | null;
|
|
|
|
|
runner?: CommandManagedRuntimeRunner;
|
|
|
|
|
}
|
|
|
|
|
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
export type AdapterExecutionTarget =
|
|
|
|
|
| AdapterLocalExecutionTarget
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
| AdapterSshExecutionTarget
|
|
|
|
|
| AdapterSandboxExecutionTarget;
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
|
|
|
|
|
export type AdapterRemoteExecutionSpec = SshRemoteExecutionSpec;
|
|
|
|
|
|
|
|
|
|
export type AdapterManagedRuntimeAsset = RemoteManagedRuntimeAsset;
|
|
|
|
|
|
|
|
|
|
export interface PreparedAdapterExecutionTargetRuntime {
|
|
|
|
|
target: AdapterExecutionTarget;
|
|
|
|
|
runtimeRootDir: string | null;
|
|
|
|
|
assetDirs: Record<string, string>;
|
|
|
|
|
restoreWorkspace(): Promise<void>;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
export interface AdapterExecutionTargetProcessOptions {
|
|
|
|
|
cwd: string;
|
|
|
|
|
env: Record<string, string>;
|
|
|
|
|
stdin?: string;
|
|
|
|
|
timeoutSec: number;
|
|
|
|
|
graceSec: number;
|
|
|
|
|
onLog: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
|
|
|
|
|
onSpawn?: (meta: { pid: number; processGroupId: number | null; startedAt: string }) => Promise<void>;
|
|
|
|
|
terminalResultCleanup?: TerminalResultCleanupOptions;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
export interface AdapterExecutionTargetShellOptions {
|
|
|
|
|
cwd: string;
|
|
|
|
|
env: Record<string, string>;
|
|
|
|
|
timeoutSec?: number;
|
|
|
|
|
graceSec?: number;
|
|
|
|
|
onLog?: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
|
|
|
|
|
}
|
|
|
|
|
|
Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end
## What Changed
- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export
## Verification
- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge
## Risks
- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.
## Model Used
Codex GPT 5.4 high via Paperclip.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
|
|
|
export interface AdapterExecutionTargetPaperclipBridgeHandle {
|
|
|
|
|
env: Record<string, string>;
|
|
|
|
|
stop(): Promise<void>;
|
|
|
|
|
}
|
|
|
|
|
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
function parseObject(value: unknown): Record<string, unknown> {
|
|
|
|
|
return value && typeof value === "object" && !Array.isArray(value)
|
|
|
|
|
? (value as Record<string, unknown>)
|
|
|
|
|
: {};
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
function readString(value: unknown): string | null {
|
|
|
|
|
return typeof value === "string" && value.trim().length > 0 ? value.trim() : null;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
function readStringMeta(parsed: Record<string, unknown>, key: string): string | null {
|
|
|
|
|
return readString(parsed[key]);
|
|
|
|
|
}
|
|
|
|
|
|
Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end
## What Changed
- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export
## Verification
- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge
## Risks
- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.
## Model Used
Codex GPT 5.4 high via Paperclip.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
|
|
|
function resolveHostForUrl(rawHost: string): string {
|
|
|
|
|
const host = rawHost.trim();
|
|
|
|
|
if (!host || host === "0.0.0.0" || host === "::") return "localhost";
|
|
|
|
|
if (host.includes(":") && !host.startsWith("[") && !host.endsWith("]")) return `[${host}]`;
|
|
|
|
|
return host;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
function resolveDefaultPaperclipApiUrl(): string {
|
|
|
|
|
const runtimeHost = resolveHostForUrl(
|
|
|
|
|
process.env.PAPERCLIP_LISTEN_HOST ?? process.env.HOST ?? "localhost",
|
|
|
|
|
);
|
|
|
|
|
// 3100 matches the default Paperclip dev server port when the runtime does not provide one.
|
|
|
|
|
const runtimePort = process.env.PAPERCLIP_LISTEN_PORT ?? process.env.PORT ?? "3100";
|
|
|
|
|
return `http://${runtimeHost}:${runtimePort}`;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
function resolveSandboxPaperclipTransport(
|
|
|
|
|
target: Pick<AdapterSandboxExecutionTarget, "paperclipTransport" | "paperclipApiUrl">,
|
|
|
|
|
): "direct" | "bridge" {
|
|
|
|
|
if (target.paperclipTransport === "direct" || target.paperclipTransport === "bridge") {
|
|
|
|
|
return target.paperclipTransport;
|
|
|
|
|
}
|
|
|
|
|
return target.paperclipApiUrl ? "direct" : "bridge";
|
|
|
|
|
}
|
|
|
|
|
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
function isAdapterExecutionTargetInstance(value: unknown): value is AdapterExecutionTarget {
|
|
|
|
|
const parsed = parseObject(value);
|
|
|
|
|
if (parsed.kind === "local") return true;
|
|
|
|
|
if (parsed.kind !== "remote") return false;
|
|
|
|
|
if (parsed.transport === "ssh") return parseSshRemoteExecutionSpec(parseObject(parsed.spec)) !== null;
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
if (parsed.transport !== "sandbox") return false;
|
|
|
|
|
return readStringMeta(parsed, "remoteCwd") !== null;
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
export function adapterExecutionTargetToRemoteSpec(
|
|
|
|
|
target: AdapterExecutionTarget | null | undefined,
|
|
|
|
|
): AdapterRemoteExecutionSpec | null {
|
|
|
|
|
return target?.kind === "remote" && target.transport === "ssh" ? target.spec : null;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
export function adapterExecutionTargetIsRemote(
|
|
|
|
|
target: AdapterExecutionTarget | null | undefined,
|
|
|
|
|
): boolean {
|
|
|
|
|
return target?.kind === "remote";
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
export function adapterExecutionTargetUsesManagedHome(
|
|
|
|
|
target: AdapterExecutionTarget | null | undefined,
|
|
|
|
|
): boolean {
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
return target?.kind === "remote" && target.transport === "sandbox";
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
export function adapterExecutionTargetRemoteCwd(
|
|
|
|
|
target: AdapterExecutionTarget | null | undefined,
|
|
|
|
|
localCwd: string,
|
|
|
|
|
): string {
|
|
|
|
|
return target?.kind === "remote" ? target.remoteCwd : localCwd;
|
|
|
|
|
}
|
|
|
|
|
|
Add dedicated environment settings page and test-in-environment (#4798)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run inside environments (local, SSH, E2B sandbox)
> - Operators need to configure and manage these environments
> - But environment settings were buried inside the general company
settings page, making them hard to find
> - Additionally, when testing an agent from the configuration form, the
test always ran locally regardless of which environment was selected
> - This PR moves environments into a dedicated top-level company
settings section and wires the "Test Environment" button to run inside
the selected environment
> - The benefit is operators can find and manage environments more
easily, and the test button now validates the actual environment the
agent will use
## What Changed
- Added a dedicated `CompanyEnvironments` settings page with its own
route and sidebar entry
- Updated `CompanySettingsSidebar` and `CompanySettingsNav` to include
the new environments section
- Modified the agent test route (`POST /agents/:id/test`) to accept an
optional `environmentId` parameter
- Updated all adapter `test.ts` handlers to resolve and use the
specified execution target environment
- Added `resolveTestExecutionTarget` to `execution-target.ts` for remote
environment test resolution with cwd fallback
- Moved the "Test Environment" button and its feedback display into the
`NewAgent` page footer for better UX flow
## Verification
- `pnpm test` — all existing and new tests pass
- `pnpm typecheck` — clean
- Manual: navigate to Company Settings, confirm "Environments" appears
as a top-level section
- Manual: configure an agent with a non-local environment, click "Test
Environment", confirm the test runs inside that environment
## Risks
- Low risk. UI-only routing change for the settings page. The
test-in-environment change adds an optional parameter with a local
fallback, so existing behavior is preserved when no environment is
specified.
## Model Used
Codex GPT 5.4 high via Paperclip.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 15:56:13 -07:00
|
|
|
export function resolveAdapterExecutionTargetCwd(
|
|
|
|
|
target: AdapterExecutionTarget | null | undefined,
|
|
|
|
|
configuredCwd: string | null | undefined,
|
|
|
|
|
localFallbackCwd: string,
|
|
|
|
|
): string {
|
|
|
|
|
if (typeof configuredCwd === "string" && configuredCwd.trim().length > 0) {
|
|
|
|
|
return configuredCwd;
|
|
|
|
|
}
|
|
|
|
|
return adapterExecutionTargetRemoteCwd(target, localFallbackCwd);
|
|
|
|
|
}
|
|
|
|
|
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
export function adapterExecutionTargetPaperclipApiUrl(
|
|
|
|
|
target: AdapterExecutionTarget | null | undefined,
|
|
|
|
|
): string | null {
|
|
|
|
|
if (target?.kind !== "remote") return null;
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
if (target.transport === "ssh") return target.paperclipApiUrl ?? target.spec.paperclipApiUrl ?? null;
|
Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end
## What Changed
- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export
## Verification
- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge
## Risks
- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.
## Model Used
Codex GPT 5.4 high via Paperclip.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
|
|
|
if (resolveSandboxPaperclipTransport(target) === "bridge") return null;
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
return target.paperclipApiUrl ?? null;
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
}
|
|
|
|
|
|
Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end
## What Changed
- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export
## Verification
- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge
## Risks
- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.
## Model Used
Codex GPT 5.4 high via Paperclip.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
|
|
|
export function adapterExecutionTargetUsesPaperclipBridge(
|
|
|
|
|
target: AdapterExecutionTarget | null | undefined,
|
|
|
|
|
): boolean {
|
|
|
|
|
return target?.kind === "remote" &&
|
|
|
|
|
target.transport === "sandbox" &&
|
|
|
|
|
resolveSandboxPaperclipTransport(target) === "bridge";
|
|
|
|
|
}
|
|
|
|
|
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
export function describeAdapterExecutionTarget(
|
|
|
|
|
target: AdapterExecutionTarget | null | undefined,
|
|
|
|
|
): string {
|
|
|
|
|
if (!target || target.kind === "local") return "local environment";
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
if (target.transport === "ssh") {
|
|
|
|
|
return `SSH environment ${target.spec.username}@${target.spec.host}:${target.spec.port}`;
|
|
|
|
|
}
|
|
|
|
|
return `sandbox environment${target.providerKey ? ` (${target.providerKey})` : ""}`;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
function requireSandboxRunner(target: AdapterSandboxExecutionTarget): CommandManagedRuntimeRunner {
|
|
|
|
|
if (target.runner) return target.runner;
|
|
|
|
|
throw new Error(
|
|
|
|
|
"Sandbox execution target is missing its provider runtime runner. Sandbox commands must execute through the environment runtime.",
|
|
|
|
|
);
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
}
|
|
|
|
|
|
Let sandbox providers declare shell defaults (#5114)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents execute in sandboxed remote environments served by pluggable
sandbox
> providers (E2B today, more later)
> - Today every sandbox command runs under `sh -lc` regardless of what
the
> provider's container actually ships
> - That misses bash-only shell init on E2B (which ships bash) and
prevents
> future providers from declaring a different default — there's no way
for a
> provider to say "I have bash, use it"
> - This PR adds a `shellCommand` field to sandbox execution targets so
providers
> can declare their preferred shell ("bash" for E2B), threads it through
the
> sandbox-managed-runtime client, callback bridge, and execution-target
shell
> helper, and validates the value at the lease-metadata boundary
> - The benefit is that sandbox commands run under the right shell on
the right
> provider, and adding new sandbox providers only needs to declare a
shell
> preference
## What Changed
- Added `packages/adapter-utils/src/sandbox-shell.ts` exporting
`preferredShellForSandbox(shellCommand)` (returns `"bash"` if input is
`"bash"`,
else `"sh"`)
- Added `shellCommand?: "bash" | "sh" | null` to
`AdapterSandboxExecutionTarget`
and `CommandManagedRuntimeSpec`; threaded it through
`runAdapterExecutionTargetShellCommand`,
`prepareAdapterExecutionTargetRuntime`,
and `startAdapterExecutionTargetPaperclipBridge`
- `createCommandManagedRuntimeClient`, `prepareCommandManagedRuntime`,
and
`createCommandManagedSandboxCallbackBridgeQueueClient` now take an
optional
`shellCommand` and use `preferredShellForSandbox` to pick the shell
- `startSandboxCallbackBridgeServer` accepts a `shellCommand` for its
server
startup, readiness probe, and stop hook
- E2B sandbox plugin declares `shellCommand: "bash"` in `leaseMetadata`
- `resolveEnvironmentExecutionTarget` reads `shellCommand` from lease
metadata
(validating against `"bash" | "sh" | null`)
- `environment-runtime.ts` adds `"shellCommand"` to
`INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS`
so the field round-trips through internal plugin config without leaking
to
external plugin metadata
- Updated tests in `command-managed-runtime.test.ts`,
`execution-target-sandbox.test.ts`, `sandbox-callback-bridge.test.ts`,
`environment-execution-target.test.ts`
## Verification
- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test --
environment-execution-target`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: boot a Paperclip instance, create an E2B-backed
environment, run a
claude_local agent against it, and confirm the run completes (verifies
bash
shell semantics flow through the callback bridge end-to-end)
## Risks
- E2B sandbox commands now run under `bash -lc` instead of `sh -lc`.
Bash is a
strict superset for the commands we issue (no busybox-only flags in our
shell
scripts), so risk is low. The shellCommand field is opt-in via lease
metadata —
providers that don't declare it stay on `sh`.
- New optional field on `CommandManagedRuntimeSpec` and
`AdapterSandboxExecutionTarget`.
Consumers ignoring the field retain previous behaviour (sh).
- Lease metadata now carries an additional field. Existing leases
without
`shellCommand` resolve to `null` and fall back to sh — backwards
compatible.
## Model Used
- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (no UI changes)
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 12:19:35 -07:00
|
|
|
function preferredSandboxShell(target: AdapterSandboxExecutionTarget): "bash" | "sh" {
|
|
|
|
|
return preferredShellForSandbox(target.shellCommand);
|
|
|
|
|
}
|
|
|
|
|
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
export async function ensureAdapterExecutionTargetCommandResolvable(
|
|
|
|
|
command: string,
|
|
|
|
|
target: AdapterExecutionTarget | null | undefined,
|
|
|
|
|
cwd: string,
|
|
|
|
|
env: NodeJS.ProcessEnv,
|
|
|
|
|
) {
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
if (target?.kind === "remote" && target.transport === "sandbox") {
|
|
|
|
|
return;
|
|
|
|
|
}
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
await ensureCommandResolvable(command, cwd, env, {
|
|
|
|
|
remoteExecution: adapterExecutionTargetToRemoteSpec(target),
|
|
|
|
|
});
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
export async function resolveAdapterExecutionTargetCommandForLogs(
|
|
|
|
|
command: string,
|
|
|
|
|
target: AdapterExecutionTarget | null | undefined,
|
|
|
|
|
cwd: string,
|
|
|
|
|
env: NodeJS.ProcessEnv,
|
|
|
|
|
): Promise<string> {
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
if (target?.kind === "remote" && target.transport === "sandbox") {
|
|
|
|
|
return `sandbox://${target.providerKey ?? "provider"}/${target.leaseId ?? "lease"}/${target.remoteCwd} :: ${command}`;
|
|
|
|
|
}
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
return await resolveCommandForLogs(command, cwd, env, {
|
|
|
|
|
remoteExecution: adapterExecutionTargetToRemoteSpec(target),
|
|
|
|
|
});
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
export async function runAdapterExecutionTargetProcess(
|
|
|
|
|
runId: string,
|
|
|
|
|
target: AdapterExecutionTarget | null | undefined,
|
|
|
|
|
command: string,
|
|
|
|
|
args: string[],
|
|
|
|
|
options: AdapterExecutionTargetProcessOptions,
|
|
|
|
|
): Promise<RunProcessResult> {
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
if (target?.kind === "remote" && target.transport === "sandbox") {
|
|
|
|
|
const runner = requireSandboxRunner(target);
|
|
|
|
|
return await runner.execute({
|
|
|
|
|
command,
|
|
|
|
|
args,
|
|
|
|
|
cwd: target.remoteCwd,
|
|
|
|
|
env: options.env,
|
|
|
|
|
stdin: options.stdin,
|
|
|
|
|
timeoutMs: options.timeoutSec > 0 ? options.timeoutSec * 1000 : target.timeoutMs ?? undefined,
|
|
|
|
|
onLog: options.onLog,
|
|
|
|
|
onSpawn: options.onSpawn
|
|
|
|
|
? async (meta) => options.onSpawn?.({ ...meta, processGroupId: null })
|
|
|
|
|
: undefined,
|
|
|
|
|
});
|
|
|
|
|
}
|
|
|
|
|
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
return await runChildProcess(runId, command, args, {
|
|
|
|
|
cwd: options.cwd,
|
|
|
|
|
env: options.env,
|
|
|
|
|
stdin: options.stdin,
|
|
|
|
|
timeoutSec: options.timeoutSec,
|
|
|
|
|
graceSec: options.graceSec,
|
|
|
|
|
onLog: options.onLog,
|
|
|
|
|
onSpawn: options.onSpawn,
|
|
|
|
|
terminalResultCleanup: options.terminalResultCleanup,
|
|
|
|
|
remoteExecution: adapterExecutionTargetToRemoteSpec(target),
|
|
|
|
|
});
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
export async function runAdapterExecutionTargetShellCommand(
|
|
|
|
|
runId: string,
|
|
|
|
|
target: AdapterExecutionTarget | null | undefined,
|
|
|
|
|
command: string,
|
|
|
|
|
options: AdapterExecutionTargetShellOptions,
|
|
|
|
|
): Promise<RunProcessResult> {
|
|
|
|
|
const onLog = options.onLog ?? (async () => {});
|
|
|
|
|
if (target?.kind === "remote") {
|
|
|
|
|
const startedAt = new Date().toISOString();
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
if (target.transport === "ssh") {
|
|
|
|
|
try {
|
|
|
|
|
const result = await runSshCommand(target.spec, `sh -lc ${shellQuote(command)}`, {
|
|
|
|
|
timeoutMs: (options.timeoutSec ?? 15) * 1000,
|
|
|
|
|
});
|
|
|
|
|
if (result.stdout) await onLog("stdout", result.stdout);
|
|
|
|
|
if (result.stderr) await onLog("stderr", result.stderr);
|
|
|
|
|
return {
|
|
|
|
|
exitCode: 0,
|
|
|
|
|
signal: null,
|
|
|
|
|
timedOut: false,
|
|
|
|
|
stdout: result.stdout,
|
|
|
|
|
stderr: result.stderr,
|
|
|
|
|
pid: null,
|
|
|
|
|
startedAt,
|
|
|
|
|
};
|
|
|
|
|
} catch (error) {
|
|
|
|
|
const timedOutError = error as NodeJS.ErrnoException & {
|
|
|
|
|
stdout?: string;
|
|
|
|
|
stderr?: string;
|
|
|
|
|
signal?: string | null;
|
|
|
|
|
};
|
|
|
|
|
const stdout = timedOutError.stdout ?? "";
|
|
|
|
|
const stderr = timedOutError.stderr ?? "";
|
|
|
|
|
if (typeof timedOutError.code === "number") {
|
|
|
|
|
if (stdout) await onLog("stdout", stdout);
|
|
|
|
|
if (stderr) await onLog("stderr", stderr);
|
|
|
|
|
return {
|
|
|
|
|
exitCode: timedOutError.code,
|
|
|
|
|
signal: timedOutError.signal ?? null,
|
|
|
|
|
timedOut: false,
|
|
|
|
|
stdout,
|
|
|
|
|
stderr,
|
|
|
|
|
pid: null,
|
|
|
|
|
startedAt,
|
|
|
|
|
};
|
|
|
|
|
}
|
|
|
|
|
if (timedOutError.code !== "ETIMEDOUT") {
|
|
|
|
|
throw error;
|
|
|
|
|
}
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
if (stdout) await onLog("stdout", stdout);
|
|
|
|
|
if (stderr) await onLog("stderr", stderr);
|
|
|
|
|
return {
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
exitCode: null,
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
signal: timedOutError.signal ?? null,
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
timedOut: true,
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
stdout,
|
|
|
|
|
stderr,
|
|
|
|
|
pid: null,
|
|
|
|
|
startedAt,
|
|
|
|
|
};
|
|
|
|
|
}
|
|
|
|
|
}
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
|
Let sandbox providers declare shell defaults (#5114)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents execute in sandboxed remote environments served by pluggable
sandbox
> providers (E2B today, more later)
> - Today every sandbox command runs under `sh -lc` regardless of what
the
> provider's container actually ships
> - That misses bash-only shell init on E2B (which ships bash) and
prevents
> future providers from declaring a different default — there's no way
for a
> provider to say "I have bash, use it"
> - This PR adds a `shellCommand` field to sandbox execution targets so
providers
> can declare their preferred shell ("bash" for E2B), threads it through
the
> sandbox-managed-runtime client, callback bridge, and execution-target
shell
> helper, and validates the value at the lease-metadata boundary
> - The benefit is that sandbox commands run under the right shell on
the right
> provider, and adding new sandbox providers only needs to declare a
shell
> preference
## What Changed
- Added `packages/adapter-utils/src/sandbox-shell.ts` exporting
`preferredShellForSandbox(shellCommand)` (returns `"bash"` if input is
`"bash"`,
else `"sh"`)
- Added `shellCommand?: "bash" | "sh" | null` to
`AdapterSandboxExecutionTarget`
and `CommandManagedRuntimeSpec`; threaded it through
`runAdapterExecutionTargetShellCommand`,
`prepareAdapterExecutionTargetRuntime`,
and `startAdapterExecutionTargetPaperclipBridge`
- `createCommandManagedRuntimeClient`, `prepareCommandManagedRuntime`,
and
`createCommandManagedSandboxCallbackBridgeQueueClient` now take an
optional
`shellCommand` and use `preferredShellForSandbox` to pick the shell
- `startSandboxCallbackBridgeServer` accepts a `shellCommand` for its
server
startup, readiness probe, and stop hook
- E2B sandbox plugin declares `shellCommand: "bash"` in `leaseMetadata`
- `resolveEnvironmentExecutionTarget` reads `shellCommand` from lease
metadata
(validating against `"bash" | "sh" | null`)
- `environment-runtime.ts` adds `"shellCommand"` to
`INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS`
so the field round-trips through internal plugin config without leaking
to
external plugin metadata
- Updated tests in `command-managed-runtime.test.ts`,
`execution-target-sandbox.test.ts`, `sandbox-callback-bridge.test.ts`,
`environment-execution-target.test.ts`
## Verification
- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test --
environment-execution-target`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: boot a Paperclip instance, create an E2B-backed
environment, run a
claude_local agent against it, and confirm the run completes (verifies
bash
shell semantics flow through the callback bridge end-to-end)
## Risks
- E2B sandbox commands now run under `bash -lc` instead of `sh -lc`.
Bash is a
strict superset for the commands we issue (no busybox-only flags in our
shell
scripts), so risk is low. The shellCommand field is opt-in via lease
metadata —
providers that don't declare it stay on `sh`.
- New optional field on `CommandManagedRuntimeSpec` and
`AdapterSandboxExecutionTarget`.
Consumers ignoring the field retain previous behaviour (sh).
- Lease metadata now carries an additional field. Existing leases
without
`shellCommand` resolve to `null` and fall back to sh — backwards
compatible.
## Model Used
- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (no UI changes)
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 12:19:35 -07:00
|
|
|
const shellCommand = preferredSandboxShell(target);
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
return await requireSandboxRunner(target).execute({
|
Let sandbox providers declare shell defaults (#5114)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents execute in sandboxed remote environments served by pluggable
sandbox
> providers (E2B today, more later)
> - Today every sandbox command runs under `sh -lc` regardless of what
the
> provider's container actually ships
> - That misses bash-only shell init on E2B (which ships bash) and
prevents
> future providers from declaring a different default — there's no way
for a
> provider to say "I have bash, use it"
> - This PR adds a `shellCommand` field to sandbox execution targets so
providers
> can declare their preferred shell ("bash" for E2B), threads it through
the
> sandbox-managed-runtime client, callback bridge, and execution-target
shell
> helper, and validates the value at the lease-metadata boundary
> - The benefit is that sandbox commands run under the right shell on
the right
> provider, and adding new sandbox providers only needs to declare a
shell
> preference
## What Changed
- Added `packages/adapter-utils/src/sandbox-shell.ts` exporting
`preferredShellForSandbox(shellCommand)` (returns `"bash"` if input is
`"bash"`,
else `"sh"`)
- Added `shellCommand?: "bash" | "sh" | null` to
`AdapterSandboxExecutionTarget`
and `CommandManagedRuntimeSpec`; threaded it through
`runAdapterExecutionTargetShellCommand`,
`prepareAdapterExecutionTargetRuntime`,
and `startAdapterExecutionTargetPaperclipBridge`
- `createCommandManagedRuntimeClient`, `prepareCommandManagedRuntime`,
and
`createCommandManagedSandboxCallbackBridgeQueueClient` now take an
optional
`shellCommand` and use `preferredShellForSandbox` to pick the shell
- `startSandboxCallbackBridgeServer` accepts a `shellCommand` for its
server
startup, readiness probe, and stop hook
- E2B sandbox plugin declares `shellCommand: "bash"` in `leaseMetadata`
- `resolveEnvironmentExecutionTarget` reads `shellCommand` from lease
metadata
(validating against `"bash" | "sh" | null`)
- `environment-runtime.ts` adds `"shellCommand"` to
`INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS`
so the field round-trips through internal plugin config without leaking
to
external plugin metadata
- Updated tests in `command-managed-runtime.test.ts`,
`execution-target-sandbox.test.ts`, `sandbox-callback-bridge.test.ts`,
`environment-execution-target.test.ts`
## Verification
- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test --
environment-execution-target`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: boot a Paperclip instance, create an E2B-backed
environment, run a
claude_local agent against it, and confirm the run completes (verifies
bash
shell semantics flow through the callback bridge end-to-end)
## Risks
- E2B sandbox commands now run under `bash -lc` instead of `sh -lc`.
Bash is a
strict superset for the commands we issue (no busybox-only flags in our
shell
scripts), so risk is low. The shellCommand field is opt-in via lease
metadata —
providers that don't declare it stay on `sh`.
- New optional field on `CommandManagedRuntimeSpec` and
`AdapterSandboxExecutionTarget`.
Consumers ignoring the field retain previous behaviour (sh).
- Lease metadata now carries an additional field. Existing leases
without
`shellCommand` resolve to `null` and fall back to sh — backwards
compatible.
## Model Used
- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (no UI changes)
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 12:19:35 -07:00
|
|
|
command: shellCommand,
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
args: ["-lc", command],
|
|
|
|
|
cwd: target.remoteCwd,
|
|
|
|
|
env: options.env,
|
|
|
|
|
timeoutMs: (options.timeoutSec ?? 15) * 1000,
|
|
|
|
|
onLog,
|
|
|
|
|
});
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
return await runAdapterExecutionTargetProcess(
|
|
|
|
|
runId,
|
|
|
|
|
target,
|
|
|
|
|
"sh",
|
|
|
|
|
["-lc", command],
|
|
|
|
|
{
|
|
|
|
|
cwd: options.cwd,
|
|
|
|
|
env: options.env,
|
|
|
|
|
timeoutSec: options.timeoutSec ?? 15,
|
|
|
|
|
graceSec: options.graceSec ?? 5,
|
|
|
|
|
onLog,
|
|
|
|
|
},
|
|
|
|
|
);
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
export async function readAdapterExecutionTargetHomeDir(
|
|
|
|
|
runId: string,
|
|
|
|
|
target: AdapterExecutionTarget | null | undefined,
|
|
|
|
|
options: AdapterExecutionTargetShellOptions,
|
|
|
|
|
): Promise<string | null> {
|
|
|
|
|
const result = await runAdapterExecutionTargetShellCommand(
|
|
|
|
|
runId,
|
|
|
|
|
target,
|
|
|
|
|
'printf %s "$HOME"',
|
|
|
|
|
options,
|
|
|
|
|
);
|
|
|
|
|
const homeDir = result.stdout.trim();
|
|
|
|
|
return homeDir.length > 0 ? homeDir : null;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
export async function ensureAdapterExecutionTargetFile(
|
|
|
|
|
runId: string,
|
|
|
|
|
target: AdapterExecutionTarget | null | undefined,
|
|
|
|
|
filePath: string,
|
|
|
|
|
options: AdapterExecutionTargetShellOptions,
|
|
|
|
|
): Promise<void> {
|
|
|
|
|
await runAdapterExecutionTargetShellCommand(
|
|
|
|
|
runId,
|
|
|
|
|
target,
|
|
|
|
|
`mkdir -p ${shellQuote(path.posix.dirname(filePath))} && : > ${shellQuote(filePath)}`,
|
|
|
|
|
options,
|
|
|
|
|
);
|
|
|
|
|
}
|
|
|
|
|
|
Add dedicated environment settings page and test-in-environment (#4798)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run inside environments (local, SSH, E2B sandbox)
> - Operators need to configure and manage these environments
> - But environment settings were buried inside the general company
settings page, making them hard to find
> - Additionally, when testing an agent from the configuration form, the
test always ran locally regardless of which environment was selected
> - This PR moves environments into a dedicated top-level company
settings section and wires the "Test Environment" button to run inside
the selected environment
> - The benefit is operators can find and manage environments more
easily, and the test button now validates the actual environment the
agent will use
## What Changed
- Added a dedicated `CompanyEnvironments` settings page with its own
route and sidebar entry
- Updated `CompanySettingsSidebar` and `CompanySettingsNav` to include
the new environments section
- Modified the agent test route (`POST /agents/:id/test`) to accept an
optional `environmentId` parameter
- Updated all adapter `test.ts` handlers to resolve and use the
specified execution target environment
- Added `resolveTestExecutionTarget` to `execution-target.ts` for remote
environment test resolution with cwd fallback
- Moved the "Test Environment" button and its feedback display into the
`NewAgent` page footer for better UX flow
## Verification
- `pnpm test` — all existing and new tests pass
- `pnpm typecheck` — clean
- Manual: navigate to Company Settings, confirm "Environments" appears
as a top-level section
- Manual: configure an agent with a non-local environment, click "Test
Environment", confirm the test runs inside that environment
## Risks
- Low risk. UI-only routing change for the settings page. The
test-in-environment change adds an optional parameter with a local
fallback, so existing behavior is preserved when no environment is
specified.
## Model Used
Codex GPT 5.4 high via Paperclip.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 15:56:13 -07:00
|
|
|
/**
|
|
|
|
|
* Ensure a working directory exists (and is a directory) on the execution target.
|
|
|
|
|
*
|
|
|
|
|
* For local targets this delegates to the local `ensureAbsoluteDirectory` helper
|
|
|
|
|
* (Node fs). For remote (SSH/sandbox) targets it shells out and runs
|
|
|
|
|
* `mkdir -p` (when allowed) followed by a `[ -d ]` check so the result reflects
|
|
|
|
|
* the directory state inside the environment, not on the Paperclip host.
|
|
|
|
|
*
|
|
|
|
|
* Throws an Error with a human-readable message on failure.
|
|
|
|
|
*/
|
|
|
|
|
export async function ensureAdapterExecutionTargetDirectory(
|
|
|
|
|
runId: string,
|
|
|
|
|
target: AdapterExecutionTarget | null | undefined,
|
|
|
|
|
cwd: string,
|
|
|
|
|
options: AdapterExecutionTargetShellOptions & { createIfMissing?: boolean },
|
|
|
|
|
): Promise<void> {
|
|
|
|
|
const createIfMissing = options.createIfMissing ?? false;
|
|
|
|
|
|
|
|
|
|
if (!target || target.kind === "local") {
|
|
|
|
|
const { ensureAbsoluteDirectory } = await import("./server-utils.js");
|
|
|
|
|
await ensureAbsoluteDirectory(cwd, { createIfMissing });
|
|
|
|
|
return;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// Remote (SSH or sandbox): both expect POSIX absolute paths inside the env.
|
|
|
|
|
if (!cwd.startsWith("/")) {
|
|
|
|
|
throw new Error(`Working directory must be an absolute POSIX path on the remote target: "${cwd}"`);
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
const quoted = shellQuote(cwd);
|
|
|
|
|
const script = createIfMissing
|
|
|
|
|
? `mkdir -p ${quoted} && [ -d ${quoted} ]`
|
|
|
|
|
: `[ -d ${quoted} ]`;
|
|
|
|
|
|
|
|
|
|
const result = await runAdapterExecutionTargetShellCommand(runId, target, script, {
|
|
|
|
|
cwd: target.kind === "remote" ? target.remoteCwd : cwd,
|
|
|
|
|
env: options.env,
|
|
|
|
|
timeoutSec: options.timeoutSec ?? 15,
|
|
|
|
|
graceSec: options.graceSec ?? 5,
|
|
|
|
|
onLog: options.onLog,
|
|
|
|
|
});
|
|
|
|
|
|
|
|
|
|
if (result.timedOut) {
|
|
|
|
|
throw new Error(`Timed out checking working directory on remote target: "${cwd}"`);
|
|
|
|
|
}
|
|
|
|
|
if ((result.exitCode ?? 1) !== 0) {
|
|
|
|
|
const detail = (result.stderr || result.stdout || "").trim();
|
|
|
|
|
if (createIfMissing) {
|
|
|
|
|
throw new Error(
|
|
|
|
|
`Could not create working directory "${cwd}" on remote target${detail ? `: ${detail}` : "."}`,
|
|
|
|
|
);
|
|
|
|
|
}
|
|
|
|
|
throw new Error(
|
|
|
|
|
`Working directory does not exist on remote target: "${cwd}"${detail ? ` (${detail})` : ""}`,
|
|
|
|
|
);
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
export function adapterExecutionTargetSessionIdentity(
|
|
|
|
|
target: AdapterExecutionTarget | null | undefined,
|
|
|
|
|
): Record<string, unknown> | null {
|
|
|
|
|
if (!target || target.kind === "local") return null;
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
if (target.transport === "ssh") return buildRemoteExecutionSessionIdentity(target.spec);
|
Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end
## What Changed
- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export
## Verification
- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge
## Risks
- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.
## Model Used
Codex GPT 5.4 high via Paperclip.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
|
|
|
const paperclipTransport = resolveSandboxPaperclipTransport(target);
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
return {
|
|
|
|
|
transport: "sandbox",
|
|
|
|
|
providerKey: target.providerKey ?? null,
|
|
|
|
|
environmentId: target.environmentId ?? null,
|
|
|
|
|
leaseId: target.leaseId ?? null,
|
|
|
|
|
remoteCwd: target.remoteCwd,
|
Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end
## What Changed
- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export
## Verification
- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge
## Risks
- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.
## Model Used
Codex GPT 5.4 high via Paperclip.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
|
|
|
paperclipTransport,
|
|
|
|
|
...(paperclipTransport === "direct" && target.paperclipApiUrl ? { paperclipApiUrl: target.paperclipApiUrl } : {}),
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
};
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
export function adapterExecutionTargetSessionMatches(
|
|
|
|
|
saved: unknown,
|
|
|
|
|
target: AdapterExecutionTarget | null | undefined,
|
|
|
|
|
): boolean {
|
|
|
|
|
if (!target || target.kind === "local") {
|
|
|
|
|
return Object.keys(parseObject(saved)).length === 0;
|
|
|
|
|
}
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
if (target.transport === "ssh") return remoteExecutionSessionMatches(saved, target.spec);
|
|
|
|
|
const current = adapterExecutionTargetSessionIdentity(target);
|
|
|
|
|
const parsedSaved = parseObject(saved);
|
|
|
|
|
return (
|
|
|
|
|
readStringMeta(parsedSaved, "transport") === current?.transport &&
|
|
|
|
|
readStringMeta(parsedSaved, "providerKey") === current?.providerKey &&
|
|
|
|
|
readStringMeta(parsedSaved, "environmentId") === current?.environmentId &&
|
|
|
|
|
readStringMeta(parsedSaved, "leaseId") === current?.leaseId &&
|
|
|
|
|
readStringMeta(parsedSaved, "remoteCwd") === current?.remoteCwd &&
|
Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end
## What Changed
- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export
## Verification
- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge
## Risks
- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.
## Model Used
Codex GPT 5.4 high via Paperclip.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
|
|
|
readStringMeta(parsedSaved, "paperclipTransport") === (current?.paperclipTransport ?? null) &&
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
readStringMeta(parsedSaved, "paperclipApiUrl") === (current?.paperclipApiUrl ?? null)
|
|
|
|
|
);
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
export function parseAdapterExecutionTarget(value: unknown): AdapterExecutionTarget | null {
|
|
|
|
|
const parsed = parseObject(value);
|
|
|
|
|
const kind = readStringMeta(parsed, "kind");
|
|
|
|
|
|
|
|
|
|
if (kind === "local") {
|
|
|
|
|
return {
|
|
|
|
|
kind: "local",
|
|
|
|
|
environmentId: readStringMeta(parsed, "environmentId"),
|
|
|
|
|
leaseId: readStringMeta(parsed, "leaseId"),
|
|
|
|
|
};
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
if (kind === "remote" && readStringMeta(parsed, "transport") === "ssh") {
|
|
|
|
|
const spec = parseSshRemoteExecutionSpec(parseObject(parsed.spec));
|
|
|
|
|
if (!spec) return null;
|
|
|
|
|
return {
|
|
|
|
|
kind: "remote",
|
|
|
|
|
transport: "ssh",
|
|
|
|
|
environmentId: readStringMeta(parsed, "environmentId"),
|
|
|
|
|
leaseId: readStringMeta(parsed, "leaseId"),
|
|
|
|
|
remoteCwd: spec.remoteCwd,
|
|
|
|
|
paperclipApiUrl: readStringMeta(parsed, "paperclipApiUrl") ?? spec.paperclipApiUrl ?? null,
|
|
|
|
|
spec,
|
|
|
|
|
};
|
|
|
|
|
}
|
|
|
|
|
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
if (kind === "remote" && readStringMeta(parsed, "transport") === "sandbox") {
|
|
|
|
|
const remoteCwd = readStringMeta(parsed, "remoteCwd");
|
Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end
## What Changed
- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export
## Verification
- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge
## Risks
- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.
## Model Used
Codex GPT 5.4 high via Paperclip.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
|
|
|
const paperclipTransport = readStringMeta(parsed, "paperclipTransport");
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
if (!remoteCwd) return null;
|
|
|
|
|
return {
|
|
|
|
|
kind: "remote",
|
|
|
|
|
transport: "sandbox",
|
|
|
|
|
providerKey: readStringMeta(parsed, "providerKey"),
|
|
|
|
|
environmentId: readStringMeta(parsed, "environmentId"),
|
|
|
|
|
leaseId: readStringMeta(parsed, "leaseId"),
|
|
|
|
|
remoteCwd,
|
|
|
|
|
paperclipApiUrl: readStringMeta(parsed, "paperclipApiUrl"),
|
Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end
## What Changed
- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export
## Verification
- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge
## Risks
- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.
## Model Used
Codex GPT 5.4 high via Paperclip.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
|
|
|
paperclipTransport:
|
|
|
|
|
paperclipTransport === "direct" || paperclipTransport === "bridge"
|
|
|
|
|
? paperclipTransport
|
|
|
|
|
: undefined,
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
timeoutMs: typeof parsed.timeoutMs === "number" ? parsed.timeoutMs : null,
|
|
|
|
|
};
|
|
|
|
|
}
|
|
|
|
|
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
return null;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
export function adapterExecutionTargetFromRemoteExecution(
|
|
|
|
|
remoteExecution: unknown,
|
|
|
|
|
metadata: Pick<AdapterLocalExecutionTarget, "environmentId" | "leaseId"> = {},
|
|
|
|
|
): AdapterExecutionTarget | null {
|
|
|
|
|
const parsed = parseObject(remoteExecution);
|
|
|
|
|
const ssh = parseSshRemoteExecutionSpec(parsed);
|
|
|
|
|
if (ssh) {
|
|
|
|
|
return {
|
|
|
|
|
kind: "remote",
|
|
|
|
|
transport: "ssh",
|
|
|
|
|
environmentId: metadata.environmentId ?? null,
|
|
|
|
|
leaseId: metadata.leaseId ?? null,
|
|
|
|
|
remoteCwd: ssh.remoteCwd,
|
|
|
|
|
paperclipApiUrl: ssh.paperclipApiUrl ?? null,
|
|
|
|
|
spec: ssh,
|
|
|
|
|
};
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
return null;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
export function readAdapterExecutionTarget(input: {
|
|
|
|
|
executionTarget?: unknown;
|
|
|
|
|
legacyRemoteExecution?: unknown;
|
|
|
|
|
}): AdapterExecutionTarget | null {
|
|
|
|
|
if (isAdapterExecutionTargetInstance(input.executionTarget)) {
|
|
|
|
|
return input.executionTarget;
|
|
|
|
|
}
|
|
|
|
|
return (
|
|
|
|
|
parseAdapterExecutionTarget(input.executionTarget) ??
|
|
|
|
|
adapterExecutionTargetFromRemoteExecution(input.legacyRemoteExecution)
|
|
|
|
|
);
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
export async function prepareAdapterExecutionTargetRuntime(input: {
|
|
|
|
|
target: AdapterExecutionTarget | null | undefined;
|
|
|
|
|
adapterKey: string;
|
|
|
|
|
workspaceLocalDir: string;
|
|
|
|
|
workspaceExclude?: string[];
|
|
|
|
|
preserveAbsentOnRestore?: string[];
|
|
|
|
|
assets?: AdapterManagedRuntimeAsset[];
|
|
|
|
|
installCommand?: string | null;
|
|
|
|
|
}): Promise<PreparedAdapterExecutionTargetRuntime> {
|
|
|
|
|
const target = input.target ?? { kind: "local" as const };
|
|
|
|
|
if (target.kind === "local") {
|
|
|
|
|
return {
|
|
|
|
|
target,
|
|
|
|
|
runtimeRootDir: null,
|
|
|
|
|
assetDirs: {},
|
|
|
|
|
restoreWorkspace: async () => {},
|
|
|
|
|
};
|
|
|
|
|
}
|
|
|
|
|
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
if (target.transport === "ssh") {
|
|
|
|
|
const prepared = await prepareRemoteManagedRuntime({
|
|
|
|
|
spec: target.spec,
|
|
|
|
|
adapterKey: input.adapterKey,
|
|
|
|
|
workspaceLocalDir: input.workspaceLocalDir,
|
|
|
|
|
assets: input.assets,
|
|
|
|
|
});
|
|
|
|
|
return {
|
|
|
|
|
target,
|
|
|
|
|
runtimeRootDir: prepared.runtimeRootDir,
|
|
|
|
|
assetDirs: prepared.assetDirs,
|
|
|
|
|
restoreWorkspace: prepared.restoreWorkspace,
|
|
|
|
|
};
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
const prepared = await prepareCommandManagedRuntime({
|
|
|
|
|
runner: requireSandboxRunner(target),
|
|
|
|
|
spec: {
|
|
|
|
|
providerKey: target.providerKey,
|
Let sandbox providers declare shell defaults (#5114)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents execute in sandboxed remote environments served by pluggable
sandbox
> providers (E2B today, more later)
> - Today every sandbox command runs under `sh -lc` regardless of what
the
> provider's container actually ships
> - That misses bash-only shell init on E2B (which ships bash) and
prevents
> future providers from declaring a different default — there's no way
for a
> provider to say "I have bash, use it"
> - This PR adds a `shellCommand` field to sandbox execution targets so
providers
> can declare their preferred shell ("bash" for E2B), threads it through
the
> sandbox-managed-runtime client, callback bridge, and execution-target
shell
> helper, and validates the value at the lease-metadata boundary
> - The benefit is that sandbox commands run under the right shell on
the right
> provider, and adding new sandbox providers only needs to declare a
shell
> preference
## What Changed
- Added `packages/adapter-utils/src/sandbox-shell.ts` exporting
`preferredShellForSandbox(shellCommand)` (returns `"bash"` if input is
`"bash"`,
else `"sh"`)
- Added `shellCommand?: "bash" | "sh" | null` to
`AdapterSandboxExecutionTarget`
and `CommandManagedRuntimeSpec`; threaded it through
`runAdapterExecutionTargetShellCommand`,
`prepareAdapterExecutionTargetRuntime`,
and `startAdapterExecutionTargetPaperclipBridge`
- `createCommandManagedRuntimeClient`, `prepareCommandManagedRuntime`,
and
`createCommandManagedSandboxCallbackBridgeQueueClient` now take an
optional
`shellCommand` and use `preferredShellForSandbox` to pick the shell
- `startSandboxCallbackBridgeServer` accepts a `shellCommand` for its
server
startup, readiness probe, and stop hook
- E2B sandbox plugin declares `shellCommand: "bash"` in `leaseMetadata`
- `resolveEnvironmentExecutionTarget` reads `shellCommand` from lease
metadata
(validating against `"bash" | "sh" | null`)
- `environment-runtime.ts` adds `"shellCommand"` to
`INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS`
so the field round-trips through internal plugin config without leaking
to
external plugin metadata
- Updated tests in `command-managed-runtime.test.ts`,
`execution-target-sandbox.test.ts`, `sandbox-callback-bridge.test.ts`,
`environment-execution-target.test.ts`
## Verification
- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test --
environment-execution-target`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: boot a Paperclip instance, create an E2B-backed
environment, run a
claude_local agent against it, and confirm the run completes (verifies
bash
shell semantics flow through the callback bridge end-to-end)
## Risks
- E2B sandbox commands now run under `bash -lc` instead of `sh -lc`.
Bash is a
strict superset for the commands we issue (no busybox-only flags in our
shell
scripts), so risk is low. The shellCommand field is opt-in via lease
metadata —
providers that don't declare it stay on `sh`.
- New optional field on `CommandManagedRuntimeSpec` and
`AdapterSandboxExecutionTarget`.
Consumers ignoring the field retain previous behaviour (sh).
- Lease metadata now carries an additional field. Existing leases
without
`shellCommand` resolve to `null` and fall back to sh — backwards
compatible.
## Model Used
- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (no UI changes)
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 12:19:35 -07:00
|
|
|
shellCommand: target.shellCommand,
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
leaseId: target.leaseId,
|
|
|
|
|
remoteCwd: target.remoteCwd,
|
|
|
|
|
timeoutMs: target.timeoutMs,
|
|
|
|
|
paperclipApiUrl: target.paperclipApiUrl,
|
|
|
|
|
},
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
adapterKey: input.adapterKey,
|
|
|
|
|
workspaceLocalDir: input.workspaceLocalDir,
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
workspaceExclude: input.workspaceExclude,
|
|
|
|
|
preserveAbsentOnRestore: input.preserveAbsentOnRestore,
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
assets: input.assets,
|
Add sandbox environment support (#4415)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.
## What Changed
- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.
## Verification
- Ran locally before the final fixture-only scrub:
- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Ran locally after the final scrub amend:
- `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
- create a sandbox environment backed by the fake provider plugin
- run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly
## Risks
- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.
## Model Used
- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
|
|
|
installCommand: input.installCommand,
|
Add SSH environment support (#4358)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation
## What Changed
- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.
## Verification
- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
- enabled the experimental flag
- created an SSH environment
- created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back
## Risks
- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.
## Model Used
- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
|
|
|
});
|
|
|
|
|
return {
|
|
|
|
|
target,
|
|
|
|
|
runtimeRootDir: prepared.runtimeRootDir,
|
|
|
|
|
assetDirs: prepared.assetDirs,
|
|
|
|
|
restoreWorkspace: prepared.restoreWorkspace,
|
|
|
|
|
};
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
export function runtimeAssetDir(
|
|
|
|
|
prepared: Pick<PreparedAdapterExecutionTargetRuntime, "assetDirs">,
|
|
|
|
|
key: string,
|
|
|
|
|
fallbackRemoteCwd: string,
|
|
|
|
|
): string {
|
|
|
|
|
return prepared.assetDirs[key] ?? path.posix.join(fallbackRemoteCwd, ".paperclip-runtime", key);
|
|
|
|
|
}
|
Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end
## What Changed
- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export
## Verification
- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge
## Risks
- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.
## Model Used
Codex GPT 5.4 high via Paperclip.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
|
|
|
|
|
|
|
|
function buildBridgeResponseHeaders(response: Response): Record<string, string> {
|
|
|
|
|
const out: Record<string, string> = {};
|
|
|
|
|
for (const key of ["content-type", "etag", "last-modified"]) {
|
|
|
|
|
const value = response.headers.get(key);
|
|
|
|
|
if (value && value.trim().length > 0) out[key] = value.trim();
|
|
|
|
|
}
|
|
|
|
|
return out;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
function buildBridgeForwardUrl(baseUrl: string, request: { path: string; query: string }): URL {
|
|
|
|
|
const url = new URL(request.path, baseUrl);
|
|
|
|
|
const query = request.query.trim();
|
|
|
|
|
url.search = query.startsWith("?") ? query.slice(1) : query;
|
|
|
|
|
return url;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
function bridgeResponseBodyLimitError(maxBodyBytes: number): Error {
|
|
|
|
|
return new Error(`Bridge response body exceeded the configured size limit of ${maxBodyBytes} bytes.`);
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
async function readBridgeForwardResponseBody(response: Response, maxBodyBytes: number): Promise<string> {
|
|
|
|
|
const rawContentLength = response.headers.get("content-length");
|
|
|
|
|
if (rawContentLength) {
|
|
|
|
|
const contentLength = Number.parseInt(rawContentLength, 10);
|
|
|
|
|
if (Number.isFinite(contentLength) && contentLength > maxBodyBytes) {
|
|
|
|
|
throw bridgeResponseBodyLimitError(maxBodyBytes);
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
if (!response.body) {
|
|
|
|
|
return "";
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
const reader = response.body.getReader();
|
|
|
|
|
const chunks: Buffer[] = [];
|
|
|
|
|
let totalBytes = 0;
|
|
|
|
|
while (true) {
|
|
|
|
|
const { done, value } = await reader.read();
|
|
|
|
|
if (done) break;
|
|
|
|
|
if (!value) continue;
|
|
|
|
|
totalBytes += value.byteLength;
|
|
|
|
|
if (totalBytes > maxBodyBytes) {
|
|
|
|
|
await reader.cancel().catch(() => undefined);
|
|
|
|
|
throw bridgeResponseBodyLimitError(maxBodyBytes);
|
|
|
|
|
}
|
|
|
|
|
chunks.push(Buffer.from(value));
|
|
|
|
|
}
|
|
|
|
|
return Buffer.concat(chunks, totalBytes).toString("utf8");
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
export async function startAdapterExecutionTargetPaperclipBridge(input: {
|
|
|
|
|
runId: string;
|
|
|
|
|
target: AdapterExecutionTarget | null | undefined;
|
|
|
|
|
runtimeRootDir: string | null | undefined;
|
|
|
|
|
adapterKey: string;
|
|
|
|
|
hostApiToken: string | null | undefined;
|
|
|
|
|
hostApiUrl?: string | null;
|
|
|
|
|
onLog?: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
|
|
|
|
|
maxBodyBytes?: number | null;
|
|
|
|
|
}): Promise<AdapterExecutionTargetPaperclipBridgeHandle | null> {
|
|
|
|
|
if (!adapterExecutionTargetUsesPaperclipBridge(input.target)) {
|
|
|
|
|
return null;
|
|
|
|
|
}
|
|
|
|
|
if (!input.target || input.target.kind !== "remote" || input.target.transport !== "sandbox") {
|
|
|
|
|
return null;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
const target = input.target;
|
|
|
|
|
const onLog = input.onLog ?? (async () => {});
|
|
|
|
|
const hostApiToken = input.hostApiToken?.trim() ?? "";
|
|
|
|
|
if (hostApiToken.length === 0) {
|
|
|
|
|
throw new Error("Sandbox bridge mode requires a host-side Paperclip API token.");
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
const runtimeRootDir =
|
|
|
|
|
input.runtimeRootDir?.trim().length
|
|
|
|
|
? input.runtimeRootDir.trim()
|
|
|
|
|
: path.posix.join(target.remoteCwd, ".paperclip-runtime", input.adapterKey);
|
|
|
|
|
const bridgeRuntimeDir = path.posix.join(runtimeRootDir, "paperclip-bridge");
|
|
|
|
|
const queueDir = path.posix.join(bridgeRuntimeDir, "queue");
|
|
|
|
|
const assetRemoteDir = path.posix.join(bridgeRuntimeDir, "server");
|
|
|
|
|
const bridgeToken = createSandboxCallbackBridgeToken();
|
|
|
|
|
const maxBodyBytes =
|
|
|
|
|
typeof input.maxBodyBytes === "number" && Number.isFinite(input.maxBodyBytes) && input.maxBodyBytes > 0
|
|
|
|
|
? Math.trunc(input.maxBodyBytes)
|
|
|
|
|
: DEFAULT_SANDBOX_CALLBACK_BRIDGE_MAX_BODY_BYTES;
|
|
|
|
|
const hostApiUrl =
|
|
|
|
|
input.hostApiUrl?.trim() ||
|
|
|
|
|
process.env.PAPERCLIP_RUNTIME_API_URL?.trim() ||
|
|
|
|
|
process.env.PAPERCLIP_API_URL?.trim() ||
|
|
|
|
|
resolveDefaultPaperclipApiUrl();
|
|
|
|
|
|
|
|
|
|
await onLog(
|
|
|
|
|
"stdout",
|
|
|
|
|
`[paperclip] Starting sandbox callback bridge for ${input.adapterKey} in ${bridgeRuntimeDir}.\n`,
|
|
|
|
|
);
|
|
|
|
|
|
|
|
|
|
const bridgeAsset = await createSandboxCallbackBridgeAsset();
|
|
|
|
|
let server: Awaited<ReturnType<typeof startSandboxCallbackBridgeServer>> | null = null;
|
|
|
|
|
let worker: Awaited<ReturnType<typeof startSandboxCallbackBridgeWorker>> | null = null;
|
|
|
|
|
try {
|
|
|
|
|
const client = createCommandManagedSandboxCallbackBridgeQueueClient({
|
|
|
|
|
runner: requireSandboxRunner(target),
|
|
|
|
|
remoteCwd: target.remoteCwd,
|
|
|
|
|
timeoutMs: target.timeoutMs,
|
Let sandbox providers declare shell defaults (#5114)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents execute in sandboxed remote environments served by pluggable
sandbox
> providers (E2B today, more later)
> - Today every sandbox command runs under `sh -lc` regardless of what
the
> provider's container actually ships
> - That misses bash-only shell init on E2B (which ships bash) and
prevents
> future providers from declaring a different default — there's no way
for a
> provider to say "I have bash, use it"
> - This PR adds a `shellCommand` field to sandbox execution targets so
providers
> can declare their preferred shell ("bash" for E2B), threads it through
the
> sandbox-managed-runtime client, callback bridge, and execution-target
shell
> helper, and validates the value at the lease-metadata boundary
> - The benefit is that sandbox commands run under the right shell on
the right
> provider, and adding new sandbox providers only needs to declare a
shell
> preference
## What Changed
- Added `packages/adapter-utils/src/sandbox-shell.ts` exporting
`preferredShellForSandbox(shellCommand)` (returns `"bash"` if input is
`"bash"`,
else `"sh"`)
- Added `shellCommand?: "bash" | "sh" | null` to
`AdapterSandboxExecutionTarget`
and `CommandManagedRuntimeSpec`; threaded it through
`runAdapterExecutionTargetShellCommand`,
`prepareAdapterExecutionTargetRuntime`,
and `startAdapterExecutionTargetPaperclipBridge`
- `createCommandManagedRuntimeClient`, `prepareCommandManagedRuntime`,
and
`createCommandManagedSandboxCallbackBridgeQueueClient` now take an
optional
`shellCommand` and use `preferredShellForSandbox` to pick the shell
- `startSandboxCallbackBridgeServer` accepts a `shellCommand` for its
server
startup, readiness probe, and stop hook
- E2B sandbox plugin declares `shellCommand: "bash"` in `leaseMetadata`
- `resolveEnvironmentExecutionTarget` reads `shellCommand` from lease
metadata
(validating against `"bash" | "sh" | null`)
- `environment-runtime.ts` adds `"shellCommand"` to
`INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS`
so the field round-trips through internal plugin config without leaking
to
external plugin metadata
- Updated tests in `command-managed-runtime.test.ts`,
`execution-target-sandbox.test.ts`, `sandbox-callback-bridge.test.ts`,
`environment-execution-target.test.ts`
## Verification
- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test --
environment-execution-target`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: boot a Paperclip instance, create an E2B-backed
environment, run a
claude_local agent against it, and confirm the run completes (verifies
bash
shell semantics flow through the callback bridge end-to-end)
## Risks
- E2B sandbox commands now run under `bash -lc` instead of `sh -lc`.
Bash is a
strict superset for the commands we issue (no busybox-only flags in our
shell
scripts), so risk is low. The shellCommand field is opt-in via lease
metadata —
providers that don't declare it stay on `sh`.
- New optional field on `CommandManagedRuntimeSpec` and
`AdapterSandboxExecutionTarget`.
Consumers ignoring the field retain previous behaviour (sh).
- Lease metadata now carries an additional field. Existing leases
without
`shellCommand` resolve to `null` and fall back to sh — backwards
compatible.
## Model Used
- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (no UI changes)
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 12:19:35 -07:00
|
|
|
shellCommand: preferredSandboxShell(target),
|
Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end
## What Changed
- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export
## Verification
- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge
## Risks
- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.
## Model Used
Codex GPT 5.4 high via Paperclip.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
|
|
|
});
|
|
|
|
|
worker = await startSandboxCallbackBridgeWorker({
|
|
|
|
|
client,
|
|
|
|
|
queueDir,
|
|
|
|
|
maxBodyBytes,
|
|
|
|
|
handleRequest: async (request) => {
|
|
|
|
|
const headers = new Headers();
|
|
|
|
|
for (const [key, value] of Object.entries(request.headers)) {
|
|
|
|
|
if (value.trim().length === 0) continue;
|
|
|
|
|
headers.set(key, value);
|
|
|
|
|
}
|
|
|
|
|
headers.set("authorization", `Bearer ${hostApiToken}`);
|
|
|
|
|
headers.set("x-paperclip-run-id", input.runId);
|
|
|
|
|
const method = request.method.trim().toUpperCase() || "GET";
|
|
|
|
|
const response = await fetch(buildBridgeForwardUrl(hostApiUrl, request), {
|
|
|
|
|
method,
|
|
|
|
|
headers,
|
|
|
|
|
...(method === "GET" || method === "HEAD" ? {} : { body: request.body }),
|
|
|
|
|
signal: AbortSignal.timeout(30_000),
|
|
|
|
|
});
|
|
|
|
|
return {
|
|
|
|
|
status: response.status,
|
|
|
|
|
headers: buildBridgeResponseHeaders(response),
|
|
|
|
|
body: await readBridgeForwardResponseBody(response, maxBodyBytes),
|
|
|
|
|
};
|
|
|
|
|
},
|
|
|
|
|
});
|
|
|
|
|
server = await startSandboxCallbackBridgeServer({
|
|
|
|
|
runner: requireSandboxRunner(target),
|
|
|
|
|
remoteCwd: target.remoteCwd,
|
|
|
|
|
assetRemoteDir,
|
|
|
|
|
queueDir,
|
|
|
|
|
bridgeToken,
|
|
|
|
|
bridgeAsset,
|
|
|
|
|
timeoutMs: target.timeoutMs,
|
|
|
|
|
maxBodyBytes,
|
Let sandbox providers declare shell defaults (#5114)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents execute in sandboxed remote environments served by pluggable
sandbox
> providers (E2B today, more later)
> - Today every sandbox command runs under `sh -lc` regardless of what
the
> provider's container actually ships
> - That misses bash-only shell init on E2B (which ships bash) and
prevents
> future providers from declaring a different default — there's no way
for a
> provider to say "I have bash, use it"
> - This PR adds a `shellCommand` field to sandbox execution targets so
providers
> can declare their preferred shell ("bash" for E2B), threads it through
the
> sandbox-managed-runtime client, callback bridge, and execution-target
shell
> helper, and validates the value at the lease-metadata boundary
> - The benefit is that sandbox commands run under the right shell on
the right
> provider, and adding new sandbox providers only needs to declare a
shell
> preference
## What Changed
- Added `packages/adapter-utils/src/sandbox-shell.ts` exporting
`preferredShellForSandbox(shellCommand)` (returns `"bash"` if input is
`"bash"`,
else `"sh"`)
- Added `shellCommand?: "bash" | "sh" | null` to
`AdapterSandboxExecutionTarget`
and `CommandManagedRuntimeSpec`; threaded it through
`runAdapterExecutionTargetShellCommand`,
`prepareAdapterExecutionTargetRuntime`,
and `startAdapterExecutionTargetPaperclipBridge`
- `createCommandManagedRuntimeClient`, `prepareCommandManagedRuntime`,
and
`createCommandManagedSandboxCallbackBridgeQueueClient` now take an
optional
`shellCommand` and use `preferredShellForSandbox` to pick the shell
- `startSandboxCallbackBridgeServer` accepts a `shellCommand` for its
server
startup, readiness probe, and stop hook
- E2B sandbox plugin declares `shellCommand: "bash"` in `leaseMetadata`
- `resolveEnvironmentExecutionTarget` reads `shellCommand` from lease
metadata
(validating against `"bash" | "sh" | null`)
- `environment-runtime.ts` adds `"shellCommand"` to
`INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS`
so the field round-trips through internal plugin config without leaking
to
external plugin metadata
- Updated tests in `command-managed-runtime.test.ts`,
`execution-target-sandbox.test.ts`, `sandbox-callback-bridge.test.ts`,
`environment-execution-target.test.ts`
## Verification
- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test --
environment-execution-target`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: boot a Paperclip instance, create an E2B-backed
environment, run a
claude_local agent against it, and confirm the run completes (verifies
bash
shell semantics flow through the callback bridge end-to-end)
## Risks
- E2B sandbox commands now run under `bash -lc` instead of `sh -lc`.
Bash is a
strict superset for the commands we issue (no busybox-only flags in our
shell
scripts), so risk is low. The shellCommand field is opt-in via lease
metadata —
providers that don't declare it stay on `sh`.
- New optional field on `CommandManagedRuntimeSpec` and
`AdapterSandboxExecutionTarget`.
Consumers ignoring the field retain previous behaviour (sh).
- Lease metadata now carries an additional field. Existing leases
without
`shellCommand` resolve to `null` and fall back to sh — backwards
compatible.
## Model Used
- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (no UI changes)
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 12:19:35 -07:00
|
|
|
shellCommand: preferredSandboxShell(target),
|
Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end
## What Changed
- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export
## Verification
- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge
## Risks
- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.
## Model Used
Codex GPT 5.4 high via Paperclip.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
|
|
|
});
|
|
|
|
|
} catch (error) {
|
|
|
|
|
await Promise.allSettled([
|
|
|
|
|
server?.stop(),
|
|
|
|
|
worker?.stop(),
|
|
|
|
|
bridgeAsset.cleanup(),
|
|
|
|
|
]);
|
|
|
|
|
throw error;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
return {
|
|
|
|
|
env: {
|
|
|
|
|
PAPERCLIP_API_URL: server.baseUrl,
|
|
|
|
|
PAPERCLIP_API_KEY: bridgeToken,
|
|
|
|
|
PAPERCLIP_API_BRIDGE_MODE: "queue_v1",
|
|
|
|
|
},
|
|
|
|
|
stop: async () => {
|
|
|
|
|
await Promise.allSettled([
|
|
|
|
|
server?.stop(),
|
|
|
|
|
]);
|
|
|
|
|
await Promise.allSettled([
|
|
|
|
|
worker?.stop(),
|
|
|
|
|
bridgeAsset.cleanup(),
|
|
|
|
|
]);
|
|
|
|
|
},
|
|
|
|
|
};
|
|
|
|
|
}
|