paperclip/server/src/services/environment-runtime.ts

1196 lines
43 KiB
TypeScript

Add sandbox environment support (#4415) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The environment/runtime layer decides where agent work executes and how the control plane reaches those runtimes. > - Today Paperclip can run locally and over SSH, but sandboxed execution needs a first-class environment model instead of one-off adapter behavior. > - We also want sandbox providers to be pluggable so the core does not hardcode every provider implementation. > - This branch adds the Sandbox environment path, the provider contract, and a deterministic fake provider plugin. > - That required synchronized changes across shared contracts, plugin SDK surfaces, server runtime orchestration, and the UI environment/workspace flows. > - The result is that sandbox execution becomes a core control-plane capability while keeping provider implementations extensible and testable. ## What Changed - Added sandbox runtime support to the environment execution path, including runtime URL discovery, sandbox execution targeting, orchestration, and heartbeat integration. - Added plugin-provider support for sandbox environments so providers can be supplied via plugins instead of hardcoded server logic. - Added the fake sandbox provider plugin with deterministic behavior suitable for local and automated testing. - Updated shared types, validators, plugin protocol definitions, and SDK helpers to carry sandbox provider and workspace-runtime contracts across package boundaries. - Updated server routes and services so companies can create sandbox environments, select them for work, and execute work through the sandbox runtime path. - Updated the UI environment and workspace surfaces to expose sandbox environment configuration and selection. - Added test coverage for sandbox runtime behavior, provider seams, environment route guards, orchestration, and the fake provider plugin. 
## Verification - Ran locally before the final fixture-only scrub: - `pnpm -r typecheck` - `pnpm test:run` - `pnpm build` - Ran locally after the final scrub amend: - `pnpm vitest run server/src/__tests__/runtime-api.test.ts` - Reviewer spot checks: - create a sandbox environment backed by the fake provider plugin - run work through that environment - confirm sandbox provider execution does not inherit host secrets implicitly ## Risks - This touches shared contracts, plugin SDK plumbing, server runtime orchestration, and UI environment/workspace flows, so regressions would likely show up as cross-layer mismatches rather than isolated type errors. - Runtime URL discovery and sandbox callback selection are sensitive to host/bind configuration; if that logic is wrong, sandbox-backed callbacks may fail even when execution succeeds. - The fake provider plugin is intentionally deterministic and test-oriented; future providers may expose capability gaps that this branch does not yet cover. ## Model Used - OpenAI Codex coding agent on a GPT-5-class backend in the Paperclip/Codex harness. Exact backend model ID is not exposed in-session. Tool-assisted workflow with shell execution, file editing, git history inspection, and local test execution. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-24 12:15:53 -07:00
import { and, eq, inArray } from "drizzle-orm";
import type { Db } from "@paperclipai/db";
import { environmentLeases } from "@paperclipai/db";
import type {
  Environment,
  EnvironmentLease,
  EnvironmentLeaseStatus,
  ExecutionWorkspace,
  PluginEnvironmentConfig,
  SandboxEnvironmentConfig,
} from "@paperclipai/shared";
import type {
  PluginEnvironmentExecuteResult,
  PluginEnvironmentLease,
  PluginEnvironmentRealizeWorkspaceResult,
} from "@paperclipai/plugin-sdk";
import { ensureSshWorkspaceReady, findReachablePaperclipApiUrlOverSsh } from "@paperclipai/adapter-utils/ssh";
import { environmentService } from "./environments.js";
import {
  parseEnvironmentDriverConfig,
  resolveEnvironmentDriverConfigForRuntime,
  stripSandboxProviderEnvelope,
} from "./environment-config.js";
import {
  acquireSandboxProviderLease,
  findReusableSandboxProviderLeaseId,
  isBuiltinSandboxProvider,
  releaseSandboxProviderLease,
  sandboxConfigFromLeaseMetadata,
  sandboxConfigFromLeaseMetadataLoose,
} from "./sandbox-provider-runtime.js";
import { pluginRegistryService } from "./plugin-registry.js";
import type { PluginWorkerManager } from "./plugin-worker-manager.js";
import {
  destroyPluginEnvironmentLease,
  executePluginEnvironmentCommand,
  realizePluginEnvironmentWorkspace,
  resolvePluginSandboxProviderDriverByKey,
  resolvePluginExecuteRpcTimeoutMs,
  resumePluginEnvironmentLease,
} from "./plugin-environment-driver.js";
import { collectSecretRefPaths } from "./json-schema-secret-refs.js";
import { buildWorkspaceRealizationRecordFromDriverInput } from "./workspace-realization.js";

/**
 * Builds the workspace portion of an environment lease context.
 * Both fields are null when no execution workspace has been persisted.
 */
export function buildEnvironmentLeaseContext(input: {
  persistedExecutionWorkspace: Pick<ExecutionWorkspace, "id" | "mode"> | null;
}) {
  return {
    executionWorkspaceId: input.persistedExecutionWorkspace?.id ?? null,
    executionWorkspaceMode: input.persistedExecutionWorkspace?.mode ?? null,
  };
}
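
A minimal usage sketch of `buildEnvironmentLeaseContext`, included only to illustrate the null fallback. The `WorkspaceRef` type, the `"ws-123"` id, and the `"isolated"` mode value are hypothetical stand-ins, not values taken from `@paperclipai/shared`; the function body is restated so the sketch runs on its own.

```typescript
// Hypothetical stand-in for Pick<ExecutionWorkspace, "id" | "mode">;
// the real field types live in @paperclipai/shared and may differ.
type WorkspaceRef = { id: string; mode: string };

// Same logic as buildEnvironmentLeaseContext, restated for a self-contained example.
function buildEnvironmentLeaseContext(input: {
  persistedExecutionWorkspace: WorkspaceRef | null;
}) {
  return {
    executionWorkspaceId: input.persistedExecutionWorkspace?.id ?? null,
    executionWorkspaceMode: input.persistedExecutionWorkspace?.mode ?? null,
  };
}

// A persisted workspace: both fields are copied through unchanged.
const withWorkspace = buildEnvironmentLeaseContext({
  persistedExecutionWorkspace: { id: "ws-123", mode: "isolated" },
});

// No persisted workspace: optional chaining plus ?? collapses both fields to null.
const withoutWorkspace = buildEnvironmentLeaseContext({
  persistedExecutionWorkspace: null,
});

console.log(withWorkspace);
console.log(withoutWorkspace);
```

Returning explicit `null` values rather than omitting the keys keeps the context shape stable for callers, whether or not a workspace exists.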
/**
 * Returns a deep copy of plugin lease metadata with every schema-declared
 * secret-ref field removed. Parent objects left empty by a removal are pruned.
 */
function stripSecretRefValuesFromPluginLeaseMetadata(input: {
  metadata: Record<string, unknown> | null | undefined;
  schema: Record<string, unknown> | null | undefined;
}): Record<string, unknown> {
  const sanitized = structuredClone(input.metadata ?? {}) as Record<string, unknown>;
  for (const path of collectSecretRefPaths(input.schema)) {
    const keys = path.split(".");
    const parents: Array<{ container: Record<string, unknown>; key: string }> = [];
    let cursor: Record<string, unknown> | null = sanitized;
    // Walk down to the object that holds the leaf key, remembering each parent.
    for (let index = 0; index < keys.length - 1; index += 1) {
      const key = keys[index]!;
      const next = cursor?.[key];
      if (!next || typeof next !== "object" || Array.isArray(next)) {
        cursor = null;
        break;
      }
      parents.push({ container: cursor, key });
      cursor = next as Record<string, unknown>;
    }
    if (!cursor) continue;
    const leafKey = keys[keys.length - 1]!;
    if (!Object.prototype.hasOwnProperty.call(cursor, leafKey)) continue;
    delete cursor[leafKey];
    // Prune now-empty ancestors, deepest first, stopping at the first non-empty one.
    for (let index = parents.length - 1; index >= 0; index -= 1) {
      const { container, key } = parents[index]!;
      const value = container[key];
      if (
        value &&
        typeof value === "object" &&
        !Array.isArray(value) &&
        Object.keys(value as Record<string, unknown>).length === 0
      ) {
        delete container[key];
      } else {
        break;
      }
    }
  }
  return sanitized;
}
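The pruning behavior of the function above can be illustrated with a small standalone sketch. It assumes the secret-ref paths are dot-separated strings such as `auth.token`; how `collectSecretRefPaths` derives them from a schema is not shown here, so the helper `stripPath`, the sample metadata, and the paths are all hypothetical. The sketch uses recursion rather than the explicit parent stack, but aims for the same result.

```typescript
type Json = Record<string, unknown>;

// Hypothetical helper: removes the value at `keys` and reports whether anything was deleted.
function stripPath(node: Json, keys: string[]): boolean {
  const [head, ...rest] = keys;
  if (head === undefined || !Object.prototype.hasOwnProperty.call(node, head)) return false;
  if (rest.length === 0) {
    delete node[head];
    return true;
  }
  const child = node[head];
  if (!child || typeof child !== "object" || Array.isArray(child)) return false;
  const removed = stripPath(child as Json, rest);
  // Prune the parent key only when the removal left it empty.
  if (removed && Object.keys(child).length === 0) delete node[head];
  return removed;
}

// Sample metadata (made up for illustration).
const metadata: Json = {
  region: "us-east",
  auth: { token: "s3cr3t" },
  limits: { cpu: 2, apiKey: "k" },
};

// Work on a copy so the input is left untouched.
const sanitized = JSON.parse(JSON.stringify(metadata)) as Json;
for (const path of ["auth.token", "limits.apiKey", "missing.field"]) {
  stripPath(sanitized, path.split("."));
}

console.log(JSON.stringify(sanitized));
// {"region":"us-east","limits":{"cpu":2}}
```

Here `auth` disappears entirely because its only field was a secret, while `limits` survives with its non-secret `cpu` field, and the unknown path is a no-op.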
export interface EnvironmentDriverAcquireInput {
  companyId: string;
  environment: Environment;
  issueId: string | null;
  heartbeatRunId: string;
  executionWorkspaceId: string | null;
  executionWorkspaceMode: ExecutionWorkspace["mode"] | null;
}

export interface EnvironmentDriverReleaseInput {
  environment: Environment;
  lease: EnvironmentLease;
  status: Extract<EnvironmentLeaseStatus, "released" | "expired" | "failed">;
}

export interface EnvironmentDriverLeaseInput {
  environment: Environment;
  lease: EnvironmentLease;
}

export interface EnvironmentDriverRealizeWorkspaceInput extends EnvironmentDriverLeaseInput {
  workspace: {
    localPath?: string;
    remotePath?: string;
    mode?: string;
    metadata?: Record<string, unknown>;
  };
}

export interface EnvironmentDriverExecuteInput extends EnvironmentDriverLeaseInput {
  command: string;
  args?: string[];
  cwd?: string;
  env?: Record<string, string>;
  stdin?: string;
  timeoutMs?: number;
}

export interface EnvironmentRuntimeDriver {
  readonly driver: string;
  acquireRunLease(input: EnvironmentDriverAcquireInput): Promise<EnvironmentLease>;
  releaseRunLease(input: EnvironmentDriverReleaseInput): Promise<EnvironmentLease | null>;
  resumeRunLease?(input: EnvironmentDriverLeaseInput): Promise<PluginEnvironmentLease | EnvironmentLease | null>;
  destroyRunLease?(input: EnvironmentDriverLeaseInput): Promise<EnvironmentLease | null>;
  realizeWorkspace?(input: EnvironmentDriverRealizeWorkspaceInput): Promise<PluginEnvironmentRealizeWorkspaceResult>;
  execute?(input: EnvironmentDriverExecuteInput): Promise<PluginEnvironmentExecuteResult>;
}

export interface EnvironmentRuntimeLeaseRecord {
  environment: Environment;
  lease: EnvironmentLease;
  leaseContext: ReturnType<typeof buildEnvironmentLeaseContext>;
}
Fix runtime state race, workspace sync, plugin startup, and orphaned leases (#4804) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents run inside environments that are leased, and the server manages runtime state, workspace configuration, and plugin lifecycle > - Several edge cases caused failures during concurrent operations: a race condition in runtime state insertion could produce duplicate-key errors, reused workspaces didn't sync their configuration when the parent issue was updated, sandbox provider plugins could be queried before registration completed, and orphaned environment leases from failed runs were never released > - This PR fixes these four runtime/environment issues > - The benefit is more reliable concurrent agent execution and proper resource cleanup ## What Changed - `services/heartbeat.ts`: Fixed a race condition where concurrent runtime state inserts could fail with a duplicate-key error by using an upsert pattern - `services/issues.ts`: Sync reused workspace configuration when an issue is updated, so the workspace reflects the latest issue state - `services/environment-runtime.ts`: Fixed a startup race where sandbox provider plugins could be queried before registration completed, by awaiting plugin readiness before resolving environment drivers - `services/heartbeat.ts`: Release environment leases for orphaned runs that lost their process without cleanup ## Verification - `pnpm test` — all existing and new tests pass, including new tests for runtime state upsert and process recovery lease cleanup - `pnpm typecheck` — clean - Manual: trigger concurrent agent runs to verify no duplicate-key failures; verify orphaned leases are released after process loss ## Risks - Low risk. The runtime state upsert changes insert-to-upsert behavior, which could mask a legitimate duplicate if two different runs produce the same key — but this is prevented by the run ID being part of the key. 
The plugin startup await is bounded by the existing registration timeout. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:37:10 -07:00
const DEFAULT_PLUGIN_SANDBOX_WORKER_READY_TIMEOUT_MS = 5_000;
const DEFAULT_PLUGIN_SANDBOX_WORKER_READY_POLL_MS = 100;

function delay(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
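The two defaults and the `delay` helper above suggest a bounded poll loop for plugin worker readiness. The actual wait logic is not shown in this part of the file, so the following is only a minimal sketch under that assumption; `waitUntilReady` and its `isReady` callback are hypothetical names.

```typescript
function delay(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical helper: polls `isReady` until it returns true or the deadline passes.
// Resolves to true when ready, false on timeout.
async function waitUntilReady(
  isReady: () => boolean,
  timeoutMs = 5_000,
  pollMs = 100,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (isReady()) return true;
    if (Date.now() >= deadline) return false;
    await delay(pollMs);
  }
}

// Usage: a fake readiness check that succeeds on the third poll.
let checks = 0;
const ready = waitUntilReady(() => ++checks >= 3, 1_000, 10);
ready.then((ok) => console.log(`ready=${ok} after ${checks} checks`));
```

Checking readiness before checking the deadline means a worker that is already registered never pays the poll delay, and the timeout only bounds how long a missing worker is waited for.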
/** Prefers the driver recorded in the lease metadata, falling back to the environment's driver. */
function getLeaseDriverKey(lease: Pick<EnvironmentLease, "metadata">, environment: Pick<Environment, "driver">): string {
  const leaseDriver = typeof lease.metadata?.driver === "string" ? lease.metadata.driver : null;
  return leaseDriver ?? environment.driver;
}

export function findReusableSandboxLeaseId(input: {
  config: SandboxEnvironmentConfig;
  leases: Array<Pick<EnvironmentLease, "providerLeaseId" | "metadata">>;
}): string | null {
  return findReusableSandboxProviderLeaseId(input);
}

/** Driver for work that runs on the host itself; leases are ephemeral. */
function createLocalEnvironmentDriver(db: Db): EnvironmentRuntimeDriver {
  const environmentsSvc = environmentService(db);
  return {
    driver: "local",
    async acquireRunLease(input) {
      return await environmentsSvc.acquireLease({
        companyId: input.companyId,
        environmentId: input.environment.id,
        executionWorkspaceId: input.executionWorkspaceId,
        issueId: input.issueId,
        heartbeatRunId: input.heartbeatRunId,
        leasePolicy: "ephemeral",
        provider: "local",
        metadata: {
          driver: input.environment.driver,
          executionWorkspaceMode: input.executionWorkspaceMode,
        },
      });
    },
    async releaseRunLease(input) {
      return await environmentsSvc.releaseLease(input.lease.id, input.status);
    },
    async realizeWorkspace(input) {
      const record = buildWorkspaceRealizationRecordFromDriverInput({
        environment: input.environment,
        lease: input.lease,
        workspace: input.workspace,
        cwd: input.workspace.localPath ?? input.workspace.remotePath ?? null,
      });
      return {
        // Fall back to the filesystem root when the workspace declares no path.
        cwd: input.workspace.localPath ?? input.workspace.remotePath ?? "/",
        metadata: {
          workspaceRealization: record,
        },
      };
    },
  };
}
/** Driver for work that runs on a remote host reached over SSH; leases are ephemeral. */
function createSshEnvironmentDriver(db: Db): EnvironmentRuntimeDriver {
  const environmentsSvc = environmentService(db);
  return {
    driver: "ssh",
    async acquireRunLease(input) {
      const parsed = await resolveEnvironmentDriverConfigForRuntime(db, input.companyId, input.environment);
      if (parsed.driver !== "ssh") {
        throw new Error(`Expected SSH environment config for driver "${input.environment.driver}".`);
      }
      const { remoteCwd } = await ensureSshWorkspaceReady(parsed.config);
      // Candidate API URLs come from a JSON array in the environment; anything
      // missing, malformed, or not a non-empty string is ignored.
      const candidateUrls = (() => {
        const raw = process.env.PAPERCLIP_RUNTIME_API_CANDIDATES_JSON;
        if (!raw) return [];
        try {
          const decoded = JSON.parse(raw);
          return Array.isArray(decoded)
            ? decoded.filter((value): value is string => typeof value === "string" && value.trim().length > 0)
            : [];
        } catch {
          return [];
        }
      })();
      const paperclipApiUrl = await findReachablePaperclipApiUrlOverSsh({
        config: parsed.config,
        candidates: candidateUrls,
      });
      if (!paperclipApiUrl) {
        throw new Error(
          `SSH environment ${parsed.config.username}@${parsed.config.host} could not reach any Paperclip API candidates.`,
        );
      }
      return await environmentsSvc.acquireLease({
        companyId: input.companyId,
        environmentId: input.environment.id,
        executionWorkspaceId: input.executionWorkspaceId,
        issueId: input.issueId,
        heartbeatRunId: input.heartbeatRunId,
        leasePolicy: "ephemeral",
        provider: "ssh",
        providerLeaseId: `ssh://${parsed.config.username}@${parsed.config.host}:${parsed.config.port}${remoteCwd}`,
        metadata: {
          driver: input.environment.driver,
          executionWorkspaceMode: input.executionWorkspaceMode,
          host: parsed.config.host,
          port: parsed.config.port,
          username: parsed.config.username,
          remoteWorkspacePath: parsed.config.remoteWorkspacePath,
          remoteCwd,
          paperclipApiUrl,
        },
      });
    },
    async releaseRunLease(input) {
      return await environmentsSvc.releaseLease(input.lease.id, input.status);
    },
    async realizeWorkspace(input) {
      const record = buildWorkspaceRealizationRecordFromDriverInput({
        environment: input.environment,
        lease: input.lease,
        workspace: input.workspace,
        // Prefer the remote cwd recorded on the lease at acquire time.
        cwd:
          typeof input.lease.metadata?.remoteCwd === "string" && input.lease.metadata.remoteCwd.trim().length > 0
            ? input.lease.metadata.remoteCwd.trim()
            : input.workspace.remotePath ?? input.workspace.localPath ?? null,
      });
      return {
        cwd: record.remote.path ?? record.local.path,
        metadata: {
          workspaceRealization: record,
        },
      };
    },
  };
}
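The SSH driver reads its candidate API URLs from `PAPERCLIP_RUNTIME_API_CANDIDATES_JSON` and tolerates bad input. A minimal sketch of that parsing, pulled out into a standalone function so the accepted shapes are easy to see; the function name `parseCandidateUrls` and the sample URL are assumptions made for illustration.

```typescript
// Hypothetical extraction of the inline parsing logic: returns only the
// non-empty string entries of a JSON array, and an empty list for anything else.
function parseCandidateUrls(raw: string | undefined): string[] {
  if (!raw) return [];
  try {
    const decoded: unknown = JSON.parse(raw);
    return Array.isArray(decoded)
      ? decoded.filter((value): value is string => typeof value === "string" && value.trim().length > 0)
      : [];
  } catch {
    // Malformed JSON is treated the same as an unset variable.
    return [];
  }
}

// A well-formed array: blank strings and non-strings are dropped.
const valid = parseCandidateUrls('["http://10.0.0.5:3100", "   ", 42]');
// A JSON object instead of an array, truncated JSON, and an unset variable.
const notArray = parseCandidateUrls('{"url": "http://10.0.0.5:3100"}');
const malformed = parseCandidateUrls('["http://10.0.0.5:3100"');
const unset = parseCandidateUrls(undefined);

console.log(valid, notArray, malformed, unset);
```

Because every failure mode collapses to an empty list, a misconfigured variable surfaces later as the "could not reach any Paperclip API candidates" error rather than as a parse exception.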
function createSandboxEnvironmentDriver(
  db: Db,
  options: {
    pluginWorkerManager?: PluginWorkerManager;
    pluginWorkerReadyTimeoutMs?: number;
    pluginWorkerReadyPollMs?: number;
  } = {},
): EnvironmentRuntimeDriver {
  const pluginWorkerManager = options.pluginWorkerManager;
  const pluginWorkerReadyTimeoutMs = options.pluginWorkerReadyTimeoutMs ?? DEFAULT_PLUGIN_SANDBOX_WORKER_READY_TIMEOUT_MS;
  const pluginWorkerReadyPollMs = options.pluginWorkerReadyPollMs ?? DEFAULT_PLUGIN_SANDBOX_WORKER_READY_POLL_MS;
const environmentsSvc = environmentService(db);
Fix runtime state race, workspace sync, plugin startup, and orphaned leases (#4804) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents run inside environments that are leased, and the server manages runtime state, workspace configuration, and plugin lifecycle > - Several edge cases caused failures during concurrent operations: a race condition in runtime state insertion could produce duplicate-key errors, reused workspaces didn't sync their configuration when the parent issue was updated, sandbox provider plugins could be queried before registration completed, and orphaned environment leases from failed runs were never released > - This PR fixes these four runtime/environment issues > - The benefit is more reliable concurrent agent execution and proper resource cleanup ## What Changed - `services/heartbeat.ts`: Fixed a race condition where concurrent runtime state inserts could fail with a duplicate-key error by using an upsert pattern - `services/issues.ts`: Sync reused workspace configuration when an issue is updated, so the workspace reflects the latest issue state - `services/environment-runtime.ts`: Fixed a startup race where sandbox provider plugins could be queried before registration completed, by awaiting plugin readiness before resolving environment drivers - `services/heartbeat.ts`: Release environment leases for orphaned runs that lost their process without cleanup ## Verification - `pnpm test` — all existing and new tests pass, including new tests for runtime state upsert and process recovery lease cleanup - `pnpm typecheck` — clean - Manual: trigger concurrent agent runs to verify no duplicate-key failures; verify orphaned leases are released after process loss ## Risks - Low risk. The runtime state upsert changes insert-to-upsert behavior, which could mask a legitimate duplicate if two different runs produce the same key — but this is prevented by the run ID being part of the key. 
The plugin startup await is bounded by the existing registration timeout. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-29 16:37:10 -07:00
// Resolves the plugin-backed driver for a sandbox provider and reports one of
// four states: "running", "missing", "not_ready", or "worker_unavailable".
async function resolveSandboxProviderPlugin(input: { provider: string }) {
  // Fast path: the plugin worker is already running.
  const running = await resolvePluginSandboxProviderDriverByKey({
    db,
    driverKey: input.provider,
    workerManager: pluginWorkerManager,
    requireRunning: true,
  });
  if (running) {
    return { state: "running" as const, resolved: running };
  }
  // Not running: check whether the plugin is installed at all.
  const installed = await resolvePluginSandboxProviderDriverByKey({
    db,
    driverKey: input.provider,
    workerManager: pluginWorkerManager,
    requireRunning: false,
  });
  if (!installed) {
    return { state: "missing" as const, resolved: null };
  }
  if (installed.plugin.status !== "ready") {
    return { state: "not_ready" as const, resolved: installed };
  }
  if (!pluginWorkerManager) {
    return { state: "worker_unavailable" as const, resolved: installed };
  }
  // The plugin is ready but its worker has not started yet: poll until the
  // worker is running or the timeout elapses.
  const deadline = Date.now() + Math.max(0, pluginWorkerReadyTimeoutMs);
  while (Date.now() < deadline) {
    const retried = await resolvePluginSandboxProviderDriverByKey({
      db,
      driverKey: input.provider,
      workerManager: pluginWorkerManager,
      requireRunning: true,
    });
    if (retried) {
      return { state: "running" as const, resolved: retried };
    }
    await delay(Math.max(1, pluginWorkerReadyPollMs));
  }
  return { state: "worker_unavailable" as const, resolved: installed };
}
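The wait in `resolveSandboxProviderPlugin` is a deadline-bounded polling loop. A minimal, standalone sketch of that pattern follows. The helper name `pollUntil` and its parameters are hypothetical and do not exist in this file; the sketch only assumes a runtime with `setTimeout` and `Promise`.

```typescript
// Hypothetical helper illustrating deadline-bounded polling; not part of this file.
async function pollUntil<T>(
  attempt: () => Promise<T | null>,
  timeoutMs: number,
  pollMs: number,
): Promise<T | null> {
  // Clamp the timeout so a negative value behaves like "do not wait".
  const deadline = Date.now() + Math.max(0, timeoutMs);
  while (Date.now() < deadline) {
    const result = await attempt();
    if (result !== null) {
      return result;
    }
    // Sleep at least 1 ms between attempts to avoid a busy loop.
    await new Promise<void>((resolve) => setTimeout(resolve, Math.max(1, pollMs)));
  }
  // Deadline passed without a non-null result.
  return null;
}
```

With a timeout of zero the loop body never runs, so the caller relies entirely on the check it made before polling, which is how the function above behaves when `pluginWorkerReadyTimeoutMs` is zero or negative.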
// Resolves the runtime config passed to a plugin sandbox provider, preferring
// the config recorded on the lease over the environment's current config.
async function resolvePluginSandboxRuntimeConfig(input: {
  environment: Environment;
  lease: EnvironmentLease;
  provider: string;
}): Promise<Record<string, unknown>> {
  // 1. Config captured in the lease metadata, if it names the same provider.
  const metadataConfig = sandboxConfigFromLeaseMetadataLoose(input.lease);
  if (metadataConfig && metadataConfig.provider === input.provider) {
    const parsed = await resolveEnvironmentDriverConfigForRuntime(db, input.lease.companyId, {
      driver: "sandbox",
      config: sandboxConfigForLeaseMetadata(metadataConfig),
    });
    if (parsed.driver === "sandbox") {
      return parsed.config as unknown as Record<string, unknown>;
    }
  }
  // 2. The environment's current config, if it still targets this provider.
  if (input.environment.driver === "sandbox") {
    try {
      const parsed = await resolveEnvironmentDriverConfigForRuntime(
        db,
        input.lease.companyId,
        input.environment,
      );
      if (parsed.driver === "sandbox" && parsed.config.provider === input.provider) {
        return parsed.config as unknown as Record<string, unknown>;
      }
    } catch {
      // Lease metadata below is intentionally kept sufficient for cleanup
      // after the environment config changes or becomes invalid.
    }
  }
  // 3. Last resort: the sanitized lease metadata plus the provider key.
  return {
    provider: input.provider,
    ...sanitizePluginSandboxConfigFromLeaseMetadata(input.lease.metadata),
  };
}
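`resolvePluginSandboxRuntimeConfig` tries an ordered list of config sources and falls back to lease metadata when none matches. The sketch below shows that fallback chain in isolation. `firstMatchingConfig`, `Config`, and `ConfigSource` are hypothetical names introduced for illustration. As a simplification, the sketch swallows errors from every source, whereas the function above only tolerates failures while resolving the environment config.

```typescript
// Hypothetical types and helper; a simplified illustration, not this file's API.
type Config = Record<string, unknown>;
type ConfigSource = () => Promise<Config | null>;

async function firstMatchingConfig(
  provider: string,
  sources: ConfigSource[],
  fallback: Config,
): Promise<Config> {
  for (const source of sources) {
    try {
      const config = await source();
      // Accept a source only if it yields a config for the expected provider.
      if (config !== null && config.provider === provider) {
        return config;
      }
    } catch {
      // Simplification: a failing source is skipped and the next one is tried.
    }
  }
  // No source matched: build a config from the fallback fields.
  return { provider, ...fallback };
}
```

Because the fallback fields are spread after `provider`, a `provider` key inside the fallback would override the explicit one, which matches the ordering of the final `return` in the function above.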
return {
  driver: "sandbox",
  async acquireRunLease(input) {
Generalize sandbox provider core for plugin-only providers (#4449) ## Thinking Path > - Paperclip is a control plane, so optional execution providers should sit at the plugin edge instead of hardcoding provider-specific behavior into core shared/server/ui layers. > - Sandbox environments are already first-class, and the fake provider proves the built-in path; the remaining gap was that real providers still leaked provider-specific config and runtime assumptions into core. > - That coupling showed up in config normalization, secret persistence, capabilities reporting, lease reconstruction, and the board UI form fields. > - As long as core knew about those provider-shaped details, shipping a provider as a pure third-party plugin meant every new provider would still require host changes. > - This pull request generalizes the sandbox provider seam around schema-driven plugin metadata and generic secret-ref handling. > - The runtime and UI now consume provider metadata generically, so core only special-cases the built-in fake provider while third-party providers can live entirely in plugins. ## What Changed - Added generic sandbox-provider capability metadata so plugin-backed providers can expose `configSchema` through shared environment support and the environments capabilities API. - Reworked sandbox config normalization/persistence/runtime resolution to handle schema-declared secret-ref fields generically, storing them as Paperclip secrets and resolving them for probe/execute/release flows. - Generalized plugin sandbox runtime handling so provider validation, reusable-lease matching, lease reconstruction, and plugin worker calls all operate on provider-agnostic config instead of provider-shaped branches. - Replaced hardcoded sandbox provider form fields in Company Settings with schema-driven rendering and blocked agent environment selection from the built-in fake provider. 
- Added regression coverage for the generic seam across shared support helpers plus environment config, probe, routes, runtime, and sandbox-provider runtime tests. ## Verification - `pnpm vitest --run packages/shared/src/environment-support.test.ts server/src/__tests__/environment-config.test.ts server/src/__tests__/environment-probe.test.ts server/src/__tests__/environment-routes.test.ts server/src/__tests__/environment-runtime.test.ts server/src/__tests__/sandbox-provider-runtime.test.ts` - `pnpm -r typecheck` ## Risks - Plugin sandbox providers now depend more heavily on accurate `configSchema` declarations; incorrect schemas can misclassify secret-bearing fields or omit required config. - Reusable lease matching is now metadata-driven for plugin-backed providers, so providers that fail to persist stable metadata may reprovision instead of resuming an existing lease. - The UI form is now fully schema-driven for plugin-backed sandbox providers; provider manifests without good defaults or descriptions may produce a rougher operator experience. ## Model Used - OpenAI Codex via `codex_local` - Model ID: `gpt-5.4` - Reasoning effort: `high` - Context window observed in runtime session metadata: `258400` tokens - Capabilities used: terminal tool execution, git, and local code/test inspection ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge
2026-04-24 18:03:41 -07:00
    const storedParsed = parseEnvironmentDriverConfig(input.environment);
    const parsed = await resolveEnvironmentDriverConfigForRuntime(db, input.companyId, input.environment);
    if (parsed.driver !== "sandbox" || storedParsed.driver !== "sandbox") {
      throw new Error(`Expected sandbox environment config for driver "${input.environment.driver}".`);
    }
    // Check if this provider should be handled by a plugin.
    if (!isBuiltinSandboxProvider(parsed.config.provider)) {
      const pluginProvider = await resolveSandboxProviderPlugin({
        provider: parsed.config.provider,
      });
if (pluginProvider.state === "missing") {
throw new Error(
`Sandbox provider "${parsed.config.provider}" is not registered as a built-in provider and no matching plugin is available.`,
);
}
if (pluginProvider.state === "not_ready") {
throw new Error(
`Sandbox provider "${parsed.config.provider}" is installed via plugin "${pluginProvider.resolved.plugin.pluginKey}", but that plugin is currently ${pluginProvider.resolved.plugin.status}.`,
);
}
if (pluginProvider.state === "worker_unavailable") {
throw new Error(
`Sandbox provider "${parsed.config.provider}" is installed via plugin "${pluginProvider.resolved.plugin.pluginKey}", but its worker is not running.`,
);
}
if (!pluginWorkerManager) {
throw new Error(
`Sandbox provider "${parsed.config.provider}" is installed, but sandbox plugin workers are unavailable in this server process.`,
);
}
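// Config passed to the plugin worker, with the sandbox provider envelope stripped.
const workerConfig = stripSandboxProviderEnvelope(parsed.config);
const storedConfig = storedParsed.config;
// Existing leases are only looked up when the environment opts into lease reuse.
const existingLeases = parsed.config.reuseLease
? await environmentsSvc.listLeases(input.environment.id)
: [];
// Reusable leases are matched against the stored config.
const reusableProviderLeaseId = parsed.config.reuseLease
? findReusableSandboxLeaseId({ config: storedConfig, leases: existingLeases })
: null;
const reusableLease = reusableProviderLeaseId
? existingLeases.find((lease) => lease.providerLeaseId === reusableProviderLeaseId)
: null;
// Try to resume the reusable lease first, if one was found.
const providerLease = reusableLease?.providerLeaseId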
? await pluginWorkerManager.call(
pluginProvider.resolved.plugin.id,
"environmentResumeLease",
{
driverKey: parsed.config.provider,
companyId: input.companyId,
environmentId: input.environment.id,
config: workerConfig,
providerLeaseId: reusableLease.providerLeaseId,
leaseMetadata: reusableLease.metadata ?? undefined,
},
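// A resume result without a non-empty providerLeaseId counts as a failed resume.
).then((resumed) =>
typeof resumed.providerLeaseId === "string" && resumed.providerLeaseId.length > 0
? resumed
: null,
// A rejected resume call is swallowed; a fresh lease is acquired below instead.
).catch(() => null)
: null;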
const acquiredLease = providerLease ?? await pluginWorkerManager.call(
pluginProvider.resolved.plugin.id,
"environmentAcquireLease",
{
driverKey: parsed.config.provider,
companyId: input.companyId,
environmentId: input.environment.id,
config: workerConfig,
runId: input.heartbeatRunId,
workspaceMode: input.executionWorkspaceMode ?? undefined,
},
);
// Map the reuseLease config flag onto the lease policy passed to acquireLease.
const resolvedLeasePolicy = parsed.config.reuseLease
? "reuse_by_environment"
: "ephemeral";
return await environmentsSvc.acquireLease({
companyId: input.companyId,
environmentId: input.environment.id,
executionWorkspaceId: input.executionWorkspaceId,
issueId: input.issueId,
heartbeatRunId: input.heartbeatRunId,
leasePolicy: resolvedLeasePolicy,
provider: parsed.config.provider,
providerLeaseId: acquiredLease.providerLeaseId,
expiresAt: acquiredLease.expiresAt ? new Date(acquiredLease.expiresAt) : undefined,
metadata: {
driver: input.environment.driver,
executionWorkspaceMode: input.executionWorkspaceMode,
pluginId: pluginProvider.resolved.plugin.id,
pluginKey: pluginProvider.resolved.plugin.pluginKey,
sandboxProviderPlugin: true,
...sandboxConfigForLeaseMetadata(storedConfig),
...stripSecretRefValuesFromPluginLeaseMetadata({
metadata: acquiredLease.metadata,
schema: pluginProvider.resolved.driver.configSchema as Record<string, unknown> | null | undefined,
}),
          },
        });
      }
      // Built-in sandbox provider path.
      const reusableProviderLeaseId = parsed.config.reuseLease
        ? (await environmentsSvc
            .listLeases(input.environment.id)
            .then((leases) => findReusableSandboxLeaseId({ config: parsed.config, leases })))
        : null;
      const providerLease = await acquireSandboxProviderLease({
        config: parsed.config,
        environmentId: input.environment.id,
        heartbeatRunId: input.heartbeatRunId,
        issueId: input.issueId,
        reusableProviderLeaseId,
      });
      const resolvedLeasePolicy = parsed.config.reuseLease
        ? "reuse_by_environment"
        : "ephemeral";
      return await environmentsSvc.acquireLease({
        companyId: input.companyId,
        environmentId: input.environment.id,
        executionWorkspaceId: input.executionWorkspaceId,
        issueId: input.issueId,
        heartbeatRunId: input.heartbeatRunId,
        leasePolicy: resolvedLeasePolicy,
        provider: parsed.config.provider,
        providerLeaseId: providerLease.providerLeaseId,
        metadata: {
          driver: input.environment.driver,
          executionWorkspaceMode: input.executionWorkspaceMode,
          ...providerLease.metadata,
        },
      });
    },
    async releaseRunLease(input) {
      // Leases acquired through a plugin are released through the plugin path.
      if (input.lease.metadata?.sandboxProviderPlugin) {
        return await releasePluginBackedSandboxLease(input);
      }
      const metadataConfig = sandboxConfigFromLeaseMetadata(input.lease);
      // If no built-in provider handles this metadata, try the plugin path.
      if (!metadataConfig) {
        const looseConfig = sandboxConfigFromLeaseMetadataLoose(input.lease);
        if (looseConfig && !isBuiltinSandboxProvider(looseConfig.provider)) {
          return await releasePluginBackedSandboxLease(input);
        }
      }
      // Prefer the config reconstructed from lease metadata; otherwise fall back
      // to the environment's stored config.
      const parsed = metadataConfig
        ? await resolveEnvironmentDriverConfigForRuntime(db, input.lease.companyId, {
            driver: "sandbox",
            config: metadataConfig as unknown as Record<string, unknown>,
          })
        : await resolveEnvironmentDriverConfigForRuntime(db, input.lease.companyId, input.environment);
      if (parsed.driver !== "sandbox") {
        throw new Error(`Expected sandbox environment config for lease "${input.lease.id}".`);
      }
      let cleanupStatus: "success" | "failed" = "success";
      try {
        await releaseSandboxProviderLease({
          config: parsed.config,
          providerLeaseId: input.lease.providerLeaseId,
          status: input.status,
        });
      } catch {
        // Provider cleanup failures are recorded on the lease rather than rethrown,
        // so the lease record is still released below.
        cleanupStatus = "failed";
      }
      const releaseStatus = input.lease.leasePolicy === "retain_on_failure" && input.status === "failed"
        ? "retained" as const
        : input.status;
      return await environmentsSvc.releaseLease(input.lease.id, releaseStatus, {
        failureReason: input.status === "failed" ? "adapter_or_run_failure" : undefined,
        cleanupStatus,
      });
    },
    async realizeWorkspace(input) {
      // Plugin-backed sandbox providers: delegate workspace realization.
      if (input.lease.metadata?.sandboxProviderPlugin && pluginWorkerManager) {
        const pluginId = readString(input.lease.metadata?.pluginId);
        const providerKey =
          readString(input.lease.metadata?.provider) ??
          (input.environment.driver === "sandbox"
            ? (parseEnvironmentDriverConfig(input.environment).config as SandboxEnvironmentConfig).provider
            : null);
        if (pluginId && providerKey) {
          const config = await resolvePluginSandboxRuntimeConfig({
            environment: input.environment,
            lease: input.lease,
            provider: providerKey,
          });
          return await pluginWorkerManager.call(pluginId, "environmentRealizeWorkspace", {
            driverKey: providerKey,
            companyId: input.lease.companyId,
            environmentId: input.environment.id,
            config: stripSandboxProviderEnvelope(config as SandboxEnvironmentConfig),
            lease: {
              providerLeaseId: input.lease.providerLeaseId,
              metadata: input.lease.metadata ?? undefined,
              expiresAt: input.lease.expiresAt?.toISOString() ?? null,
            },
            workspace: input.workspace,
          });
        }
      }
      const record = buildWorkspaceRealizationRecordFromDriverInput({
        environment: input.environment,
        lease: input.lease,
        workspace: input.workspace,
        cwd:
          typeof input.lease.metadata?.remoteCwd === "string" && input.lease.metadata.remoteCwd.trim().length > 0
            ? input.lease.metadata.remoteCwd.trim()
            : input.workspace.remotePath ?? input.workspace.localPath ?? null,
      });
      return {
        cwd: record.remote.path ?? record.local.path,
        metadata: {
          workspaceRealization: record,
        },
      };
    },
    async execute(input) {
      // Plugin-backed sandbox providers: delegate command execution.
      if (input.lease.metadata?.sandboxProviderPlugin && pluginWorkerManager) {
        const pluginId = readString(input.lease.metadata?.pluginId);
        const providerKey = readString(input.lease.metadata?.provider);
        if (pluginId && providerKey) {
          const config = await resolvePluginSandboxRuntimeConfig({
            environment: input.environment,
            lease: input.lease,
            provider: providerKey,
          });
          const sanitizedConfig = stripSandboxProviderEnvelope(config as SandboxEnvironmentConfig);
          return await pluginWorkerManager.call(pluginId, "environmentExecute", {
            driverKey: providerKey,
            companyId: input.lease.companyId,
            environmentId: input.environment.id,
            config: sanitizedConfig,
            lease: {
              providerLeaseId: input.lease.providerLeaseId,
              metadata: input.lease.metadata ?? undefined,
              expiresAt: input.lease.expiresAt?.toISOString() ?? null,
            },
            command: input.command,
            args: input.args,
            cwd: input.cwd,
            env: input.env,
            stdin: input.stdin,
            timeoutMs: input.timeoutMs,
}, resolvePluginExecuteRpcTimeoutMs({
requestedTimeoutMs: input.timeoutMs,
config: sanitizedConfig,
}));
}
}
throw new Error("Sandbox driver does not support direct command execution for built-in providers.");
},
};
  async function releasePluginBackedSandboxLease(
    input: EnvironmentDriverReleaseInput,
  ): Promise<EnvironmentLease | null> {
    const metadata = input.lease.metadata ?? {};
    const pluginId = readString(metadata.pluginId);
    const providerKey = readString(metadata.provider);
    let cleanupStatus: "success" | "failed" = "success";

    // Provider-side cleanup is only attempted when the lease names a plugin and
    // provider and that plugin's worker is currently running.
    if (pluginId && providerKey && pluginWorkerManager?.isRunning(pluginId)) {
      try {
        const config = await resolvePluginSandboxRuntimeConfig({
          environment: input.environment,
          lease: input.lease,
          provider: providerKey,
        });
        await pluginWorkerManager.call(pluginId, "environmentReleaseLease", {
          driverKey: providerKey,
          companyId: input.lease.companyId,
          environmentId: input.environment.id,
          config: stripSandboxProviderEnvelope(config as SandboxEnvironmentConfig),
          providerLeaseId: input.lease.providerLeaseId,
          leaseMetadata: metadata,
        });
      } catch {
        // A failed provider call is recorded but does not block the lease release below.
        cleanupStatus = "failed";
      }
    } else {
      cleanupStatus = "failed";
    }

    // Leases with the "retain_on_failure" policy are marked "retained" when the run failed.
    const releaseStatus =
      input.lease.leasePolicy === "retain_on_failure" && input.status === "failed"
        ? ("retained" as const)
        : input.status;
    return await environmentsSvc.releaseLease(input.lease.id, releaseStatus, {
      failureReason: input.status === "failed" ? "adapter_or_run_failure" : undefined,
      cleanupStatus,
    });
  }
}
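/*
Illustrative sketch, not part of this module: the release-status rule applied by
releasePluginBackedSandboxLease, pulled out as a pure function so it can be read
in isolation. The type names and `resolveReleaseStatusSketch` are hypothetical,
and the unions list only the values visible in this file; the real types may
allow more.

```typescript
// Assumed value sets, taken from the literals that appear in this file.
type LeasePolicySketch = "ephemeral" | "retain_on_failure";
type RunOutcomeSketch = "released" | "failed";
type ReleaseStatusSketch = RunOutcomeSketch | "retained";

// A failed run under "retain_on_failure" is recorded as "retained";
// every other combination passes the run outcome through unchanged.
function resolveReleaseStatusSketch(
  policy: LeasePolicySketch,
  outcome: RunOutcomeSketch,
): ReleaseStatusSketch {
  return policy === "retain_on_failure" && outcome === "failed" ? "retained" : outcome;
}

const retainedOnFailure = resolveReleaseStatusSketch("retain_on_failure", "failed");
const retainPolicySuccess = resolveReleaseStatusSketch("retain_on_failure", "released");
const ephemeralFailure = resolveReleaseStatusSketch("ephemeral", "failed");
console.log(retainedOnFailure, retainPolicySuccess, ephemeralFailure);
```
*/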
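/** Parses a provider-supplied expiry timestamp; returns null when it is missing or not a valid date. */
function parseExpiresAt(value: string | null | undefined): Date | null {
  if (!value) return null;
  const parsed = new Date(value);
  return Number.isNaN(parsed.getTime()) ? null : parsed;
}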
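/*
Illustrative sketch, not part of this module: how parseExpiresAt treats valid,
malformed, and missing timestamps. `parseExpiresAtSketch` is a standalone copy
made only so the snippet runs on its own, and the sample timestamp is made up.

```typescript
// Standalone copy of the helper above.
function parseExpiresAtSketch(value: string | null | undefined): Date | null {
  if (!value) return null;
  const parsed = new Date(value);
  // An unparseable string yields an Invalid Date whose getTime() is NaN.
  return Number.isNaN(parsed.getTime()) ? null : parsed;
}

const parsedValidSketch = parseExpiresAtSketch("2026-04-24T19:15:53.000Z");
const parsedMalformedSketch = parseExpiresAtSketch("not-a-date");
const parsedMissingSketch = parseExpiresAtSketch(undefined);
console.log(parsedValidSketch, parsedMalformedSketch, parsedMissingSketch);
```

Returning null rather than an Invalid Date means a lease created from a
malformed provider response is stored without an expiry.
*/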
/** Formats a plugin environment driver as "<pluginKey>:<driverKey>". */
function pluginDriverProviderKey(config: PluginEnvironmentConfig): string {
  return `${config.pluginKey}:${config.driverKey}`;
}
/** Returns the value when it is a non-empty string, otherwise null. */
function readString(value: unknown): string | null {
  return typeof value === "string" && value.length > 0 ? value : null;
}
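// Lease-metadata keys that are internal to Paperclip; they are skipped when
// provider config is rebuilt from lease metadata.
const INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS = new Set([
  "driver",
  "executionWorkspaceMode",
  "pluginId",
  "pluginKey",
  "providerMetadata",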
  "shellCommand",
  "sandboxProviderPlugin",
]);
/** Copies lease metadata into a provider config object, dropping the internal keys listed above. */
function sanitizePluginSandboxConfigFromLeaseMetadata(
  metadata: Record<string, unknown> | null | undefined,
): Record<string, unknown> {
  const sanitized: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(metadata ?? {})) {
    if (INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS.has(key)) continue;
    sanitized[key] = value;
  }
  return sanitized;
}
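/*
Illustrative sketch, not part of this module: what
sanitizePluginSandboxConfigFromLeaseMetadata keeps and drops. The key set
mirrors INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS as it appears in this file; the
`template` and `timeoutMs` fields and all sample values are hypothetical
provider config, chosen only to show what survives.

```typescript
// Standalone copy of the internal key set.
const INTERNAL_KEYS_SKETCH = new Set([
  "driver",
  "executionWorkspaceMode",
  "pluginId",
  "pluginKey",
  "providerMetadata",
  "shellCommand",
  "sandboxProviderPlugin",
]);

// Standalone copy of the sanitizer: copy every entry except internal keys.
function sanitizeLeaseMetadataSketch(
  metadata: Record<string, unknown> | null | undefined,
): Record<string, unknown> {
  const sanitized: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(metadata ?? {})) {
    if (INTERNAL_KEYS_SKETCH.has(key)) continue;
    sanitized[key] = value;
  }
  return sanitized;
}

const sampleLeaseMetadataSketch: Record<string, unknown> = {
  pluginId: "plugin-123", // internal, dropped
  shellCommand: "bash", // internal, dropped
  template: "base", // hypothetical provider field, kept
  timeoutMs: 300000, // hypothetical provider field, kept
};
const providerFacingConfigSketch = sanitizeLeaseMetadataSketch(sampleLeaseMetadataSketch);
console.log(Object.keys(providerFacingConfigSketch));
```
*/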
function sandboxConfigForLeaseMetadata(config: SandboxEnvironmentConfig): Record<string, unknown> {
  return { ...config };
}
/** Returns the environment's current plugin config, or null when it is not a plugin environment or its config fails to parse. */
function tryParseCurrentPluginConfig(environment: Environment): PluginEnvironmentConfig | null {
  if (environment.driver !== "plugin") {
    return null;
  }
  try {
    const parsed = parseEnvironmentDriverConfig(environment);
    return parsed.driver === "plugin" ? parsed.config : null;
  } catch {
    return null;
  }
}
function createPluginEnvironmentDriver(
  db: Db,
  workerManager: PluginWorkerManager,
): EnvironmentRuntimeDriver {
  const environmentsSvc = environmentService(db);
  const pluginRegistry = pluginRegistryService(db);

  // Looks up the plugin for a config and verifies that it is ready, declares
  // the requested driver, and has a running worker.
  async function resolvePluginDriver(config: PluginEnvironmentConfig) {
    const plugin = await pluginRegistry.getByKey(config.pluginKey);
    if (!plugin || plugin.status !== "ready") {
      throw new Error(`Plugin environment driver "${pluginDriverProviderKey(config)}" is not ready.`);
    }
    const driver = plugin.manifestJson.environmentDrivers?.find(
      (candidate) => candidate.driverKey === config.driverKey,
    );
    if (!driver) {
      throw new Error(`Plugin "${config.pluginKey}" does not declare environment driver "${config.driverKey}".`);
    }
    if (!workerManager.isRunning(plugin.id)) {
      throw new Error(`Plugin environment driver "${pluginDriverProviderKey(config)}" has no running worker.`);
    }
    return { plugin };
  }
  // Resolves the plugin and driver for an existing lease. Identifiers recorded
  // in the lease metadata take precedence over the environment's current
  // config, so a lease can still be handled after the environment is edited.
  async function resolvePluginDriverForRelease(input: EnvironmentDriverReleaseInput) {
    const metadata = input.lease.metadata ?? {};
    const metadataPluginId = readString(metadata.pluginId);
    const metadataPluginKey = readString(metadata.pluginKey);
    const metadataDriverKey = readString(metadata.driverKey);
    const currentConfig = tryParseCurrentPluginConfig(input.environment);

    // No identifiers in the lease metadata: fall back entirely to the current config.
    if (!metadataPluginId && !metadataPluginKey && !metadataDriverKey) {
      if (!currentConfig) {
        throw new Error(`Expected plugin environment config for driver "${input.environment.driver}".`);
      }
      const { plugin } = await resolvePluginDriver(currentConfig);
      return {
        plugin,
        pluginKey: currentConfig.pluginKey,
        driverKey: currentConfig.driverKey,
        driverConfig: currentConfig.driverConfig,
      };
    }

    const plugin = metadataPluginId
      ? await pluginRegistry.getById(metadataPluginId)
      : metadataPluginKey
        ? await pluginRegistry.getByKey(metadataPluginKey)
        : currentConfig
          ? await pluginRegistry.getByKey(currentConfig.pluginKey)
          : null;
    const driverKey = metadataDriverKey ?? currentConfig?.driverKey;
    const pluginKey = metadataPluginKey ?? plugin?.pluginKey ?? currentConfig?.pluginKey ?? "unknown";
    if (!driverKey) {
      throw new Error(`Plugin environment driver "${pluginKey}:unknown" is missing a driver key.`);
    }
    if (!plugin || plugin.status !== "ready") {
      throw new Error(`Plugin environment driver "${pluginKey}:${driverKey}" is not ready.`);
    }
    const declaredDriver = plugin.manifestJson.environmentDrivers?.find(
      (candidate) => candidate.driverKey === driverKey,
    );
    if (!declaredDriver) {
      throw new Error(`Plugin "${plugin.pluginKey}" does not declare environment driver "${driverKey}".`);
    }
    if (!workerManager.isRunning(plugin.id)) {
      throw new Error(`Plugin environment driver "${plugin.pluginKey}:${driverKey}" has no running worker.`);
    }

    // The current driver config is reused only if the environment still points
    // at the same plugin and driver; otherwise an empty config is passed.
    const currentConfigStillMatches =
      currentConfig?.pluginKey === plugin.pluginKey && currentConfig.driverKey === driverKey;
    return {
      plugin,
      pluginKey: plugin.pluginKey,
      driverKey,
      driverConfig: currentConfigStillMatches ? currentConfig.driverConfig : {},
    };
  }
  return {
    driver: "plugin",

    async acquireRunLease(input) {
      const parsed = parseEnvironmentDriverConfig(input.environment);
      if (parsed.driver !== "plugin") {
        throw new Error(`Expected plugin environment config for driver "${input.environment.driver}".`);
      }

      const { plugin } = await resolvePluginDriver(parsed.config);
      const providerLease = await workerManager.call(plugin.id, "environmentAcquireLease", {
        driverKey: parsed.config.driverKey,
        companyId: input.companyId,
        environmentId: input.environment.id,
        config: parsed.config.driverConfig,
        runId: input.heartbeatRunId,
        workspaceMode: input.executionWorkspaceMode ?? undefined,
      });

      return await environmentsSvc.acquireLease({
        companyId: input.companyId,
        environmentId: input.environment.id,
        executionWorkspaceId: input.executionWorkspaceId,
        issueId: input.issueId,
        heartbeatRunId: input.heartbeatRunId,
        leasePolicy: "ephemeral",
        provider: `plugin:${parsed.config.pluginKey}:${parsed.config.driverKey}`,
        providerLeaseId: providerLease.providerLeaseId,
        expiresAt: parseExpiresAt(providerLease.expiresAt),
        metadata: {
          providerMetadata: providerLease.metadata ?? {},
          driver: input.environment.driver,
          executionWorkspaceMode: input.executionWorkspaceMode,
          pluginId: plugin.id,
          pluginKey: parsed.config.pluginKey,
          driverKey: parsed.config.driverKey,
        },
      });
    },

    async releaseRunLease(input) {
      const { plugin, driverKey, driverConfig } = await resolvePluginDriverForRelease(input);
      await workerManager.call(plugin.id, "environmentReleaseLease", {
        driverKey,
        companyId: input.lease.companyId,
        environmentId: input.environment.id,
        config: driverConfig,
        providerLeaseId: input.lease.providerLeaseId,
        leaseMetadata: input.lease.metadata ?? undefined,
      });
      return await environmentsSvc.releaseLease(input.lease.id, input.status);
    },

    async resumeRunLease(input) {
      if (!input.lease.providerLeaseId) {
        throw new Error(`Plugin environment lease "${input.lease.id}" does not have a provider lease id to resume.`);
      }
      const { pluginKey, driverKey, driverConfig } = await resolvePluginDriverForRelease({
        ...input,
        status: "released",
      });
      return await resumePluginEnvironmentLease({
        db,
        workerManager,
        companyId: input.lease.companyId,
        environmentId: input.environment.id,
        config: {
          pluginKey,
          driverKey,
          driverConfig,
        },
        providerLeaseId: input.lease.providerLeaseId,
        leaseMetadata: input.lease.metadata ?? undefined,
      });
    },

    async destroyRunLease(input) {
      const { pluginKey, driverKey, driverConfig } = await resolvePluginDriverForRelease({
        ...input,
        status: "failed",
      });
      await destroyPluginEnvironmentLease({
        db,
        workerManager,
        companyId: input.lease.companyId,
        environmentId: input.environment.id,
        config: {
          pluginKey,
          driverKey,
          driverConfig,
        },
        providerLeaseId: input.lease.providerLeaseId,
        leaseMetadata: input.lease.metadata ?? undefined,
      });
      return await environmentsSvc.releaseLease(input.lease.id, "failed");
    },

    async realizeWorkspace(input) {
      const { plugin, pluginKey, driverKey, driverConfig } = await resolvePluginDriverForRelease({
        environment: input.environment,
        lease: input.lease,
        status: "released",
      });
      return await realizePluginEnvironmentWorkspace({
        db,
        workerManager,
        pluginId: plugin.id,
        config: {
          pluginKey,
          driverKey,
          driverConfig,
        },
        params: {
          driverKey,
          companyId: input.lease.companyId,
          environmentId: input.environment.id,
          config: driverConfig,
          lease: {
            providerLeaseId: input.lease.providerLeaseId,
            metadata: input.lease.metadata ?? undefined,
            expiresAt: input.lease.expiresAt?.toISOString() ?? null,
          },
          workspace: input.workspace,
        },
      });
    },

    async execute(input) {
      const { plugin, pluginKey, driverKey, driverConfig } = await resolvePluginDriverForRelease({
        environment: input.environment,
        lease: input.lease,
        status: "released",
      });
      return await executePluginEnvironmentCommand({
        db,
        workerManager,
        pluginId: plugin.id,
        config: {
          pluginKey,
          driverKey,
          driverConfig,
        },
        params: {
          driverKey,
          companyId: input.lease.companyId,
          environmentId: input.environment.id,
          config: driverConfig,
          lease: {
            providerLeaseId: input.lease.providerLeaseId,
            metadata: input.lease.metadata ?? undefined,
            expiresAt: input.lease.expiresAt?.toISOString() ?? null,
          },
          command: input.command,
          args: input.args,
          cwd: input.cwd,
          env: input.env,
          stdin: input.stdin,
          timeoutMs: input.timeoutMs,
        },
      });
    },
  };
}
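/*
Illustrative sketch, not part of this module: the key-precedence rule that the
plugin driver applies when it resolves an existing lease. This is a deliberate
simplification that leaves out the registry lookup and the readiness and worker
checks; `ReleaseTargetSketch` and `resolveReleaseKeysSketch` are hypothetical
names, and the sample keys are made up.

```typescript
interface ReleaseTargetSketch {
  pluginKey: string;
  driverKey: string;
}

// Same contract as readString above: non-empty strings only.
function readStringSketch(value: unknown): string | null {
  return typeof value === "string" && value.length > 0 ? value : null;
}

// Keys stored in lease metadata win; the environment's current config is the fallback.
function resolveReleaseKeysSketch(
  metadata: Record<string, unknown>,
  currentConfig: ReleaseTargetSketch | null,
): ReleaseTargetSketch {
  const driverKey = readStringSketch(metadata["driverKey"]) ?? currentConfig?.driverKey;
  const pluginKey = readStringSketch(metadata["pluginKey"]) ?? currentConfig?.pluginKey ?? "unknown";
  if (!driverKey) {
    throw new Error(`Plugin environment driver "${pluginKey}:unknown" is missing a driver key.`);
  }
  return { pluginKey, driverKey };
}

const editedEnvironmentSketch: ReleaseTargetSketch = { pluginKey: "new-plugin", driverKey: "new-driver" };
const fromMetadataSketch = resolveReleaseKeysSketch(
  { pluginKey: "old-plugin", driverKey: "old-driver" },
  editedEnvironmentSketch,
);
const fromCurrentConfigSketch = resolveReleaseKeysSketch({}, editedEnvironmentSketch);
console.log(fromMetadataSketch, fromCurrentConfigSketch);
```

Preferring the metadata means a lease acquired before an environment was
repointed is still handled by the driver that created it.
*/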
export function environmentRuntimeService(
  db: Db,
  options: {
    drivers?: EnvironmentRuntimeDriver[];
    pluginWorkerManager?: PluginWorkerManager;
    pluginWorkerReadyTimeoutMs?: number;
    pluginWorkerReadyPollMs?: number;
  } = {},
) {
  const environmentsSvc = environmentService(db);
  const drivers = new Map<string, EnvironmentRuntimeDriver>();
  const defaultDrivers = [
    createLocalEnvironmentDriver(db),
    createSshEnvironmentDriver(db),
    createSandboxEnvironmentDriver(db, {
      pluginWorkerManager: options.pluginWorkerManager,
      pluginWorkerReadyTimeoutMs: options.pluginWorkerReadyTimeoutMs,
      pluginWorkerReadyPollMs: options.pluginWorkerReadyPollMs,
    }),
    ...(options.pluginWorkerManager
      ? [createPluginEnvironmentDriver(db, options.pluginWorkerManager)]
      : []),
  ];
  for (const driver of options.drivers ?? defaultDrivers) {
    drivers.set(driver.driver, driver);
  }

  function getDriver(driverKey: string): EnvironmentRuntimeDriver | null {
    return drivers.get(driverKey) ?? null;
  }

  function requireDriver(environment: Pick<Environment, "driver">): EnvironmentRuntimeDriver {
    const driver = getDriver(environment.driver);
    if (!driver) {
      throw new Error(
        `Environment driver "${environment.driver}" is not registered in the environment runtime yet.`,
      );
    }
    return driver;
  }

  function requireDriverKey(driverKey: string): EnvironmentRuntimeDriver {
    const driver = getDriver(driverKey);
    if (!driver) {
      throw new Error(
        `Environment driver "${driverKey}" is not registered in the environment runtime yet.`,
      );
    }
    return driver;
  }

  return {
    getDriver,

    async acquireRunLease(input: {
      companyId: string;
      environment: Environment;
      issueId: string | null;
      heartbeatRunId: string;
      persistedExecutionWorkspace: Pick<ExecutionWorkspace, "id" | "mode"> | null;
    }): Promise<EnvironmentRuntimeLeaseRecord> {
      if (input.environment.status !== "active") {
        throw new Error(`Environment "${input.environment.name}" is not active.`);
      }
      const leaseContext = buildEnvironmentLeaseContext({
        persistedExecutionWorkspace: input.persistedExecutionWorkspace,
      });
      const driver = requireDriver(input.environment);
      const lease = await driver.acquireRunLease({
        companyId: input.companyId,
        environment: input.environment,
        issueId: input.issueId,
        heartbeatRunId: input.heartbeatRunId,
        executionWorkspaceId: leaseContext.executionWorkspaceId,
        executionWorkspaceMode: leaseContext.executionWorkspaceMode,
      });
      return {
        environment: input.environment,
        lease,
        leaseContext,
      };
    },

    async releaseRunLeases(
      heartbeatRunId: string,
      status: Extract<EnvironmentLeaseStatus, "released" | "expired" | "failed"> = "released",
    ): Promise<EnvironmentRuntimeLeaseRecord[]> {
      const leaseRows = await db
        .select()
        .from(environmentLeases)
        .where(
          and(
            eq(environmentLeases.heartbeatRunId, heartbeatRunId),
            inArray(environmentLeases.status, ["active"]),
          ),
        );
      if (leaseRows.length === 0) {
        return [];
      }
      const released: EnvironmentRuntimeLeaseRecord[] = [];
      for (const leaseRow of leaseRows) {
        const environment = await environmentsSvc.getById(leaseRow.environmentId);
        if (!environment) continue;
        const leaseSnapshot: EnvironmentLease = {
          id: leaseRow.id,
          companyId: leaseRow.companyId,
          environmentId: leaseRow.environmentId,
          executionWorkspaceId: leaseRow.executionWorkspaceId ?? null,
          issueId: leaseRow.issueId ?? null,
          heartbeatRunId: leaseRow.heartbeatRunId ?? null,
          status: leaseRow.status as EnvironmentLease["status"],
          leasePolicy: leaseRow.leasePolicy as EnvironmentLease["leasePolicy"],
          provider: leaseRow.provider ?? null,
          providerLeaseId: leaseRow.providerLeaseId ?? null,
          acquiredAt: leaseRow.acquiredAt,
          lastUsedAt: leaseRow.lastUsedAt,
          expiresAt: leaseRow.expiresAt ?? null,
          releasedAt: leaseRow.releasedAt ?? null,
          failureReason: leaseRow.failureReason ?? null,
          cleanupStatus: leaseRow.cleanupStatus as EnvironmentLease["cleanupStatus"],
          metadata: (leaseRow.metadata as Record<string, unknown> | null) ?? null,
          createdAt: leaseRow.createdAt,
          updatedAt: leaseRow.updatedAt,
        };
        const driver = getDriver(getLeaseDriverKey(leaseSnapshot, environment));
        const lease = driver
          ? await driver.releaseRunLease({
              environment,
              lease: leaseSnapshot,
              status,
            })
          : await environmentsSvc.releaseLease(leaseRow.id, status);
        if (!lease) continue;
        released.push({
          environment,
          lease,
          leaseContext: {
            executionWorkspaceId: lease.executionWorkspaceId,
            executionWorkspaceMode:
              (lease.metadata?.executionWorkspaceMode as ExecutionWorkspace["mode"] | null | undefined) ?? null,
          },
        });
      }
      return released;
    },

    async resumeRunLease(input: EnvironmentDriverLeaseInput): Promise<PluginEnvironmentLease | EnvironmentLease | null> {
      const driver = requireDriverKey(getLeaseDriverKey(input.lease, input.environment));
      if (!driver.resumeRunLease) {
        throw new Error(`Environment driver "${driver.driver}" does not support lease resume.`);
      }
      return await driver.resumeRunLease(input);
    },

    async destroyRunLease(input: EnvironmentDriverLeaseInput): Promise<EnvironmentLease | null> {
      const driver = requireDriverKey(getLeaseDriverKey(input.lease, input.environment));
      if (!driver.destroyRunLease) {
        throw new Error(`Environment driver "${driver.driver}" does not support lease destroy.`);
      }
      return await driver.destroyRunLease(input);
    },

    async realizeWorkspace(
      input: EnvironmentDriverRealizeWorkspaceInput,
    ): Promise<PluginEnvironmentRealizeWorkspaceResult> {
      const driver = requireDriverKey(getLeaseDriverKey(input.lease, input.environment));
      if (!driver.realizeWorkspace) {
        throw new Error(`Environment driver "${driver.driver}" does not support workspace realization.`);
      }
      return await driver.realizeWorkspace(input);
    },

    async execute(input: EnvironmentDriverExecuteInput): Promise<PluginEnvironmentExecuteResult> {
      const driver = requireDriverKey(getLeaseDriverKey(input.lease, input.environment));
      if (!driver.execute) {
        throw new Error(`Environment driver "${driver.driver}" does not support command execution.`);
      }
      return await driver.execute(input);
    },
  };
}

export type EnvironmentRuntimeService = ReturnType<typeof environmentRuntimeService>;