Commit graph

5 commits

Author SHA1 Message Date
Devin Foley
486fb88a15
Add Cloudflare sandbox provider plugin (#5687)
> _Stacked on top of #5685#5686. Diff against master includes commits
from earlier PRs in the stack — review focuses on the two new commits
(`Extend sandbox callback bridge for Worker-hosted plugins` + `Add
Cloudflare sandbox provider plugin`)._

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Each agent runs in a sandbox environment, and operators choose which
provider backs that sandbox — today E2B and Daytona are bundled with the
platform
> - Cloudflare Workers + Durable Objects + the Sandbox SDK offer a
credible new option: globally distributed, cheap idle, and
operator-deployable as a single Worker
> - To plug it in, Paperclip needs (a) a provider plugin that speaks the
`PaperclipPluginManifestV1` lifecycle and (b) a small operator-deployed
Worker — the **bridge** — that adapts Paperclip's runtime RPCs to the
Cloudflare Sandbox SDK
> - The plugin extends the existing sandbox-callback-bridge with a
`bridge.transport: "worker"` discriminator so the platform routes
runtime RPCs through the Worker bridge instead of the in-process runner
> - This pull request adds the plugin, the bridge Worker template, and
the supporting adapter-utils + server hooks the new transport needs
> - The benefit is that operators can run sandboxes on Cloudflare's edge
with no new platform code beyond installing the plugin and deploying the
Worker

## What Changed

**Shared support (`Extend sandbox callback bridge for Worker-hosted
plugins`):**

- `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`:
expose `expectedHostHeader` so plugin-side bridge clients can verify the
canonical request envelope before forwarding.
- `packages/adapter-utils/src/command-managed-runtime.{ts,test.ts}`:
relax the always-fresh runner construction so callers can re-use a
runner across exec calls (Worker-hosted bridges hold the runner inside a
Durable Object).
- `server/src/services/environment-runtime.ts` +
`environment-runtime.test.ts`: route Worker-hosted bridges through the
same env-shaping path as E2B and pin the `requestEnv` contract.
- `server/src/services/plugin-environment-driver.ts`: thread an optional
`issueId` through the runtime descriptor so bridges can scope leases to
the originating issue (used by Cloudflare to map a sandbox to the
issue/workflow for billing and audit).
- `packages/plugins/sdk/src/protocol.ts`: add `issueId?` to
`PluginEnvironmentDriverBaseParams` and the new `bridge.transport:
"worker"` discriminator that the new plugin declares.
- `server/__tests__/heartbeat-plugin-environment.test.ts`: pin the
heartbeat path against the new runtime descriptor.

**The Cloudflare plugin itself (`Add Cloudflare sandbox provider
plugin`):**

- `packages/plugins/sandbox-providers/cloudflare/`: plugin entry,
manifest, plugin runtime (lifecycle + bridge client), config parsing,
and Vitest coverage. Manifest declares `bridge.transport: "worker"` so
the platform routes runtime RPCs through the bridge client.
- `bridge-template/`: a Worker template the operator deploys with
`wrangler`. Owns Durable Object-backed sessions (`sessions.ts`),
exec/stream routes (`exec.ts`, `routes.ts`), and an HMAC auth layer
(`auth.ts`) that pins the `Host` header surface. Includes the
SDK-contract-correct exec implementation, lease recovery, and chunked
stdout/stderr streaming.
- Tests cover lease/session handoff (`bridge-template/src/exec.test.ts`,
`routes.test.ts`), bridge client request shaping
(`src/bridge-client.test.ts`), and end-to-end plugin behavior
(`src/plugin.test.ts`) including streamed exec output. 27 tests in
total.
- `README.md` walks the operator through deploying the bridge Worker,
registering the plugin, and configuring the runtime.

## Verification

- `pnpm typecheck`
- `pnpm exec vitest run --no-coverage
packages/adapter-utils/src/sandbox-callback-bridge.test.ts
packages/adapter-utils/src/command-managed-runtime.test.ts
server/src/__tests__/environment-runtime.test.ts
server/src/__tests__/heartbeat-plugin-environment.test.ts`
- `(cd packages/plugins/sandbox-providers/cloudflare && pnpm test)` — 27
passing

For an operator-side smoke test:

1. Deploy the bridge: `cd
packages/plugins/sandbox-providers/cloudflare/bridge-template &&
wrangler deploy`
2. Register the plugin in your Paperclip instance, point its bridge URL
at the deployed Worker, set the HMAC shared secret.
3. Create a sandbox environment whose provider is `cloudflare`, then run
a Codex or Claude job against it.

## Risks

- Adds a new `bridge.transport: "worker"` code path, but the existing
E2B / Daytona transports go through the same shaped helpers and have
explicit test coverage that pins their behavior unchanged.
- The Worker bridge stores session state in a Durable Object; operator
instances must be aware of the corresponding Cloudflare costs (DO
requests, storage). Documented in the README.
- The `issueId` plumbing is optional throughout — existing plugins that
don't supply it continue to work.

## Model Used

- Provider: Anthropic
- Model: Claude Opus 4.7 (1M context)
- Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A, no UI change
- [x] I have updated relevant documentation to reflect my changes
(plugin README, bridge-template README)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 07:33:13 -07:00
Devin Foley
534aee66ae
Add cursor_cloud adapter for Cursor SDK + Cloud Agents API v1 (#5664)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - There are many adapter types, one per agent-runtime product (Claude,
Codex, OpenCode, Cursor local CLI, etc.)
> - Cursor shipped a public TypeScript SDK on 2026-04-29 that exposes
Cursor's full hosted-agent platform (cloud VMs, harness, MCP, skills,
hooks)
> - Paperclip had no first-class adapter for this — agents that wanted
to use Cursor's managed cloud runtime had to fall back to the local CLI
adapter, which loses the cloud session, streaming, and durable run model
> - This PR adds a new `cursor_cloud` adapter built directly on
`@cursor/sdk`, with Paperclip's heartbeat mapped to Cursor's
durable-agent + per-run model
> - The benefit is that any Paperclip agent can now drive a Cursor cloud
agent across heartbeats with native session reuse, streaming, and
cancellation, while Paperclip remains the source of truth for issue/task
state

## What Changed

- New built-in adapter package `packages/adapters/cursor-cloud` (15
files, ~1.7k LOC) backed by `@cursor/sdk` ^1.0.12
- `src/server/execute.ts` — SDK-first lifecycle: `Agent.create` /
`Agent.resume` / `Agent.getRun` / `agent.send` / `run.stream` /
`run.wait`, with session reuse keyed on the (runtime env type, env name,
repo set) tuple
- `src/server/session.ts` — codec for `cursorAgentId` + `latestRunId` +
repo metadata, persisted in `runtime.sessionParams`
- `src/server/test.ts` — environment probe via `Cursor.me()` and
optional model validation via `Cursor.models.list()`
- `src/ui/parse-stdout.ts` + `src/cli/format-event.ts` — normalize
Cursor SDK message types (`status`, `thinking`, `assistant`, `user`,
`tool_call`, `tool_result`, `result`) into Paperclip transcript events
for the UI and CLI
- Registrations: `packages/shared/src/constants.ts`,
`packages/adapter-utils/src/session-compaction.ts`,
`server/src/adapters/{registry,builtin-adapter-types}.ts`,
`ui/src/adapters/{registry,adapter-display-registry}.ts` +
`ui/src/adapters/cursor-cloud/index.ts`, `cli/src/adapters/registry.ts`,
plus workspace deps in `cli`/`server`/`ui` `package.json`
- `ui/src/components/AgentConfigForm.tsx` — hide local-Cursor
`mode`/thinking-effort field for `cursor_cloud` (different config
surface)
- 11 vitest tests covering execute paths (fresh create, matching-resume,
active-run reattach, non-finished result), session codec round-trip,
transcript parsing, and config building

## Verification

Reviewer steps:

```bash
pnpm install
pnpm --filter @paperclipai/adapter-cursor-cloud typecheck   # → clean
pnpm vitest run packages/adapters/cursor-cloud              # → 11/11 passing
```

End-to-end check against a real Cursor cloud agent (requires
`CURSOR_API_KEY` and Cursor GitHub-app install on the target repo):

1. Create a `cursor_cloud` agent in Paperclip with `repoUrl` set to the
test repo, `repoStartingRef: main`, and `env.CURSOR_API_KEY` set
2. Trigger a heartbeat → adapter calls `Agent.create({ cloud: { env: {
type: "cloud" }, repos: [...] } })`, streams events, terminates on
`finished`
3. Trigger a second heartbeat → adapter calls `Agent.resume` or
`agent.send` follow-up depending on prior-run state, reusing
`cursorAgentId`
4. The Paperclip UI/CLI transcript reflects Cursor `status` / `thinking`
/ `assistant` events as they stream
5. Cancellation from Paperclip maps to `run.cancel()` or Cloud API v1
`cancelRun` for cross-heartbeat cancellation

A direct-SDK smoke run against a real repo (devinfoley/my_test_project @
main) confirmed: `Cursor.me()` ok → `Agent.create` → `agent.send` →
`run.stream()` (30 events) → terminal status `finished` in ~11s.

## Risks

- **New adapter, additive only.** No existing adapter or registry is
replaced; current `cursor` local-CLI adapter is untouched. Default
behavior of any existing agent is unchanged.
- **External dependency on `@cursor/sdk`.** Cursor's SDK is v1.0.x and
may evolve. Mocked unit tests cover the public surface used here; if the
SDK breaks compatibility we update the adapter independently.
- **Cost/budget.** `cursor_cloud` runs on Cursor's billed cloud VMs;
operators must understand they are spending money outside Paperclip's
budget controls when they enable this adapter. Same shape as other
API-billed adapters.
- **No webhook support in V1.** The SDK already provides
stream/wait/cancel/reattach, so V1 does not require a public callback
URL. If a future use case needs out-of-band wakes, we add a Cloud API v1
webhook bridge as a separate change. This is called out in the issue
plan document.
- **Lockfile.** Per repo policy, `pnpm-lock.yaml` is intentionally not
in this PR — CI's lockfile workflow will update it on merge given the
manifest changes.

## Model Used

- Provider: Anthropic Claude (via Claude Code / Paperclip `claude_local`
adapter)
- Model: `claude-opus-4-7` (Claude Opus 4.7), knowledge cutoff January
2026
- Mode: standard tool-use with extended reasoning
- Context: ~200k token window
- Capabilities used: code generation, multi-file edits, shell/test
execution, GitHub PR workflow

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass (11/11 in
`packages/adapters/cursor-cloud`)
- [x] I have added or updated tests where applicable (4 new test files,
11 cases)
- [ ] If this change affects the UI, I have included before/after
screenshots (the only UI change is hiding the local-Cursor mode field on
the `cursor_cloud` adapter — happy to attach a screenshot if the
reviewer wants one)
- [x] I have updated relevant documentation to reflect my changes (issue
plan document supersedes the pre-SDK design; tracked in PAPA-203)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-10 17:21:04 -07:00
Devin Foley
433dfed33d
Enable CI publish for plugin-daytona (#5586)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The release pipeline gates new public packages behind a bootstrap
policy: `scripts/check-release-package-bootstrap.mjs` requires every
package marked `publishFromCi: true` in
`scripts/release-package-manifest.json` to already exist on npm
> - PR #5580 added the new Daytona sandbox provider plugin but had to
land with `publishFromCi: false` because the package had never been
published, so CI's release plan would have failed bootstrap validation
otherwise
> - Now that `@paperclipai/plugin-daytona` has been bootstrap-published
to npm by hand, the temporary `false` flag is the only thing keeping it
out of the standard CI publish flow
> - This pull request flips the Daytona entry to `publishFromCi: true`,
matching every other release-enabled package in the manifest
> - The benefit is that future tagged releases will publish the Daytona
plugin automatically alongside the rest of the monorepo's public
packages

## What Changed

- Single-line flip in `scripts/release-package-manifest.json`:
`@paperclipai/plugin-daytona` is now `publishFromCi: true`

## Verification

- `node ./scripts/release-package-map.mjs check` → `Release package
manifest OK: 19 enabled for CI publish, 0 disabled pending bootstrap`
(was 18 + 1)
- `node ./scripts/check-release-package-bootstrap.mjs
scripts/release-package-manifest.json` against `origin/master` →
`Release bootstrap OK for changed manifests:
@paperclipai/plugin-daytona`, confirming npm sees the
bootstrap-published package
- No code changes; no tests required beyond the existing manifest
validators

## Risks

- Low risk. Only effect is that the next release run will include
`@paperclipai/plugin-daytona` in its publish set
- If the npm bootstrap was incomplete, CI's bootstrap check will fail
loudly before any release tag goes out — same safety net the policy is
designed to provide

## Model Used

- Claude Opus 4.7 (`claude-opus-4-7`), extended thinking, tool use
enabled

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [ ] I have added or updated tests where applicable (N/A —
manifest-only flag flip, covered by existing validators)
- [ ] If this change affects the UI, I have included before/after
screenshots (N/A — release config)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-09 16:58:35 -07:00
Devin Foley
06e6ee25cd
Add Daytona sandbox provider plugin (#5580)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents need isolated sandbox environments to execute work safely;
Paperclip already supports E2B as a sandbox provider plugin
> - Users want to use Daytona (https://www.daytona.io/) as an
alternative sandbox backend, but no plugin existed for it
> - Without a Daytona plugin, teams that prefer Daytona's
pricing/regions/runtime can't run Paperclip agents on it
> - This pull request adds a `@paperclip/sandbox-provider-daytona`
plugin that mirrors the existing E2B plugin shape and wires up Daytona's
`@daytonaio/sdk` for sandbox lifecycle, command execution, and shell
detection
> - The benefit is that operators can pick Daytona as a first-class
sandbox provider without touching core code, broadening Paperclip's
runtime options

## What Changed

- New plugin package `packages/plugins/sandbox-providers/daytona` with
manifest, worker entry, and provider implementation backed by
`@daytonaio/sdk`
- Implements sandbox create/destroy/exec/upload/download lifecycle,
shell command detection, and config/env wiring consistent with the E2B
plugin
- Adds unit tests under `src/plugin.test.ts` and a README documenting
setup and the `DAYTONA_API_KEY` requirement
- Minor adjustments in `scripts/paperclip-issue-update.sh`,
`packages/shared/src/issue-thread-interactions.test.ts`, and
`packages/shared/src/validators/issue.ts` to support the integration

## Verification

- Re-ran the full sandbox provider matrix on the QA Paperclip instance
using Daytona as the runtime — all 6 adapters executed inside the
Daytona sandbox with zero `environmentExecute` timeouts
- 5/6 adapters pass cleanly (or with informational warns); the only
failure is `codex_local`, which is an OpenAI quota/billing issue
unrelated to Daytona
- `pnpm --filter @paperclip/sandbox-provider-daytona test` runs the
plugin unit tests

## Risks

- New optional plugin; no behavior change for users who don't enable it
- Requires `DAYTONA_API_KEY` for runtime use — documented in the plugin
README
- Daytona SDK is a new external dependency; tracked in the plugin's own
package.json so it doesn't affect the core install footprint

## Model Used

- Claude Opus 4.7 (`claude-opus-4-7`), extended thinking, tool use
enabled

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots (N/A — backend plugin)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-09 11:50:12 -07:00
Devin Foley
29401b231b
fix(ci): gate new release packages on npm bootstrap (#5146)
## Thinking Path

> - Paperclip is a control plane for autonomous agent companies, so its
release automation is part of the core operator trust boundary.
> - The affected subsystem is npm/GitHub Actions release publishing for
the public monorepo packages.
> - The concrete failure was that a newly added package reached
`master`, the canary workflow attempted its first publish, and npm
trusted publishing was not yet bootstrapped for that package.
> - That means the problem is not just one broken run; it is a missing
pre-merge guard that lets release-ineligible packages land and only fail
once `publish_canary` runs.
> - This pull request makes release enrollment explicit, validates that
enrollment in CI, and adds a PR-time bootstrap check against npm for
changed release-enabled package manifests.
> - The result is that we keep trusted publishing, avoid teaching CI to
`npm adduser`, and move this class of failure from post-merge canary
time to pre-merge review time.

## What Changed

- Added `scripts/release-package-manifest.json` so release-managed
public packages are explicitly enrolled instead of being inferred from
every non-private workspace package.
- Hardened `scripts/release-package-map.mjs` to validate the manifest
before release workflows rewrite versions or assemble publish payloads.
- Added `scripts/check-release-package-bootstrap.mjs` and wired it into
`.github/workflows/pr.yml` so PRs that change a release-enabled package
manifest fail if that package does not already exist on npm.
- Added release-package manifest coverage tests to
`scripts/release-package-map.test.mjs` and included them in `pnpm run
test:release-registry`.
- Wired manifest validation into `.github/workflows/release.yml` and
documented the first-publish bootstrap policy in `doc/PUBLISHING.md` and
`doc/RELEASE-AUTOMATION-SETUP.md`.

## Verification

- `pnpm run test:release-registry`
- `./scripts/release.sh canary --skip-verify --dry-run`
- Confirmed the committed diff contains no obvious PII/secrets via
targeted pattern scan before pushing.

## Risks

- Low risk overall: this is CI/release-policy code, not product runtime
logic.
- The new PR bootstrap check depends on npm metadata availability, so a
transient npm outage could block a PR that changes a release-enabled
package manifest.
- The manifest introduces a new source of truth that must stay aligned
with public package additions, but that is intentional and now enforced.

## Model Used

- OpenAI Codex via the `codex_local` Paperclip adapter; GPT-5-based
coding agent with tool use, terminal execution, git, and GitHub CLI.
Exact served model ID/context window are not exposed by the local
runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 19:31:28 -07:00