Docker / build-and-push (push) Failing after 3m41s
Refresh Lockfile / refresh (push) Failing after 5m12s
Release / verify_canary (push) Failing after 10m53s
Release / verify_stable (push) Has been skipped
Release / publish_canary (push) Has been skipped
Release / preview_stable (push) Has been skipped
Release / publish_stable (push) Has been skipped
Fixes#2391Fixes#3394Fixes#4094Fixes#5501Fixes#5916Fixes#6215Fixes#6514
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Plugins extend the platform by registering agent-callable tools
backed by long-running worker processes
> - `PluginToolDispatcher` is the boundary between the HTTP
`/api/plugins/tools/execute` route and `PluginWorkerManager`, which owns
those worker processes
> - `PluginWorkerManager` keys live workers by the plugin's **database
UUID**, but `plugin-loader` was registering tools using only `pluginKey`
— so every tool call did `workerManager.isRunning(pluginKey)` and always
got `false`
> - As a result, every `POST /api/plugins/tools/execute` against a
tool-exposing plugin returned 502 `worker for plugin X is not running`,
even though the worker process was alive (hit in production by
`vexion.council-chat`; `mem0-sync` would be next)
> - This pull request threads the DB UUID through the dispatcher →
registry hop and hardens the contract so omitting the UUID is a
compile-time error, not a silent fallback
> - The benefit is plugin tool execution actually works for any plugin
declaring `manifest.tools[]`, and the type system prevents the same bug
from recurring
## What Changed
- `server/src/services/plugin-loader.ts` — pass in-scope `pluginId` (DB
UUID) as the third argument to `toolDispatcher.registerPluginTools`.
Single-line root fix.
- `server/src/services/plugin-tool-dispatcher.ts` —
`registerPluginTools` now takes `pluginDbId: string` (required, was
optional). JSDoc updated to document the worker-routing contract and why
the optional signature masked the bug.
- `server/src/services/plugin-tool-registry.ts` — `registerPlugin`
throws on missing/empty `pluginDbId` so any new call site that forgets
the UUID fails immediately rather than silently falling back to
`pluginKey`.
- `server/src/__tests__/plugin-tool-dispatcher-pluginDbId.test.ts` — new
focused regression suite covering the activation path, disable→enable
lifecycle, worker re-spawn, and the empty-UUID guard.
## Verification
- `pnpm vitest run
server/src/__tests__/plugin-tool-dispatcher-pluginDbId.test.ts` — 6/6
passing.
- `pnpm vitest run server/src/__tests__/plugin-database.test.ts
server/src/__tests__/plugin-routes-authz.test.ts
server/src/__tests__/plugin-lifecycle-restart.test.ts` — 48/48 passing
on the merge commit.
- `pnpm --filter @paperclipai/server typecheck` — no new errors
introduced by these files.
- Manual repro path:
1. Install a plugin that declares `manifest.tools[]` and uses
`runWorker`.
2. Confirm status `ready` and a live worker (`paperclipai plugin
diagnostics <key>`).
3. `POST /api/plugins/tools/execute` with `{ tool:
"<pluginKey>:<toolName>", parameters, runContext }`.
4. Pre-fix: HTTP 502, `worker for plugin <key> is not running`.
Post-fix: tool dispatches normally.
## Risks
- Low risk. The signature tightening (`pluginDbId?` → `pluginDbId`) is a
back-compatible behavioral fix at the only production call site
(`plugin-loader`), which already had the UUID in scope.
- Test/recovery paths that previously omitted the UUID must now supply
it; the new error message identifies the missing arg explicitly.
- No database migration, no API/schema change, no plugin-author-facing
change.
- The merge commit pulls master into the PR branch additively (no
rebase); reviewers can read the fix commits independently of the merge.
## Model Used
- Provider/model: Anthropic Claude (Opus 4.7, `claude-opus-4-7`) for the
additive merge-conflict resolution, PR description rewrite, and Greptile
follow-up; original fix authored by
[@Ramon-nassa](https://github.com/Ramon-nassa).
- Capabilities used: tool use (file edit, shell, GitHub CLI), extended
thinking off, no code execution by the model.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots (N/A — server-only change)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---
## Original Summary (preserved from contributor)
`plugin-loader` activates plugins and calls
```ts
toolDispatcher.registerPluginTools(pluginKey, manifest)
```
with only two args. `PluginToolDispatcher.registerPluginTools` forwards
them to `registry.registerPlugin(pluginKey, manifest)`. The registry
falls back `pluginDbId ?? pluginKey`, but `PluginWorkerManager` keys
live workers by the DB UUID — so the downstream
```ts
workerManager.isRunning(pluginKey) // always false
```
causes every `POST /api/plugins/tools/execute` to fail with `worker for
plugin X is not running`, even when the worker process is alive and
healthy. **This hits every plugin that exposes tools** (we hit it in
`vexion.council-chat`; `mem0-sync` would too).
Reported-by: Vexion / Ramon Nassar (vexion.council-chat plugin, MO-068).
---------
Co-authored-by: ramon nassar <ramon@tabs.co>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Devin Foley <devin@devinfoley.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies, including
how we ship the public `paperclip` repo itself
> - The `PR` and `commitperclip PR Review` workflows are the CI gating
layer that decides whether any pull request — human or bot — can be
merged to `master`
> - Dependabot opens dependency PRs that always carry a `pnpm-lock.yaml`
diff and an auto-generated PR body, but our `policy` job hard-fails any
non-`chore/refresh-lockfile` lockfile change, and our `commitperclip`
quality gate requires a Thinking-Path / What-Changed / Verification /
Risks / Model template Dependabot can't produce
> - Because `policy` fails first, every downstream lane (`Build`,
`Typecheck + Release Registry`, `General tests`, `Verify serialized
server`, `Canary Dry Run`, `e2e`, and the required `verify` check) skips
and `verify` fails — so we never see whether the upgrade is actually
safe
> - Socket.dev (PR Alerts + Project Report) and Snyk already run on
every dependency PR and are the supply-chain compensating control
against malicious upgrades; the missing piece is just letting our own
build/test signal run so a human can merge with confidence
> - This pull request adds a narrow Dependabot bypass to the two gates
that block on lockfile diffs and PR-template prose, while leaving every
other policy and security check active
> - The benefit is that Dependabot PRs like #7331 will now run the full
PR matrix, giving reviewers real evidence to approve or reject — without
weakening any check that targets supply-chain or build-correctness risk
## What Changed
- `.github/workflows/pr.yml` — extended the existing
`chore/refresh-lockfile` bypass on the `policy` job's "Block manual
lockfile edits" step to also skip when `github.actor ==
'dependabot[bot]'`. Every other policy step (Dockerfile deps stage
validation, `no-git-push` enforcement, release-package map check,
release bootstrap, manifest-driven `pnpm install --lockfile-only`
resolution) keeps running on Dependabot PRs.
- `.github/workflows/commitperclip-review.yml` — gated the `Run quality
gates` step and the dependent `Fail if quality gates failed` step on
`github.event.pull_request.user.login != 'dependabot[bot]'`. `Run
security gates` (`check-pr-security.mjs`) stays unconditional so
supply-chain visibility into Dependabot lockfile churn is preserved.
No changes to `.github/scripts/*.mjs` — keeping the bypass at the
workflow level avoids churning unit-tested code.
## Verification
- CI on this PR: `policy` should pass and the downstream lanes (`Build`,
`Typecheck + Release Registry`, `General tests`, `Verify serialized
server`, `Canary Dry Run`, `e2e`, `verify`) should all run normally
(this PR isn't from Dependabot, so the bypass condition is false —
proves we didn't accidentally widen the exemption).
- After merge, ask Dependabot to rebase #7331 (`@dependabot rebase`) and
confirm:
- `PR / policy` → `success` (lockfile step now `skipped`, other policy
steps `success`)
- `PR / Build`, `PR / Typecheck + Release Registry`, `PR / General tests
(server|workspaces-a|workspaces-b)`, `PR / Verify serialized server
(1/4..4/4)`, `PR / Canary Dry Run`, `PR / e2e` → all execute (none
`skipped`)
- `PR / verify` → `success` once the matrix passes
- `commitperclip PR Review / review` → `success` (quality-gates steps
`skipped` for Dependabot; security gates ran)
- Socket and Snyk checks unchanged
- Local sanity-check: `git diff origin/master..HEAD` shows only the two
workflow files, 7 added / 2 removed lines.
## Risks
- **Auto-merging a poisoned dep.** Mitigated by Socket.dev + Snyk +
human merge approval. This change only affects CI gating, not who clicks
"Merge".
- **Spoofing `github.actor` as `dependabot[bot]`.** GitHub sets
`github.actor` from the push actor; spoofing requires a compromised
Dependabot install token, which is the same threat model that already
lets an attacker push anything to a Dependabot-controlled branch — not a
new risk surface.
- **Policy "Validate dependency resolution when manifests change" step
running `pnpm install --lockfile-only --no-frozen-lockfile` on a
Dependabot lockfile.** That step intentionally uses `--lockfile-only`,
so it only verifies the manifest resolves and does not push or commit
the result. Existing behavior is unchanged.
- Low overall: the diff is two workflow-level `if:` conditions in steps
that already had bypasses.
## Model Used
- Provider: Anthropic Claude (via Claude Code in the Paperclip executor)
- Model ID: claude-opus-4-7
- Context window: 200K
- Reasoning mode: standard tool-use; no extended thinking required for
this change
- Capabilities used: file edit, bash, GraphQL/REST API calls
- Plan was drafted, approved by board, and split into child issues
before implementation; see
[PAPA-490](https://paperclip.ing/PAPA/issues/PAPA-490) for the planning
thread.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass (this change is
workflow-only — no code under test; lint via `yamllint` clean)
- [x] I have added or updated tests where applicable (workflow gating;
no script changes, no unit-testable surface)
- [x] If this change affects the UI, I have included before/after
screenshots (no UI changes)
- [x] I have updated relevant documentation to reflect my changes (no
docs reference these gates)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
Reverts paperclipai/paperclip#7423
Decided to keep this in place so we can automate issue reproduction in
the future. We all make mistakes. Even me, if you can believe it.
## Thinking Path
> - Paperclip is a control plane for AI-agent companies, with the CLI
acting as a scriptable operator and agent interface to that control
plane.
> - The REST API surface has grown across companies, agents, issues,
routines, plugins, auth, workspaces, secrets, and operational inspection
commands.
> - The CLI had drifted from that API surface: some commands were
missing, some command shapes differed from docs/reference material, and
several edge cases only failed during end-to-end local-source testing.
> - The local development runbook requires these tests to be disposable
and isolated from a real `~/.paperclip`, `~/.codex`, or `~/.claude`
installation.
> - This pull request adds broad CLI/API parity coverage, fixes the
actionable bugs found during that pass, and records the reproducible
test log under `doc/logs`.
> - The benefit is a more complete, scriptable CLI surface with
regression coverage for the command families exercised by the parity
run.
## What Changed
- Added or expanded CLI command coverage for access/auth, companies,
agents, projects, goals, issues and subresources, routines, plugins,
workspaces, activity/run/cost/dashboard inspection, assets, skills,
secrets, tokens, prompt/wake flows, and local setup helpers.
- Fixed CLI/API parity bugs found during the run, including context
profile patching, issue interaction optional payloads, malformed
tree-hold errors, environment duplicate handling, configure
invalid-section exit codes, worktree pnpm invocation, token agent ID
resolution, plugin tool worker lookup, and routine webhook secret
cleanup.
- Added missing CLI wrappers and route coverage for health/access,
invite resolution URL forwarding, join status normalization, secret
lifecycle commands, LLM docs routes, available-skill isolation, positive
board-claim coverage, and interactive `connect` prompt-flow tests.
- Added a schema-backed `/api/openapi.json` route sufficient for CLI
parity and `paperclipai openapi --json` smoke coverage.
- Added `doc/logs/2026-05-24-cli-api-parity-e2e-log.md` with the
detailed living test/bug log and renamed the log directory from
`doc/bugs` to `doc/logs`.
- Added `doc/plans/2026-05-23-cli-api-parity.md` and the OpenAPI parity
reference used during the pass.
OpenAPI note: this PR intentionally does not try to subsume
`feature/openapi-spec`. The OpenAPI implementation here is schema-backed
and better than the earlier route-inventory stub, but
`feature/openapi-spec` is the fuller/better OpenAPI branch because it
includes exact mounted-route coverage tests and additional current route
coverage. That branch should stay as its own PR and can supersede this
OpenAPI route implementation.
## Verification
Targeted automated checks run:
- `pnpm exec vitest run server/src/__tests__/openapi-routes.test.ts`
- `pnpm exec vitest run server/src/__tests__/board-claim.test.ts`
- `pnpm exec vitest run cli/src/__tests__/connect.test.ts`
- `pnpm exec vitest run cli/src/__tests__/agent-lifecycle.test.ts`
- `pnpm exec vitest run server/src/__tests__/plugin-database.test.ts`
- `pnpm exec vitest run server/src/__tests__/routines-service.test.ts`
- `pnpm --dir cli typecheck`
- `pnpm --dir server typecheck`
Manual/local E2E verification:
- Ran the full disposable local-source CLI/API parity pass with isolated
`PAPERCLIP_HOME`, `PAPERCLIP_CONFIG`, `PAPERCLIP_CONTEXT`,
`PAPERCLIP_AUTH_STORE`, `CODEX_HOME`, and `CLAUDE_HOME` under
`tmp/cli-api-parity`.
- Verified `DATABASE_URL` and `DATABASE_MIGRATION_URL` stayed unset for
the scratch server.
- Verified live health and schema-backed OpenAPI responses on
non-default port `3197`.
- Revoked created board/agent tokens and cleaned up temporary plugins,
secrets, non-default environments, and project workspaces.
- See `doc/logs/2026-05-24-cli-api-parity-e2e-log.md` for the full
command-by-command reproduction log.
Not run:
- Full `pnpm test`, `pnpm test:run`, or `pnpm build` were not run after
the entire branch because the branch is broad and the parity pass used
focused test/typecheck verification plus live isolated CLI reruns.
## Risks
- This is a broad PR and touches many CLI command modules, so review
surface is high. The changes are grouped around one theme, but a split
may be easier if maintainers prefer narrower PRs.
- The OpenAPI route in this PR is not the final/best OpenAPI
implementation. `feature/openapi-spec` has stronger exact-route coverage
and should remain the source for the dedicated OpenAPI PR.
- The living log is intentionally detailed and large. It is useful for
reproducibility but adds documentation weight.
- No UI changes are intended; screenshots are not applicable.
> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.
## Model Used
- OpenAI Codex, GPT-5-based coding agent in Codex desktop. Exact served
model/context-window identifier was not exposed in the local app. Work
used shell/Git/GitHub CLI tooling, local source inspection, targeted
test execution, and live isolated Paperclip CLI/API smoke testing.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---------
Co-authored-by: Devin Foley <devin@devinfoley.com>
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The `commitperclip` PR quality gates enforce hygiene on every PR
before merge
> - One of those gates required PRs to link to a tracking issue, which
adds friction for small/internal changes that don't need a tracker entry
> - The repository owner decided the linked-issue requirement is no
longer the right default
> - This pull request removes the linked-issue gate (the script, its
tests, and the orchestrator wiring)
> - The benefit is fewer false-failing PR checks and one less mandatory
authoring step
## What Changed
- Deleted `.github/scripts/check-pr-linked-issue.mjs`
- Deleted `.github/scripts/tests/check-pr-linked-issue.test.mjs`
- Removed the `checkLinkedIssue` import, the
`Promise.resolve(checkLinkedIssue(...))` entry in the `Promise.all`
block, the `issueResult` destructured binding, and the
`...issueResult.failures` spread from
`.github/scripts/run-quality-gates.mjs`
## Verification
- `node --test .github/scripts/tests/*.test.mjs` — 72/72 tests pass
across the remaining 4 gate suites
- `git grep -n
'check-pr-linked-issue\|checkLinkedIssue\|check-pr-linked'` — no matches
- Inspected `run-quality-gates.mjs` — no orphaned `issueResult`
references
## Risks
- Low risk. Pure removal of one optional gate; the
`.github/workflows/commitperclip-review.yml` workflow only invokes the
orchestrator and needs no changes. PR template and `CONTRIBUTING.md` do
not mention linked issues, so no docs change is required.
## Model Used
- Claude (Anthropic), `claude-opus-4-7`, extended-thinking mode, tool
use enabled
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable (existing
gate-suite tests still pass; removed gate's tests deleted with it)
- [x] If this change affects the UI, I have included before/after
screenshots (n/a — CI script change)
- [x] I have updated relevant documentation to reflect my changes (no
docs reference the removed gate)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
Co-authored-by: Paperclip <noreply@paperclip.ing>
## Thinking Path
> - Paperclip orchestrates AI agents in a control plane backed by
Postgres + drizzle-orm
> - `listComments` is the cursor-paginated comment listing on the issues
service; the cursor branch uses Drizzle's `gt`/`lt`/`eq` against
`issueComments.createdAt`
> - On postgres.js v3.4.8, passing a `Date` instance through the
comparison helpers triggers `TypeError [ERR_INVALID_ARG_TYPE]: The
"string" argument must be of type string or an instance of Buffer or
ArrayBuffer. Received an instance of Date`
> - The driver's binding path expects a Date constructed via the
standard runtime, but drizzle's `select` returns instances that don't
satisfy that check in this version
> - This PR coerces `anchor.createdAt` through `toISOString()` → `new
Date(...)` so the comparison helpers always receive a binding-safe Date,
then folds in a follow-up that hoists the Date into a single allocation
reused across all four `gt`/`lt`/`eq` call sites
> - The benefit is `listComments` cursor pagination stops 500-ing on Pg
v3.4.8 with one Date allocation per call instead of four, exercised by
both ascending and descending cursor tests
## What Changed
- `server/src/services/issues.ts` — coerce `anchor.createdAt` to a
binding-safe `Date` once and reuse the same instance across all four
cursor comparisons (`gt` / `lt` / `eq`)
- `server/src/__tests__/issues-service.test.ts` — add an
ascending-cursor sibling test so both `gt` and `lt` cursor paths are
exercised; the existing descending test continues to pass
## Verification
```bash
# Both cursor branches
pnpm --filter @paperclipai/server exec vitest run \
src/__tests__/issues-service.test.ts -t "anchor comment"
# → 2 passed, 41 skipped
# Production smoke
curl -s "$PAPERCLIP_API_URL/api/issues/<issueId>/comments?after=<commentId>&order=asc" \
-H "Authorization: Bearer $PAPERCLIP_API_KEY"
# Expect: JSON array, no 500 TypeError
```
## Risks
- Low risk. Pure cursor-pagination internals in `listComments`; no
schema, migration, or external contract changes
- Drizzle's `gt`/`lt`/`eq` continue to receive a `Date` for the
timestamp column, producing the same bound parameter as before
- Behavioural surface is exercised by ascending + descending cursor
tests against a real Postgres test database
## Model Used
- Claude Opus 4.7 (`claude-opus-4-7`), no extended-thinking mode, used
for the hoist+test follow-up commit
## Fixes
Closes#2612, Closes#3661, Closes#3830
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — n/a, backend-only
- [x] I have updated relevant documentation to reflect my changes — n/a,
no docs touched
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---------
Co-authored-by: Elena Voronova <elena@paperclip.ing>
Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Devin Foley <devin@devinfoley.com>
Fixes#6470
## Thinking Path
> - Paperclip is an open-source AI agent platform receiving a high
volume of community PRs — currently 2,398 open
> - The contributor experience is broken: PRs sit for months with no
feedback, contributors don't know why they're stuck, and maintainers
spend review time on PRs that are missing basics
> - Common problems: no linked issue, no test coverage, incomplete PR
template, manually-edited lockfile — all catchable before human review
> - At the same time, accepting untrusted PRs from unknown contributors
is a real attack surface: malicious packages, secret injection,
tampering with CI scripts, and code touching the sensitive paths from
the April security advisories
> - This PR adds automated gates that run on every PR: quality failures
get a clear comment telling contributors exactly what to fix, security
concerns are silently flagged as draft advisories and block merge via a
pending check run
> - The benefit is a dramatically faster feedback loop for good-faith
contributors and a meaningful security layer for the maintainers
reviewing them
## What Changed
- **`.github/workflows/commitperclip-review.yml`** — new workflow using
`pull_request_target` (runs in base branch context, has secrets, never
executes PR code). Runs quality gates + security gates on every PR
open/update.
- **`.github/dependabot.yml`** — weekly automated dependency
vulnerability PRs for npm and GitHub Actions.
- **`.github/scripts/get-bot-token.mjs`** — generates a short-lived
commitperclip installation token from `COMMITPERCLIP_KEY` secret.
- **`.github/scripts/run-quality-gates.mjs`** — orchestrates 5 quality
gates, posts/updates a single consolidated comment on the PR.
- **`.github/scripts/check-pr-template.mjs`** — validates all 5 required
template sections, Thinking Path depth (≥3 sentences), Model Used not
placeholder.
- **`.github/scripts/check-pr-linked-issue.mjs`** — requires `Fixes
#NNN` or issue URL in PR body.
- **`.github/scripts/check-pr-test-coverage.mjs`** — requires at least
one test file in the diff.
- **`.github/scripts/check-pr-lockfile.mjs`** — blocks manual
`pnpm-lock.yaml` edits (only the refresh bot may change it).
- **`.github/scripts/check-pr-dependencies.mjs`** — informational
comment when new npm packages are added.
- **`.github/scripts/check-pr-security.mjs`** — 6 silent security
checks: secret patterns, CI workflow tampering, build script changes,
supply chain (new packages in lockfile), suspicious test patterns
(outbound network/shell exec/env var reads), and changes to the 9
sensitive path prefixes from the April advisories. When any fire:
creates a draft security advisory + sets `security-review` check to
`in_progress` (blocks merge). When clean: sets `security-review` to
`success`.
- **`actions/dependency-review-action@v4`** — per-PR dependency
vulnerability check (fails if new dep has known CVE).
- **44 unit tests** across all gate modules (`node:test`, no external
deps).
## Verification
Run all unit tests locally:
```bash
node --test .github/scripts/tests/*.test.mjs
```
Expected: 44 pass, 0 fail.
End-to-end: open a PR missing the template, linked issue, and test files
→ commitperclip posts a consolidated comment listing all failures. Open
a PR with all gates satisfied → `✅ All checks passing` comment posted,
all check runs green.
## Risks
**`pull_request_target` security model:** This workflow runs in base
branch context and has access to secrets. It explicitly checks out `ref:
master` (never PR code) and reads the PR diff via GitHub API only — no
PR code is ever executed. This is the correct pattern for running
secret-bearing checks on fork PRs; deviating from it (e.g. checking out
the PR branch) would be a security vulnerability.
**False positives on security gates:** The sensitive-path gate flags any
PR touching the 9 path prefixes from the April advisories. Legitimate
fixes to those paths will trigger draft advisories. This is intentional
— those paths warrant a human look regardless. The `security-review`
check can be manually resolved by a maintainer once reviewed.
**commitperclip not yet installed:** Until the app is installed on this
repo and the `COMMITPERCLIP_KEY` secret is added, the workflow will fail
on the token generation step. The quality gate comment won't post, but
Dependency Review will still run independently.
## Model Used
Claude Sonnet 4.5, 200k context window, extended thinking enabled, tool
use: read/edit files, bash execution, GitHub API calls
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass (44/44)
- [x] I have added or updated tests where applicable (44 unit tests
across all gate modules)
- [ ] If this change affects the UI, I have included before/after
screenshots (N/A — CI only)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---
## One-time setup needed from you, Dotta
1. **Install commitperclip app** on this repo:
https://github.com/apps/commitperclip/installations/new
2. **Add `COMMITPERCLIP_KEY`** as a repository secret (Actions →
Secrets) — ask @brandonburr for the key
3. **Add `security_advisories: write` and `checks: write`** to the
commitperclip app permissions (commit-capital org → Settings → Apps →
commitperclip → Permissions)
4. **Install Socket.dev** from GitHub Marketplace for supply chain
scanning
5. **Branch protection** (optional but recommended): require
`commitperclip-review` and `security-review` checks to pass before merge
## Dashboard integration note
The `commitperclip-review` check run result maps cleanly to your PR
triage dashboard. A single filter on your Worker:
```javascript
const gatesCheck = checkRuns.find(r => r.name === 'commitperclip-review');
if (gatesCheck?.conclusion === 'failure') return null; // filter from queue
```
For security flags: `GET
/repos/paperclipai/paperclip/security-advisories?state=draft` — advisory
titles include `PR #NNN` for cross-referencing. PRs with a matching
draft advisory have `security-review` in `in_progress` state (grey
spinner, can't merge via branch protection).
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Devin Foley <devin@devinfoley.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
Adds the wireframe skill (low-fi black-and-white SVG wireframes + viewer
page) as a bundled catalog skill under
catalog/bundled/product/wireframe, alongside its references/ docs and
assets/ templates. Regenerates generated/catalog.json (8 -> 9 skills).
The skill ships static svg/html template assets, so its derived trust
level is "assets" rather than "markdown_only". The server's real
install-time security gate (assertCatalogSkillInstallable) blocks only
"scripts_executables", and "assets" skills are installable, so the
shipped-catalog markdown-only invariant is refined to gate on executable
scripts instead. No skill ships executable scripts.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Surface attachment-backed artifact work products as a first-class
Output section on the issue detail page so cloud users can watch and
download agent-generated videos without host filesystem access.
- ui/src/lib/issue-output.ts: formatBytes/formatDuration/getOutputFileGlyph
helpers + getIssueOutputs selector that validates the Phase-2 attachment
artifact metadata contract and tolerates malformed metadata (degraded).
- issue-output components: IssueOutputSection, OutputPrimaryCard (native
<video>/image/generic), OutputRow, OutputVideoPlayer, OutputFileTile.
- IssueDetail: fetch work products and render the Output section between
Documents and Attachments; reuse formatBytes in the attachments list.
- DesignGuide: showcase multiple-output, degraded, and empty states.
- Focused tests for video output, empty state, multiple outputs, and
failed attachment metadata (15 tests).
Co-Authored-By: Paperclip <noreply@paperclip.ing>
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The recovery subsystem is responsible for keeping assigned work
moving when a live heartbeat run disappears or fails.
> - `continuation_recovery` is the path that re-enqueues stranded
`in_progress` issues after an interrupted continuation attempt.
> - That path recently gained cause-aware retry classes and transient
retry caps, but the streak counter was still aggregating mixed failure
causes into one retry history.
> - That meant a sequence like `timeout -> timeout -> adapter_failed ->
adapter_failed` could escalate as a false `3x adapter_failed` streak
even though the latest cause had only happened twice.
> - This pull request makes continuation retry streaks count only
consecutive failures whose `errorCode` matches the latest run and adds a
regression test for the mixed-cause case.
> - The benefit is that transient retry backoff and escalation now match
the actual current failure cause instead of inheriting stale budget from
unrelated failures.
## What Changed
- Updated `summarizeRecentContinuationRetries(...)` to stop counting as
soon as the continuation failure cause no longer matches the latest
run's `errorCode`.
- Wired the continuation recovery escalation/backoff path to pass the
latest classified `errorCode` into the retry streak summarizer.
- Added a regression test proving mixed-cause continuation failures do
not consume the transient retry cap for a new failure cause.
## Verification
- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts`
## Risks
- Low risk. The behavioral change is intentionally narrow, but any
future continuation retry modes that rely on `errorCode = null` will now
be counted as a separate streak bucket and should be kept in mind when
adding new retry classifications.
## Model Used
- OpenAI Codex via Paperclip `codex_local` (GPT-5-based Codex coding
agent; exact backend revision is not surfaced in the runtime), with tool
use, shell execution, and patch application in the local repository.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
## Thinking Path
> - Paperclip orchestrates AI agents and provisions sandboxed execution
environments for them; one of those provisioners is the exe.dev plugin,
which runs each agent inside a long-lived VM reached over SSH.
> - The instance-config form for that plugin is rendered generically by
`JsonSchemaForm` from the plugin's `instanceConfigSchema`, so any UX
problem with the form is split between the shared form component and the
plugin's schema/runtime code.
> - Users coming in cold hit a 12-field flat config they couldn't reason
about (PAPA-407), a form that silently submitted `cpu: 0` for untouched
optional fields (PAPA-407 root cause), a `sshPrivateKey` textarea that
truncated RSA-4096 keys at 4096 chars (PAPA-449), a save flow that
accepted clearly-malformed keys and only blew up at lease time with raw
SSH stderr (PAPA-450, PAPA-451), and a manifest that didn't distinguish
"essential" from "advanced" knobs (PAPA-410 / PAPA-411 — duplicate
sub-issues with identical scope; PAPA-418 reconciliation kept PAPA-410
canonical).
> - These problems all point at the same surface (exe.dev sandbox
config) and are tightly coupled in code — PAPA-449/450/451 patch fields
that PAPA-410/411 introduce — so they get reviewed together.
> - This pull request lands the shared-form changes (advanced-options
disclosure, optional-scalar defaults) and the exe.dev-specific changes
(manifest restructure, longer `maxLength`, stderr translation, save-time
key validation) as five focused commits stacked on `master`.
> - The benefit is a config form that defaults to the two fields a new
user actually needs (API key + SSH private key) with a collapsible
disclosure for the rest, no silent truncation or zero-default
submissions, and SSH key problems surfaced at save time with actionable
messages instead of cryptic post-provision failures.
## What Changed
- **JsonSchemaForm advanced-options disclosure** (PAPA-410, PAPA-411 —
same scope, see note above): adds `x-paperclip-advanced` /
`x-paperclip-group` schema annotations and renders flagged fields behind
a collapsible "Advanced options" disclosure that auto-opens when a
hidden field has a validation error. Exe.dev manifest is restructured to
use the new annotations, so essentials (`apiKey`, `sshPrivateKey`) show
by default while the long tail of optional knobs is grouped under "SSH
access" / "VM resources" / "More options" headings.
- **Omit optional scalar defaults** (PAPA-407): `getDefaultForSchema` no
longer materialises `0` / `""` for optional
`number`/`integer`/`string`/`secret-ref` fields without an explicit
`default`. Object recursion drops properties whose default is
`undefined`. Fields that declare a `default` (e.g. `sshPort: 22`) still
round-trip. Adds a regression test against `getDefaultValues`.
- **Raise `sshPrivateKey` `maxLength`** (PAPA-449): bumps the exe.dev
manifest cap from 4096 to 8192 so RSA-4096 OpenSSH private keys (which
can exceed 4 KB with comments/metadata) aren't silently truncated at
submit.
- **Translate `invalid format` SSH stderr** (PAPA-450):
`formatSshFailure` now recognises `Load key … invalid format` in
combined stderr/stdout and returns a specific message naming the
key-format problem ("isn't an OpenSSH/PEM private key — confirm the
secret starts with `-----BEGIN … PRIVATE KEY-----` and isn't the `.pub`
or a PuTTY `.ppk` export") instead of dumping the raw stderr.
- **Save-time SSH key validation** (PAPA-451):
`onEnvironmentValidateConfig` inline-parses `sshPrivateKey` and rejects
common failure modes — pasted public keys, PuTTY `.ppk` format, missing
`-----END-----` footer, non-base64 body — so the form surfaces an inline
error before any VM is provisioned. Secret-ref bindings (UUIDs) are
still passed through unchanged.
## Verification
CI gates (`pnpm typecheck`, `pnpm test`, the targeted vitest suites
below) all pass.
Run locally:
```bash
# Shared form
pnpm --filter @paperclipai/ui exec vitest run src/components/JsonSchemaForm
# 9 tests pass — includes the new "omits optional scalar fields" regression
# and the three advanced-options-disclosure tests.
# exe.dev plugin
cd packages/plugins/sandbox-providers/exe-dev && pnpm test
# 32 tests pass — includes the new sshPrivateKey-validation cases
# and the new "invalid format" stderr-translation case.
```
Manual smoke (after reinstalling the plugin so the DB manifest
refreshes):
1. Open the exe.dev environment config page. **Default view shows API
Key + SSH Private Key only**, with an "Advanced options" disclosure for
everything else (PAPA-410 / PAPA-411).
2. Paste a `.pub` file's contents into SSH Private Key, click Save.
**Inline error** rejecting the wrong-format key (PAPA-451).
3. Re-paste a valid OpenSSH/PEM private key longer than 4096 bytes —
saves cleanly (PAPA-449).
4. Save the form with everything optional left blank — server no longer
rejects with `"cpu must be greater than 0 when provided"` (PAPA-407).
5. Force a bad key through via a stored secret-ref binding and lease a
VM — failure message names the key-format problem instead of dumping raw
SSH stderr (PAPA-450).
## Risks
- **PAPA-410 / PAPA-411 manifest restructure** is the largest surface
here. Schemas using `x-paperclip-*` extensions are forward-compatible
with stricter JSON Schema validators (extensions are ignored by
default), and the form gracefully renders a flat layout when no field
opts in.
- **PAPA-407** changes form-default behaviour: optional scalar fields
that previously round-tripped as `""` / `0` will now be `undefined` and
absent from the submitted payload. Downstream consumers that expected
the empty-string/zero shape need to treat the field as optional.
Spot-checked the existing exe.dev driver — it already uses
`parseOptionalString` / `parseOptionalInteger`, which treat missing
fields as `null` rather than `0`/`""`.
- **PAPA-451** adds a save-time check, so a
previously-saved-but-malformed `sshPrivateKey` raw value will now fail
to re-save. Bound secret-refs are unaffected, matching how the user
reaches the bad-key state today (via the secrets picker).
- **PAPA-449** simply raises a cap; no semantic risk.
- **PAPA-450** only kicks in on the "invalid format" code path; existing
onboarding-marker branch is untouched.
## Model Used
- Provider: Anthropic
- Model: Claude Opus 4.7 (`claude-opus-4-7`)
- Capabilities used: code reading, code editing, test execution, git/PR
mechanics, Paperclip API for issue coordination
## Checklist
- [x] PR body sections present (Thinking Path, What Changed,
Verification, Risks, Model Used, Checklist)
- [x] Unit tests added for the new behaviours (JsonSchemaForm
default-value omission + advanced disclosure; exe.dev plugin validation
+ stderr translation)
- [x] Existing tests still pass locally (`vitest run` on both packages)
- [x] No raw secrets, IP addresses, or machine-local config in commits
or PR body
- [x] Commits are atomic per linked issue (PAPA-410 / PAPA-411,
PAPA-407, PAPA-449, PAPA-450, PAPA-451)
- [x] Branch is up-to-date with `origin/master`
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
## Release changelog: v2026.529.0
Stable changelog for the **v2026.529.0** release (released 2026-05-29),
generated with the `release-changelog` skill.
- Range: `v2026.525.0..origin/master` — 11 squash-merged PRs
- Adds `releases/v2026.529.0.md`
- **No breaking changes** — migrations are additive (`CREATE TABLE IF
NOT EXISTS`); the only `DROP CONSTRAINT` lines are FK adjustments, not
data loss
- **No external contributors** this cycle — all PR authors are Paperclip
founders, who are excluded from the Contributors section per the skill,
so that section is omitted
### Highlights
- Inline document annotations and comments (#6733)
- Company skills CLI and catalog management (#6782)
- Hide projects and agents from your sidebar (#6677)
- First-admin claim flow for fresh self-hosted deployments (#6755)
- Live Claude model discovery (#6953)
### Improvements
- Bundled plugins now appear in the plugin manager (#6734)
- Tighter workspace lifecycle guarantees (#6969)
### Fixes
- Accepted plans decompose exactly once (#6831)
Docs-only (README brand/license #6810, #6804) and CI-only (#6967)
changes were excluded as not materially user-facing.
Issue: PAP-10155
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Paperclip <noreply@paperclip.ing>
## Thinking Path
> - Paperclip orchestrates AI-agent companies through adapter-backed
local and external runtimes.
> - The agent configuration UI lets operators choose adapter models and
refresh model lists when adapters support live discovery.
> - Codex already had a live refresh path, but Claude Local only exposed
static fallback models and the UI hid the refresh action for Claude.
> - A newly available Claude Opus model should not require a code
release every time the model catalog changes.
> - This pull request adds Anthropic model discovery for Claude Local,
keeps the static fallback current with Claude Opus 4.8, and exposes the
existing refresh button in the Claude Local dropdown.
> - The benefit is that operators can refresh Claude models from the
same model selector flow they already use for Codex.
## What Changed
- Added `claude-opus-4-8` to the Claude Local fallback model list.
- Added Claude model discovery through Anthropic-compatible `GET
/v1/models` when `ANTHROPIC_API_KEY` is available.
- Added normal cache reuse, forced refresh support, a SHA-256-based
API-key fingerprint for cache keys, and warning logging for discovery
errors before fallback.
- Wired `claude_local.refreshModels` into the server adapter registry.
- Enabled the existing `Refresh models` dropdown action for
`claude_local` in `AgentConfigForm`.
- Added tests for Claude fallback, live discovery, API-failure fallback,
forced refresh, and the UI refresh-button gate.
## Verification
- `pnpm exec vitest run server/src/__tests__/adapter-models.test.ts`
- `pnpm exec vitest run ui/src/components/AgentConfigForm.test.ts`
- `pnpm --filter @paperclipai/adapter-claude-local typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`
- Greptile review reached Confidence Score: 5/5 on commit `b796cf4f1`
with addressed threads resolved.
UI note: the visible change is a conditional action row inside the
existing model dropdown; the regression test covers that `claude_local`
now receives the refresh action.
## Risks
- Low risk. Without `ANTHROPIC_API_KEY`, Claude Local still uses the
static fallback list.
- If Anthropic model discovery fails or times out, Paperclip falls back
to the existing cached or static list.
- Bedrock environments remain on Bedrock-native model IDs.
## Model Used
OpenAI GPT-5 via Codex local coding agent, with repository file access,
shell command execution, git operations, and targeted test/typecheck
verification. Exact context window is not exposed by the runtime.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
## Thinking Path
> - Paperclip orchestrates AI agents across isolated execution
workspaces; the local cwd is the only persistence boundary between runs.
> - Workspace lifecycle (worktree_prepare → execute →
workspace_finalize) and the wake/accept flow are what guarantee that
dependent issues see a consistent worktree.
> - PAPA-380 / PAPA-431 / PAPA-432 / PAPA-440 surfaced three holes in
that contract: silent env reuse across assignees, dependent wakes firing
before finalize, and `issue.interaction.accept` advancing before
finalize landed.
> - PAPA-441 / PAPA-442 then needed to document the "no remote git"
contract and prevent future adapter/runtime code from quietly
reintroducing `git push` as a backdoor sync.
> - This pull request lands those server fixes, the static
`check-no-git-push` enforcement, the AUTHORING.md cross-link, and the
Cody-review follow-ups on the PAPA-430 thread.
> - The benefit is that finalize is a real barrier — board accepts,
dependent wakes, and operator-set env all respect it — and adapter code
can't bypass it via raw `git push`.
## What Changed
- **server (PAPA-380, PAPA-431):** `execution-workspace-policy` refuses
silent env reuse when the assignee's resolved env disagrees with the
workspace it would inherit. The inheritance protection is now scoped to
the actual inheritance signal — explicit issue-level `environmentId` is
honored even when the agent's default env is `null`.
- **server (PAPA-432):** `heartbeat.ts` gates dependent wakes on
`listUnfinalizedExecutionWorkspaceIds`, and writes a
`workspace_finalize` row on the succeeded path. Write failures now
surface instead of being swallowed so dependents aren't silently
stranded behind a missing row.
- **server (PAPA-440):** `issue-thread-interactions.acceptInteraction`
adds a workspace_finalize precondition for `request_confirmation` (not
`suggest_tasks`). Accept returns 409 if finalize hasn't succeeded for
the latest workspace operation.
- **ci (PAPA-442):** new `scripts/check-no-git-push.mjs` static check
scans `packages/adapters/`, `packages/adapter-utils/`, `server/src/`,
and `cli/src/` for any `git push` invocation (string or args-array).
Wired into the `policy` PR job and `test:release-registry`. Operators
can opt in per-call with `// paperclip:allow-git-push: <reason>`.
Release scripts are out of scope by design.
- **docs (PAPA-441):** `AUTHORING.md` documents the no-remote-git
contract and cross-links the static check so adapter authors learn the
rule and the enforcement together.
- **review follow-up (PAPA-430, Cody):** three fixes — env resolver bug,
accept-gate scope (request_confirmation only), and finalize record write
on the succeeded path.
## Verification
- `pnpm exec vitest run
server/src/__tests__/execution-workspace-policy.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts` → 33/33
pass
- `node scripts/check-no-git-push.test.mjs` → check covers string form,
args-array form, comment exclusions, and per-line allow-comment.
- Manual: server compiles; the policy job runs the check in <1s before
heavier jobs.
## Risks
- **Behavioral shift in accept:** boards accepting
`request_confirmation` while finalize is in-flight now get 409s. This is
intentional — they can retry — but it changes timing on a hot path.
`suggest_tasks` is unaffected.
- **Workspace policy:** the env-reuse refusal is a new error path.
Issues that previously silently reused an env from a different-assignee
workspace will now fail-loud; the resolver still honors explicit
issue-level `executionWorkspaceSettings.environmentId`.
- **CI rule:** any future legitimate `git push` in scoped dirs must be
marked with the allow-comment, which is the intended ergonomic.
## Model Used
- Claude Opus 4.7 (`claude-opus-4-7`, extended thinking), via Claude
Code in the Paperclip executor adapter.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots (N/A — server/CI/docs only)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
Closes related issues: PAPA-430, PAPA-380, PAPA-431, PAPA-432, PAPA-440,
PAPA-441, PAPA-442
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
## Thinking Path
> - Paperclip relies on CI browser suites to protect control-plane
workflows, so a stalled browser bootstrap is a release blocker even when
app code is unchanged.
> - The failing signal on [PAPA-457](/PAP/issues/PAPA-457) was specific
to the PR e2e lane timing out before tests started, which pointed at
environment setup rather than assertions.
> - The first shell-only Chromium attempt reduced download size, but the
GitHub Actions log showed Playwright still hanging inside its install
step after the headless shell download finished.
> - That means the real problem is the Playwright browser-install path
itself on the hosted Ubuntu runner, not just the size of the downloaded
artifact.
> - GitHub's Ubuntu runners already ship Google Chrome, and Playwright
can target that binary through the `chrome` channel without downloading
its own Chromium bundle.
> - The safer workflow fix is therefore to remove the Playwright install
step from the affected headless jobs and make the Playwright configs
optionally use runner Chrome only when CI opts into it.
> - This keeps local defaults unchanged, removes the failing
browser-download dependency from CI, and preserves headless coverage for
PR, standalone e2e, and release-smoke workflows.
## What Changed
- Updated `.github/workflows/pr.yml`, `.github/workflows/e2e.yml`, and
`.github/workflows/release-smoke.yml` to stop downloading Playwright
browsers and instead verify the runner's preinstalled `google-chrome`.
- Passed `PAPERCLIP_PLAYWRIGHT_CHANNEL=chrome` into the headless PR,
standalone e2e, and release-smoke test steps so those jobs explicitly
use runner Chrome.
- Updated `tests/e2e/playwright.config.ts` and
`tests/release-smoke/playwright.config.ts` to honor
`PAPERCLIP_PLAYWRIGHT_CHANNEL` while keeping the default
local/browser-bundle behavior unchanged when the env var is absent.
## Verification
- Investigated the failed PR run log and confirmed the prior `Install
Playwright` step stalled after `chromium-headless-shell` reached 100%
download.
- `PLAYWRIGHT_BROWSERS_PATH="$(mktemp -d)"
PAPERCLIP_PLAYWRIGHT_CHANNEL=chrome PAPERCLIP_E2E_SKIP_LLM=true pnpm run
test:e2e`
Result: `7 passed (21.1s)` with an empty temporary Playwright browser
cache, proving the e2e suite runs without any Playwright browser
download when the `chrome` channel is selected.
- `git diff --check`
## Risks
- This assumes GitHub's Ubuntu runner continues to ship `google-chrome`;
if that image contract changes, these workflows would need a dedicated
Chrome install step.
- The `chrome` channel can differ slightly from Playwright-managed
Chromium, so the config gate is intentionally env-scoped to CI workflows
that need the hosted-runner path.
## Model Used
- OpenAI Codex, GPT-5-based coding agent running through Paperclip's
`codex_local` adapter with tool use, shell execution, and repository
editing enabled. The exact internal snapshot/version string is not
exposed in-session.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [ ] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies, so
planning approvals and child-issue fan-out are part of the core
control-plane loop.
> - Accepted plans are supposed to be a safe bridge from planning into
execution, especially when agents wake from review decisions and reuse
isolated workspaces.
> - The duplicate-subtask incident showed that an accepted plan revision
could be interpreted more than once across overlapping runs, which broke
the single-source-of-truth model for issue decomposition.
> - Fixing that required tightening the backend contract first:
accepted-plan decomposition needs an exact-once fingerprint, durable
claim state, and retry-safe child creation.
> - Once that backend behavior existed, the board still needed
visibility into what happened, so the issue detail view needed a
dedicated decomposition section instead of forcing operators to
reconstruct child creation from raw activity.
> - This pull request adds the exact-once decomposition primitive,
hardens wake routing and regressions around the incident, and surfaces
decomposition state in the UI so future incidents are both prevented and
easier to inspect.
## What Changed
- Added accepted-plan decomposition semantics to
`doc/execution-semantics.md`, including the exact-once fingerprint,
durable claim/result expectations, and retry/resume behavior.
- Added persistent accepted-plan decomposition claims in the backend,
including schema, shared types/validators, service logic, and issue
routes for creating and listing decomposition state.
- Hardened heartbeat routing so an accepted-plan continuation stays
scoped to the relevant planning issue instead of opportunistically
re-decomposing another accepted issue on the same assignee.
- Added regression coverage for the original failure modes: concurrent
same-parent retries, cross-issue accepted-plan isolation, and partial
child recreation under the same fingerprint.
- Added the `Plan decomposition` issue-detail section plus supporting
API/query-key/activity formatting updates so operators can see revision
status, owner, child counts, and the linked child issues directly in the
UI.
- Included the small follow-up UI fix so the decomposition section still
renders when the issue work mode is no longer `planning`.
## Verification
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`
- `pnpm --filter @paperclipai/db typecheck`
- `pnpm exec vitest run server/src/__tests__/issues-service.test.ts`
- `pnpm exec vitest run server/src/__tests__/issues-service.test.ts -t
"lists persisted decompositions with child issue summaries"`
- `pnpm exec vitest run server/src/__tests__/issues-service.test.ts -t
"accepted plan decomposition"
server/src/__tests__/heartbeat-accepted-plan-workspace-refresh.test.ts
server/src/__tests__/heartbeat-context-summary.test.ts`
- Manual UI path: create a planning issue without an isolated execution
workspace, add a `plan` document, accept the `request_confirmation`, let
Paperclip create child issues, then reopen the parent issue detail page
and confirm the `Plan decomposition` section shows the accepted
revision, status, idempotent-claim badge, and child links.
- Separate follow-up bug noted during manual UI validation: accepting a
plan on an issue whose run never records `workspace_finalize` is tracked
in `PAPA-445` and is not part of this PR’s fix scope.
## Risks
- This adds a new migration and a large Drizzle snapshot update;
reviewers should confirm the schema shape and generated metadata match
the intended decomposition table.
- The exact-once claim changes sit on the accepted-plan fan-out path, so
regressions there could block legitimate child creation or mis-handle
retries if the claim state machine is wrong.
- The new UI only appears when decomposition records exist; reviewers
should use the manual verification path above rather than expecting
existing issues on a stale local instance to show the section
automatically.
- `PAPA-445` remains an open follow-up for the `workspace_finalize`
accept gate when a planning handoff never records finalize; that bug can
interfere with reproducing the UI flow on isolated workspaces but does
not change the correctness of the exact-once decomposition feature
itself.
> Checked `ROADMAP.md`: this PR is a bug fix / control-plane hardening
change for accepted-plan decomposition, not a new uncoordinated roadmap
feature.
## Model Used
- OpenAI Codex via Paperclip `codex_local` (GPT-5-based coding agent;
exact backend model ID/context window not exposed in the run context),
with repository tool use, shell execution, and code-editing
capabilities.
<img width="806" height="1069" alt="Screenshot 2026-05-27 at 11 05
48 PM"
src="https://github.com/user-attachments/assets/5b00b670-96cd-4470-b0a3-581743bcae28"
/>
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies through
company-scoped control-plane workflows.
> - Agents need reusable, inspectable skills that can be installed,
reset, audited, exported, and assigned without bespoke local setup.
> - The existing skill truth model needed cleanup so bundled skills,
optional catalog skills, runtime skills, and adapter-provided skills
have clear provenance.
> - Operators also need a practical CLI and board UI for discovering and
managing company skills.
> - This pull request adds the skills CLI, packaged skills catalog,
company skills APIs, and catalog-aware board UI.
> - The benefit is a more reusable Paperclip company setup where skills
are portable, auditable, and easier for operators and agents to manage.
## What Changed
- Added `paperclipai skills` CLI commands and coverage for catalog
listing, installing, resetting, and inspecting company skills.
- Added a packaged `@paperclipai/skills-catalog` workspace with bundled
and optional skill content plus validation/build tests.
- Added shared company-skill types and validators used across CLI,
server, and UI contracts.
- Added server catalog APIs/services for company skill catalog
operations, reset semantics, audit behavior, and portability provenance.
- Updated adapter skill handling so runtime/catalog provenance remains
explicit across local adapters.
- Added board UI support for browsing and managing catalog-backed
company skills.
- Updated docs for the skills CLI/catalog flow and the company skills
Paperclip skill reference.
- Rebased the branch onto current `paperclipai/paperclip:master`; no
`pnpm-lock.yaml`, `.github/workflows`, or migration files are included
in the final PR diff.
## Verification
- Passed: `pnpm run preflight:workspace-links && pnpm exec vitest run
cli/src/__tests__/skills.test.ts
packages/skills-catalog/src/catalog-builder.test.ts
packages/skills-catalog/src/shipped-catalog.test.ts
packages/shared/src/validators/company-skill.test.ts
packages/adapter-utils/src/server-utils.test.ts
packages/plugins/create-paperclip-plugin/src/entrypoints.test.ts
server/src/__tests__/company-skills-catalog-service.test.ts
server/src/__tests__/company-skills-routes.test.ts
server/src/__tests__/company-portability.test.ts`.
- Passed: `pnpm exec vitest run
server/src/__tests__/workspace-runtime.test.ts -t "default
branch|origin/master|symbolic-ref"`.
- Attempted: full `server/src/__tests__/workspace-runtime.test.ts`. Four
provisioning tests failed while seeding an isolated worktree database
from the local Paperclip instance because the local plugin schema dump
contains a duplicate-column foreign key
(`plugin_content_machine_18a7bc327b.content_case_signals`). The
default-branch tests touched by the rebase conflict passed in the
focused run above.
- Checked final diff: no `pnpm-lock.yaml`, no `.github/workflows`, and
no migration-file changes relative to `master`.
## Risks
- Medium: this is a broad skills/catalog change touching CLI, server
APIs, shared contracts, adapter skill sync, and UI.
- Catalog validation and reset semantics need careful reviewer attention
because they affect reusable company setup and portability.
- No database migrations are included in this PR, so there is no
migration ordering/idempotency risk in the final diff.
- No lockfile is included by design; dependency resolution will be
handled by the repository lockfile workflow.
## Model Used
- OpenAI Codex coding agent based on GPT-5, running in Paperclip via the
`codex_local` adapter with shell, git, GitHub CLI, and code-editing tool
access. Exact hosted model build/context-window metadata is not exposed
in this runtime.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run targeted tests locally and documented the local
workspace-runtime seed failure above
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, screenshots were intentionally
omitted per PAP-10124 instructions; UI behavior is covered by tests and
reviewer inspection
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - Fresh self-hosted deployments need an operator path before any
invite exists.
> - Umbrel installs are private LAN deployments, so a one-time browser
claim is appropriate only when the deployment is private and unclaimed.
> - Public deployments and installs with active invites must keep the
existing invite-only model so admin creation is not exposed broadly.
> - GitHub PR #2927 established the useful direction, but it needed to
be adapted onto current `master` rather than merged as-is.
> - This pull request adds that adapted private-only claim flow across
server, UI, docs, and regression coverage.
> - The benefit is that a fresh private Umbrel-style install can be
claimed from the browser without weakening public deployment access.
## What Changed
- Added a first-admin claim service and access route support for
one-time admin claim eligibility on private unclaimed deployments.
- Updated the bootstrap/access UI so eligible private installs show a
setup claim path, while public and invited deployments keep invite-first
behavior.
- Added a bootstrap-pending setup UX lab covering claim, invite, public,
and signed-in access states.
- Updated deployment and local development docs for authenticated
private/public behavior and the Umbrel-style claim path.
- Added server and UI regression tests for private claim, public
no-claim, active invite fallback, existing board/no-access flows, and
health exposure reporting.
- Stabilized PR handoff verification by serializing the aggregate server
Vitest workspace run, forcing `NODE_ENV=test`, and relaxing the
heartbeat batching test around legitimate recovery follow-up runs.
## Verification
- `pnpm -r typecheck`
- `pnpm build`
- `pnpm vitest --run
server/src/__tests__/heartbeat-comment-wake-batching.test.ts`
- `pnpm vitest --run
server/src/__tests__/health-dev-server-token.test.ts`
- `pnpm test:run`
- QA validation: PAP-10115 passed browser validation with screenshots
for private fresh install claim, active invite versus claim conflict,
public invite-only/claim-absent behavior, existing invite fallback, and
normal board/no-access flows.
- GitHub closeout: issue #2579 and PR #2927 were updated with the
accepted direction: adapt the implementation, do not direct-merge #2927
as-is.
## Risks
- The claim endpoint must remain private-only and one-time; a regression
here could expose admin creation on public deployments.
- Existing invite behavior must remain intact for public deployments and
installs that already have an active invite.
- The stable Vitest harness now serializes the aggregate server
workspace group; this is slower, but it avoids DB-backed suite
collisions under root workspace mode.
> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected - check the roadmap
first. See `CONTRIBUTING.md`.
>
> ROADMAP.md checked: this is a scoped deployment bootstrap/access fix
and does not duplicate a listed roadmap project.
## Model Used
- OpenAI GPT-5 Codex via Paperclip `codex_local` for product
engineering, implementation, and verification, with tool-enabled local
code execution. Paperclip QA browser validation was performed in
PAP-10115 by the assigned QA agent; exact adapter model metadata for
that QA run is not exposed in this PR context.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The README is the first impression for developers and operators
landing on the repo, so it has to reflect the current brand voice and
visual identity
> - The existing README leads with an outdated hero ("Open-source
orchestration for zero-human companies"), keeps a board-centric tagline
that no longer matches the positioning, advertises a removed COMING SOON
teaser, and still uses an old header image and an unnecessary footer
image
> - Out-of-date positioning at the top of the README undercuts the rest
of the doc and the brand guidelines refresh at
https://paperclip.ing/brand
> - This pull request swaps the README header image for the new brand
banner, updates the hero copy and tagline, and trims stale callouts so
the README matches the new brand guidelines
> - The benefit is a README that leads with the current positioning
("Paperclip is the app people use to manage AI agents for work.") and
current visual identity, with no stale teasers or extraneous footer
image
## What Changed
- Added new brand banner at \`doc/assets/banner.jpg\` and pointed the
README header \`<img>\` at it (alt text updated to the new tagline)
- Replaced the \`## What is Paperclip?\` + \`# Open-source orchestration
for zero-human companies\` heading pair with a single H1: \`# Paperclip
is the app people use to manage AI agents for work.\`
- Tightened the opening paragraphs ("Open-source orchestration for teams
of AI agents.", trimmed dashboard sentence, "Under the hood:" line,
period on the OpenClaw/Paperclip tagline)
- Removed the \`COMING SOON: Clipmart\` callout
- Softened the Governance copy by dropping "You're the board." in both
the Features grid and the Systems table
- Fixed typo: "solo-entreprenuer" → "solo entrepreneur"
- Removed the README footer image block entirely
- Updated the closing subline: "Built for people who want to run
companies, not babysit agents." → "Built for people who want to get work
done, not babysit agents."
- Left existing assets untouched on disk: \`doc/assets/header.png\` and
\`doc/assets/footer.jpg\` are unchanged from master (only the README
references changed)
## Verification
- \`git diff master..HEAD --stat\` → only \`README.md\` (10+/18-) and
the new \`doc/assets/banner.jpg\`
- Rendered the README locally and confirmed:
- The header banner shows the new brand image
- The H1 reads "Paperclip is the app people use to manage AI agents for
work."
- No COMING SOON Clipmart callout
- No footer image; closing subline reads "Built for people who want to
get work done, not babysit agents."
- No code paths changed; no test suite applies
## Risks
- Low risk. Docs-only change. \`cli/README.md\` still references the
on-master URL for \`doc/assets/header.png\`, which is intentionally left
in place so that link does not break.
## Model Used
- Claude (Anthropic), model id \`claude-opus-4-7\` ("Opus 4.7"), running
under Claude Code via the Paperclip claude_local adapter.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass (n/a — docs-only)
- [x] I have added or updated tests where applicable (n/a — docs-only)
- [x] If this change affects the UI, I have included before/after
screenshots (n/a — README-only; rendered review described in
Verification)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
Closes PAPA-439