mirror of
https://github.com/alkimake/paperclip.git
synced 2026-06-14 01:50:39 +09:00
9 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
9eac727cf1
|
[codex] Add skills CLI and catalog management (#6782)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies through company-scoped control-plane workflows.
> - Agents need reusable, inspectable skills that can be installed, reset, audited, exported, and assigned without bespoke local setup.
> - The existing skill truth model needed cleanup so bundled skills, optional catalog skills, runtime skills, and adapter-provided skills have clear provenance.
> - Operators also need a practical CLI and board UI for discovering and managing company skills.
> - This pull request adds the skills CLI, packaged skills catalog, company skills APIs, and catalog-aware board UI.
> - The benefit is a more reusable Paperclip company setup where skills are portable, auditable, and easier for operators and agents to manage.
## What Changed
- Added `paperclipai skills` CLI commands and coverage for catalog listing, installing, resetting, and inspecting company skills.
- Added a packaged `@paperclipai/skills-catalog` workspace with bundled and optional skill content plus validation/build tests.
- Added shared company-skill types and validators used across CLI, server, and UI contracts.
- Added server catalog APIs/services for company skill catalog operations, reset semantics, audit behavior, and portability provenance.
- Updated adapter skill handling so runtime/catalog provenance remains explicit across local adapters.
- Added board UI support for browsing and managing catalog-backed company skills.
- Updated docs for the skills CLI/catalog flow and the company skills Paperclip skill reference.
- Rebased the branch onto current `paperclipai/paperclip:master`; no `pnpm-lock.yaml`, `.github/workflows`, or migration files are included in the final PR diff.
## Verification
- Passed: `pnpm run preflight:workspace-links && pnpm exec vitest run cli/src/__tests__/skills.test.ts packages/skills-catalog/src/catalog-builder.test.ts packages/skills-catalog/src/shipped-catalog.test.ts packages/shared/src/validators/company-skill.test.ts packages/adapter-utils/src/server-utils.test.ts packages/plugins/create-paperclip-plugin/src/entrypoints.test.ts server/src/__tests__/company-skills-catalog-service.test.ts server/src/__tests__/company-skills-routes.test.ts server/src/__tests__/company-portability.test.ts`.
- Passed: `pnpm exec vitest run server/src/__tests__/workspace-runtime.test.ts -t "default branch|origin/master|symbolic-ref"`.
- Attempted: full `server/src/__tests__/workspace-runtime.test.ts`. Four provisioning tests failed while seeding an isolated worktree database from the local Paperclip instance because the local plugin schema dump contains a duplicate-column foreign key (`plugin_content_machine_18a7bc327b.content_case_signals`). The default-branch tests touched by the rebase conflict passed in the focused run above.
- Checked final diff: no `pnpm-lock.yaml`, no `.github/workflows`, and no migration-file changes relative to `master`.
## Risks
- Medium: this is a broad skills/catalog change touching CLI, server APIs, shared contracts, adapter skill sync, and UI.
- Catalog validation and reset semantics need careful reviewer attention because they affect reusable company setup and portability.
- No database migrations are included in this PR, so there is no migration ordering/idempotency risk in the final diff.
- No lockfile is included by design; dependency resolution will be handled by the repository lockfile workflow.
## Model Used
- OpenAI Codex coding agent based on GPT-5, running in Paperclip via the `codex_local` adapter with shell, git, GitHub CLI, and code-editing tool access. Exact hosted model build/context-window metadata is not exposed in this runtime.
## Checklist
- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run targeted tests locally and documented the local workspace-runtime seed failure above
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, screenshots were intentionally omitted per PAP-10124 instructions; UI behavior is covered by tests and reviewer inspection
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
|
||
|
|
96f0279e08
|
Make ACPX-Claude adapter work seamlessly (PAPA-388) (#6590)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies, so when
an adapter fails, the platform must surface enough detail for the next
agent (or human reviewer) to act
> - The `acpx_local` adapter wraps `claude-agent-acp`, which in turn
drives the Claude Code SDK — three layers, three different permission
and error-handling models
> - A user created a `Claude Local ACPX` agent in PAPA-387 and it failed
instantly with the generic `acpx.error / "Internal error"` log,
stranding the work and triggering an opaque `stranded_assigned_issue`
recovery to the CTO
> - Once the diagnostic blackbox was opened, the underlying cause turned
out to be two SDK-level mismatches: a model-name allowlist that rejects
bare IDs like `claude-opus-4-7`, and a Claude Code
permission/Read-sandbox configuration that silently denies every
non-allowlisted tool when the user's `~/.claude/settings.json` has
`defaultMode: "dontAsk"`
> - This pull request fixes both classes of failure in the adapter
itself so new ACPX agents work seamlessly without per-host
configuration, and widens the diagnostic surface so the *next* failure
of any kind is actionable
> - The benefit is that ACPX-Claude can join the regular agent roster —
verified end to end on PAPA-401, where the agent successfully reached
the Paperclip API, opened a worktree, surveyed existing notification
PRs, and posted a structured plan
## What Changed
- Widen ACPX failure diagnostics
(`packages/adapters/acpx-local/src/server/execute.ts`):
- Capture `err.name`, ACP code, `cause.message`, retryable flag, and a
5-frame stack preview into `errorMeta`.
- Promote phase-specific error codes: `ensure_session →
acpx_session_init_failed`, `configure_session →
acpx_session_config_failed`, `turn → acpx_turn_failed`, plus mapping for
`ACP_BACKEND_MISSING` / `ACP_BACKEND_UNAVAILABLE`.
- Set `verbose: true` on the ACPX runtime so its session-event log flows
through `ctx.onLog`.
- Capture child-process stderr via a wrapper-script tee into
`<stateDir>/run-stderr/<runId>.log`, inline the tail into the
`acpx.error` payload as `childStderrTail`, and forward it through
`ctx.onLog("stderr", …)` so it lands in the heartbeat `stderrExcerpt`
column (existing redaction applies).
- Set the model via `ANTHROPIC_MODEL` env for the `claude` agent instead
of `set_config_option(model, …)`. The ACP server's `set_config_option`
handler validates against an internal allowlist and rejects bare IDs
like `claude-opus-4-7`. `ANTHROPIC_MODEL` is read during initialization
and bypasses that check.
- Seed `<worktree>/.claude/settings.local.json` before spawning
`claude-agent-acp` (the seamless-API fix). Since `claude-agent-acp`
hard-codes `settingSources: ["user", "project", "local"]` and "local"
has the highest precedence:
- Set `permissions.defaultMode: "default"`, but **only** if the user's
value is missing or `"dontAsk"` (the broken case). Other modes like
`acceptEdits`/`plan` are preserved.
- Pre-allow Paperclip's Bash surface (`Bash(curl:*)`, `Bash(env:*)`,
`Bash(<cwd>/scripts/paperclip-issue-update.sh:*)`,
`Bash(<cwd>/scripts/paperclip:*)`).
- Widen `permissions.additionalDirectories` to include `stateDir`,
`agentHome`, and the per-company instance root
(`~/.paperclip/instances/<id>/companies/<companyId>`). Scoped to this
company only — does not expose other tenants.
- Existing user entries are merged, not replaced. The resolved roots are
folded into the session fingerprint so warm-session handles invalidate
when they change.
- Sync the existing server-side integration test
(`server/src/__tests__/acpx-local-execute.test.ts`) to assert
`acpx_session_init_failed` instead of the now-removed
`acpx_protocol_error` for `ACP_SESSION_INIT_FAILED` (a follow-up to
commit 1).
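The phase-to-error-code promotion described above could be sketched roughly as follows. This is an illustration under assumptions, not the adapter's code: `AcpxPhase`, `ErrorMeta`, and `buildErrorMeta` are hypothetical names, and the `ACP_BACKEND_MISSING` / `ACP_BACKEND_UNAVAILABLE` mapping is left out because the description does not name its target codes.

```typescript
// Hypothetical sketch; only the three code strings come from the PR description.
type AcpxPhase = "ensure_session" | "configure_session" | "turn";

const PHASE_ERROR_CODES: Record<AcpxPhase, string> = {
  ensure_session: "acpx_session_init_failed",
  configure_session: "acpx_session_config_failed",
  turn: "acpx_turn_failed",
};

interface ErrorMeta {
  code: string;
  name: string;
  causeMessage?: string;
  stackPreview: string[];
}

function buildErrorMeta(phase: AcpxPhase, err: unknown): ErrorMeta {
  const e = err instanceof Error ? err : new Error(String(err));
  const cause = (e as { cause?: unknown }).cause;
  return {
    code: PHASE_ERROR_CODES[phase],
    name: e.name,
    causeMessage: cause instanceof Error ? cause.message : undefined,
    // Keep at most five lines after the first stack line
    // (assumes a single-line error message, so line 0 is the header).
    stackPreview: (e.stack ?? "")
      .split("\n")
      .slice(1, 6)
      .map((line) => line.trim()),
  };
}
```

A caller would wrap each phase in its own `try`/`catch` and pass the phase name along, so the log consumer can tell a session-init failure from a failed turn.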
## Verification
- `pnpm --filter "@paperclipai/adapter-acpx-local" run typecheck` —
passes.
- `pnpm vitest run` in `packages/adapters/acpx-local` — 35/35 pass,
includes 4 new tests covering the settings.local.json write path (claude
only, merge with pre-existing content, `dontAsk` override, codex no-op).
- `pnpm vitest run src/__tests__/acpx-local-execute.test.ts` in
`server/` — 15/15 pass after the test-sync commit.
- End-to-end manual verification (PAPA-401): the `Claude Local ACPX`
agent that previously hit "restricted environment" now successfully
reaches the Paperclip API, opens its worktree, posts structured plan
comments, and flips the issue to `in_review` without any external
configuration.
## Risks
- **Low**, scoped to the `acpx_local` adapter. The settings.local.json
write is per-worktree (worktrees live under
`.paperclip/worktrees/<issue>/`) and only triggers when `acpxAgent ===
"claude"`. Existing user content is merged with `[...existing,
...paperclip]` and deduped — nothing is overwritten outright.
- The `defaultMode` override is intentionally narrow: it only flips
`"dontAsk"` (which silently denies every tool and is the root cause) to
`"default"`. Users who explicitly picked `acceptEdits`, `plan`, or any
other mode keep their choice.
- Stderr capture goes through the existing `log-redaction` pass before
persisting, so `PAPERCLIP_API_KEY` and similar secrets in the wrapper
env don't leak into heartbeat logs.
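The merge rules described in this PR (narrow `defaultMode` override, `[...existing, ...paperclip]` with dedupe) might look something like the sketch below. It is a minimal illustration, not the adapter's implementation: `mergePaperclipSettings` and the `PaperclipAdditions` shape are hypothetical, and the use of a `permissions.allow` array for the pre-allowed Bash entries is an assumption.

```typescript
// Hypothetical sketch of the settings.local.json merge; names are illustrative.
interface ClaudePermissions {
  defaultMode?: string;
  allow?: string[]; // assumed key for pre-allowed tool patterns
  additionalDirectories?: string[];
  [key: string]: unknown;
}

interface ClaudeSettings {
  permissions?: ClaudePermissions;
  [key: string]: unknown;
}

interface PaperclipAdditions {
  allow: string[];
  additionalDirectories: string[];
}

function dedupe(values: string[]): string[] {
  return Array.from(new Set(values));
}

function mergePaperclipSettings(
  existing: ClaudeSettings,
  paperclip: PaperclipAdditions,
): ClaudeSettings {
  const perms: ClaudePermissions = existing.permissions ?? {};
  const mode = perms.defaultMode;
  return {
    ...existing,
    permissions: {
      ...perms,
      // Only the missing or "dontAsk" case is flipped; other modes are kept.
      defaultMode: mode === undefined || mode === "dontAsk" ? "default" : mode,
      // Existing user entries come first, then Paperclip's, with duplicates dropped.
      allow: dedupe([...(perms.allow ?? []), ...paperclip.allow]),
      additionalDirectories: dedupe([
        ...(perms.additionalDirectories ?? []),
        ...paperclip.additionalDirectories,
      ]),
    },
  };
}
```

Keeping the function pure (settings in, settings out) makes the `dontAsk` override and the merge order easy to unit-test separately from the file write.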
> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.
## Model Used
- Claude Opus 4.7 (`claude-opus-4-7`), running in the `claude_local`
adapter via Paperclip's harness. Extended thinking enabled, tool use
enabled.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (adapter-only)
- [ ] I have updated relevant documentation to reflect my changes — no
user-facing docs changed; internal commentary in the code change
explains the SDK constraints
- [x] I have considered and documented any risks above
- [ ] I will address all Greptile and reviewer comments before
requesting merge
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
|
||
|
|
0096b56a1c
|
[codex] Add LLM Wiki plugin host support (#5597)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The plugin system needs host contracts and runtime support before large plugins can integrate cleanly.
> - The source branch mixed the LLM Wiki package with supporting host/runtime work, managed plugin skills, root-level storage spaces, and a bookmarks reference plugin.
> - [PAP-9173](/PAP/issues/PAP-9173) asked for the current branch to be split by file boundary: plugin package separately from everything else.
> - [PAP-9188](/PAP/issues/PAP-9188) clarified that LLM Wiki may have plugin-local spaces, but Paperclip core should not reorganize top-level local storage into spaces.
> - Follow-up review clarified that the bookmarks example should not ship in this PR either.
> - This pull request contains the non-`packages/plugins/plugin-llm-wiki/` host/runtime work, keeps runtime state under the selected Paperclip instance root, and no longer includes the bookmarks example.
## What Changed
- Added/updated plugin host contracts, SDK types, worker RPC plumbing, managed plugin skill support, and related server tests.
- Removed the bookmarks example plugin package and its bundled-example/workspace references.
- Removed the root-level local spaces CLI/migration surface and restored instance-root runtime defaults for config, db, logs, storage, secrets, workspaces, projects, and adapter homes.
- Replaced shared root `space-paths` helpers with `home-paths` helpers for core runtime storage.
- Tightened stranded recovery unique-conflict detection so concurrent recovery scans reuse the raced recovery issue when Postgres errors are wrapped.
- Kept `packages/plugins/plugin-llm-wiki/` out of this PR diff; plugin-local spaces remain in the stacked plugin-only PR.
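One way to detect a unique conflict when the Postgres error has been wrapped, as the stranded-recovery item above describes, is to walk the `cause` chain looking for SQLSTATE `23505` (`unique_violation`). The sketch below is an assumption about the general technique, not the code in this PR; `isUniqueViolation` and the depth limit are hypothetical.

```typescript
// SQLSTATE 23505 is PostgreSQL's unique_violation class code.
const PG_UNIQUE_VIOLATION = "23505";

// Hypothetical helper: returns true if the error, or any error in its
// `cause` chain (up to maxDepth links), carries the unique_violation code.
function isUniqueViolation(err: unknown, maxDepth = 5): boolean {
  let current: unknown = err;
  for (let depth = 0; depth < maxDepth; depth++) {
    if (typeof current !== "object" || current === null) return false;
    const candidate = current as { code?: unknown; cause?: unknown };
    if (candidate.code === PG_UNIQUE_VIOLATION) return true;
    current = candidate.cause;
  }
  return false;
}
```

A recovery scan that loses the insert race could use a check like this to fall back to reading the issue its competitor just created, rather than treating the wrapped error as fatal.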
## Verification
- `pnpm exec vitest run cli/src/__tests__/data-dir.test.ts cli/src/__tests__/home-paths.test.ts cli/src/__tests__/onboard.test.ts packages/shared/src/home-paths.test.ts packages/db/src/runtime-config.test.ts server/src/__tests__/agent-instructions-service.test.ts server/src/__tests__/claude-local-execute.test.ts server/src/__tests__/codex-local-execute.test.ts`
- `pnpm exec vitest run packages/db/src/runtime-config.test.ts`
- `pnpm exec vitest run server/src/__tests__/plugin-routes-authz.test.ts`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm exec vitest run server/src/__tests__/heartbeat-process-recovery.test.ts -t "reuses the raced stranded recovery issue"` skipped locally because embedded Postgres did not initialize on this macOS temp host; the code path was typechecked and is covered by Linux CI.
- Boundary check: no core references remain for `PAPERCLIP_SPACE_ID`, `spaces migrate-default`, `@paperclipai/shared/space-paths`, `registerSpacesCommands`, or the removed bookmarks example.
- Previous PR head `4f23e034` had green GitHub checks: `verify`, all four serialized server shards, `e2e`, `Canary Dry Run`, `policy`, Snyk, and `Greptile Review`. Current head `582f466d` is re-running checks after the bookmarks deletion.
## Risks
- Plugin host changes touch shared runtime paths, so regressions would most likely appear in adapter startup, plugin loading, or local dev path defaults.
- Removing the bookmarks example also removes one demonstration of plugin database namespaces plus local-folder persistence; remaining plugin examples still cover bundled example discovery and plugin host flows.
- The plugin package itself is intentionally deferred to the stacked plugin-only PR, where LLM Wiki plugin-local spaces live.
- Existing installs that tested the transient root-level spaces CLI should stop using it; this PR intentionally removes that unsupported migration surface before merge.
> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`.
## Model Used
- OpenAI GPT-5 Codex via Codex CLI, tool use and local code execution enabled; context window not exposed.
## Checklist
- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass, except where noted above for host-specific embedded Postgres initialization
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge
Stacked follow-up: PR #5592 contains only `packages/plugins/plugin-llm-wiki/` and targets this branch.
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
|
||
|
|
12cb7b40fd
|
Harden remote workspace sync and restore flows (#5444)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - When an agent runs against a remote target, Paperclip syncs the
workspace out to the remote at run start and restores changes back to
the local workspace at run end
> - The previous restore flow naïvely overwrote local files with
whatever the remote returned, so files that the remote run never touched
but had timestamp/mode drift could be needlessly rewritten — and a
single static `refs/paperclip/ssh-sync/imported` ref made concurrent SSH
workspace exports race on the same git ref
> - This pull request adds a `workspace-restore-merge` module that diffs
a pre-run snapshot against the post-run remote state and only writes
back files the remote actually changed; SSH workspace exports now use a
per-import unique ref so concurrent runs can't trample each other
> - Every adapter's execute path threads the snapshot through
`prepareAdapterExecutionTargetRuntime` so the merge has the baseline it
needs
> - The benefit is workspace restores no longer churn untouched files,
and concurrent SSH runs no longer collide on the import ref
## What Changed
- `packages/adapter-utils/src/workspace-restore-merge.{ts,test.ts}`: new
module — directory snapshot (kind/mode/sha256/symlink target) plus
snapshot-aware merge that writes only the files the remote changed
- `packages/adapter-utils/src/ssh.ts`: SSH workspace export uses a
per-import unique ref (`refs/paperclip/ssh-sync/imported/<uuid>`);
restore goes through the new merge helper; `ssh-fixture.test.ts` covers
the unique-ref + merge paths
- `packages/adapter-utils/src/sandbox-managed-runtime.ts` +
`remote-managed-runtime.ts`: thread the snapshot/merge through the
sandbox and SSH paths
- `packages/adapter-utils/src/server-utils.{ts,test.ts}` +
`execution-target.ts`: helpers for capturing the pre-run snapshot;
`prepareAdapterExecutionTargetRuntime` gains required `runId` and
optional `workspaceRemoteDir`, and returns the realized
`workspaceRemoteDir`
- Each adapter's `execute.ts` (acpx, claude, codex, cursor, gemini,
opencode, pi) takes the snapshot at run start and passes it through to
the runtime restore
- Remote execute test mocks updated to match the new
`prepareWorkspaceForSshExecution` return shape and the per-run
`${managedRemoteWorkspace}` cwd subdirectory
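The per-import unique ref can be illustrated with a small helper. This is a sketch under assumptions: `makeImportRef` is a hypothetical name, and only the `refs/paperclip/ssh-sync/imported/<uuid>` shape is taken from the description above.

```typescript
import { randomUUID } from "node:crypto";

// Ref namespace from the PR description; each import gets its own leaf.
const IMPORT_REF_PREFIX = "refs/paperclip/ssh-sync/imported";

// Hypothetical helper: returns a ref name unique to one import, so two
// concurrent exports never update the same git ref.
function makeImportRef(id: string = randomUUID()): string {
  return `${IMPORT_REF_PREFIX}/${id}`;
}
```

Because the ref name is unique per import, a caller would presumably also need to delete it once the import is consumed; the description does not say how cleanup is handled.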
## Verification
- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils
--project @paperclipai/adapter-acpx-local --project
@paperclipai/adapter-claude-local --project
@paperclipai/adapter-codex-local --project
@paperclipai/adapter-cursor-local --project
@paperclipai/adapter-gemini-local --project
@paperclipai/adapter-opencode-local --project
@paperclipai/adapter-pi-local` — 196/196 passing
- `pnpm typecheck` clean across the workspace
## Risks
Medium. The restore path now writes a strict subset of what it
previously did — files the remote did not touch are no longer rewritten.
If any flow was relying on a touch-without-content-change being copied
back (timestamp or permission propagation only), that behavior is now
skipped. Snapshot capture adds an O(N-files-in-workspace) hash pass at
run start; the cost is bounded by the existing exclude list. The `runId`
parameter on `prepareAdapterExecutionTargetRuntime` is now required —
every in-tree caller is updated; out-of-tree adapter authors need to
pass it.
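The snapshot-aware restore can be pictured as a diff between the pre-run baseline and the post-run remote state, writing back only entries that differ. The sketch below works on in-memory snapshots and is not the `workspace-restore-merge` module itself: `SnapshotEntry`, `planRestore`, and the handling of deletions are assumptions made for illustration.

```typescript
import { createHash } from "node:crypto";

type EntryKind = "file" | "dir" | "symlink";

// Fields follow the description (kind/mode/sha256/symlink target).
interface SnapshotEntry {
  kind: EntryKind;
  mode: number;
  sha256?: string;
  symlinkTarget?: string;
}

type Snapshot = Map<string, SnapshotEntry>;

function sha256Hex(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

function sameEntry(a: SnapshotEntry, b: SnapshotEntry): boolean {
  return (
    a.kind === b.kind &&
    a.mode === b.mode &&
    a.sha256 === b.sha256 &&
    a.symlinkTarget === b.symlinkTarget
  );
}

// Hypothetical planner: paths to write are those the remote added or changed
// relative to the baseline; paths to remove are those the remote deleted
// (treating deletions this way is an assumption, not stated in the PR).
function planRestore(
  baseline: Snapshot,
  remote: Snapshot,
): { write: string[]; remove: string[] } {
  const write: string[] = [];
  const remove: string[] = [];
  remote.forEach((entry, path) => {
    const before = baseline.get(path);
    if (before === undefined || !sameEntry(before, entry)) write.push(path);
  });
  baseline.forEach((_entry, path) => {
    if (!remote.has(path)) remove.push(path);
  });
  return { write: write.sort(), remove: remove.sort() };
}
```

Comparing content hashes rather than timestamps is what keeps untouched files out of the write set, and the hash pass at snapshot time is the O(N) cost mentioned in the risks.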
## Model Used
Claude Opus 4.7 (1M context)
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new module +
every adapter execute path covered
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
|
||
|
|
a1b30c9f35
|
Add planning mode for issue work (#5353)
## Thinking Path
> - Paperclip is a control plane for autonomous AI companies.
> - Issues are the core unit of work, and issue comments are how board users and agents coordinate execution.
> - Some issue conversations need to produce plans and approvals instead of immediate implementation work.
> - The existing issue contract did not distinguish standard execution comments from planning-oriented issue work.
> - This pull request adds an issue work-mode contract and board UI affordances for standard vs planning mode.
> - The benefit is that planning-mode issues can be created, displayed, discussed, and carried through agent heartbeat context without losing the normal issue workflow.
## What Changed
- Added `standard` / `planning` issue work-mode contracts across DB, shared validators/types, server issue flows, plugin protocol, and adapter heartbeat payloads.
- Added an idempotent `0081_optimal_dormammu` migration for `issues.work_mode`, ordered after current `public-gh/master` migrations.
- Updated heartbeat/context summaries and issue-thread interaction behavior so planning work mode is preserved when creating suggested follow-up issues.
- Added UI support for planning-mode issue creation, issue rows, detail composer styling, and composer work-mode toggles.
- Added focused server/shared/UI tests plus a Playwright visual verification spec for planning-mode surfaces.
- Rebased the branch onto current `public-gh/master` and added durable planning-mode screenshots under `doc/assets/pap-3368/`.
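A `standard` / `planning` contract with a backward-compatible default, as this PR's risk notes describe for payloads that omit the field, could be sketched like this. `normalizeWorkMode` is a hypothetical name, and rejecting unknown strings (rather than defaulting them) is an assumption made for the example.

```typescript
// The two modes named in the PR description.
type IssueWorkMode = "standard" | "planning";

// Hypothetical normalizer: a missing value defaults to "standard" so older
// payloads stay compatible; an unrecognized value is rejected (assumption).
function normalizeWorkMode(value: unknown): IssueWorkMode {
  if (value === undefined || value === null) return "standard";
  if (value === "standard" || value === "planning") return value;
  throw new Error(`Unknown issue work mode: ${String(value)}`);
}

// Hypothetical follow-up helper: a suggested follow-up inherits the
// parent's mode, so planning work stays in planning mode.
function followUpWorkMode(parent: { workMode?: unknown }): IssueWorkMode {
  return normalizeWorkMode(parent.workMode);
}
```

Centralizing the default in one function means the DB column, plugin payloads, and heartbeat payloads can all agree on what an absent value means.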
## Verification
- `pnpm --filter @paperclipai/db run check:migrations`
- `pnpm exec vitest run --project @paperclipai/shared packages/shared/src/validators/issue.test.ts`
- `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/heartbeat-context-summary.test.ts server/src/__tests__/issue-thread-interactions-service.test.ts server/src/__tests__/issues-goal-context-routes.test.ts --pool=forks --poolOptions.forks.isolate=true`
- `pnpm exec vitest run --project @paperclipai/ui ui/src/components/IssueChatThread.test.tsx ui/src/components/NewIssueDialog.test.tsx ui/src/components/IssueRow.test.tsx ui/src/pages/IssueDetail.test.tsx`
- `pnpm exec vitest run --project @paperclipai/adapter-utils packages/adapter-utils/src/server-utils.test.ts`
- `PAPERCLIP_E2E_SKIP_LLM=true npx playwright test --config tests/e2e/playwright.config.ts tests/e2e/planning-mode-visual-verification.spec.ts`
## Screenshots
Desktop planning detail:
Desktop planning row:
Desktop staged standard toggle:
Mobile planning detail:
Mobile planning row:
## Risks
- Medium migration risk: this adds a non-null issue column. The migration uses `ADD COLUMN IF NOT EXISTS` so installations that applied an older branch-local migration number can still apply the final numbered migration safely.
- Medium contract risk: issue payloads, plugin payloads, and adapter heartbeat payloads now include work mode; compatibility is handled by defaulting missing values to `standard`.
- UI risk is moderate because composer controls changed; focused component tests and visual e2e coverage exercise standard vs planning display and toggle behavior.
> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`.
## Model Used
- OpenAI Codex, GPT-5 coding agent in a local Paperclip worktree, with shell/tool use.
Exact context-window size is not exposed in this runtime.
## Checklist
- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
|
||
|
|
11ffd6f2c5
|
Improve ACPX adapter configuration (#5290)
## Thinking Path
> - Paperclip orchestrates AI agents across several adapter implementations.
> - ACPX is a local adapter path that can proxy Claude and Codex-style execution.
> - Its configuration needed stronger schema defaults, provider-aware model handling, and better UI support.
> - Plugin authors also need clear docs for managed resources.
> - This pull request improves ACPX adapter configuration and documents plugin-managed resources.
> - The benefit is a more predictable adapter setup path without changing unrelated control-plane behavior.
## What Changed
- Improved ACPX config schema, execution config handling, UI build config, and route coverage.
- Added ACPX model filtering support and tests.
- Updated the agent config form and storybook coverage for ACPX model/provider behavior.
- Expanded plugin authoring documentation for managed resources.
## Verification
- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run server/src/__tests__/acpx-local-execute.test.ts server/src/__tests__/adapter-routes.test.ts ui/src/lib/acpx-model-filter.test.ts`
## Risks
- Low-to-medium risk: adapter configuration behavior changes can affect ACPX users, but the change is isolated to ACPX/plugin-doc surfaces and covered by targeted adapter tests.
## Model Used
- OpenAI GPT-5 Codex via Paperclip `codex_local` adapter, with shell/git/GitHub CLI tool use.
## Checklist
- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
|
||
|
|
856c6cb192
|
Fix remote workspace environment shaping (#5118)
> **Stacked PR (part 5 of 7).** Depends on:
> - PR #5114
> - PR #5115
> - PR #5116
> - PR #5117
>
> Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one.
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run with a Paperclip-shaped environment (`PAPERCLIP_WORKSPACE_CWD`, worktree path, `PAPERCLIP_WORKSPACES_JSON` hints) so the CLI can locate the correct project tree
> - SSH testing reproduced a real failure: a Codex SSH run wrote to `/tmp/paperclip-env-matrix-...` (the *host* path) instead of the realized remote workspace at `/home/<user>/paperclip-env-matrix-ssh-claude/...` because the adapter injected `PAPERCLIP_WORKSPACE_CWD=/tmp/...` into the remote env
> - Code review on the initial codex-only fix asked to roll the same approach into every other SSH-capable adapter (claude, acpx, cursor, opencode, gemini, pi) via a shared helper rather than duplicating per-adapter
> - This PR adds `shapePaperclipWorkspaceEnvForExecution` in adapter-utils that, when the execution target is remote: replaces local cwd with the realized execution cwd, nulls out worktree path (which has no remote meaning), and rewrites/strips `cwd` entries in workspace hints based on what was actually synced.
Every adapter calls it before invoking the remote runner
> - The benefit is that remote runs see the realized remote workspace, host-local paths stop leaking into remote env, and the rule is unit-tested in one place
## What Changed
- Added `shapePaperclipWorkspaceEnvForExecution` to `packages/adapter-utils/src/server-utils.ts` with full unit coverage (`server-utils.test.ts`)
- Each of acpx-local, claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local now calls the new shaper before issuing the remote command and feeds the shaped values into `applyPaperclipWorkspaceEnv`
- Per-adapter `execute.remote.test.ts` files extended to cover the new shaping behaviour: localhost paths replaced with remote cwd, foreign-cwd hints stripped, worktree path nulled out for remote targets
- `acpx-local/src/server/execute.test.ts` extended with shaping coverage
## Verification
- `pnpm test -- server-utils execute.remote`
- `pnpm --filter @paperclipai/adapter-acpx-local test`
- Manual QA reproducing the original failure:
  1. Provision an E2B sandbox environment for the Paperclip QA company
  2. Assign an issue to a remote-targeted claude-local agent and confirm the run starts in the correct remote cwd (no `/Users/...` path leakage in the run logs)
  3. Repeat for opencode-local and pi-local
## Risks
- Behavioural shift: hints whose `cwd` doesn't match the workspace cwd are now stripped on remote targets. If any adapter relied on a leaked local hint cwd, it will see a missing `cwd` instead. Reviewed all current callers — none do.
- Adds a small per-run cost (path resolve + string normalisation) on every remote execution. Negligible.
- Worktree path is now nulled out on remote (it has no meaning there). Adapters that previously read the value defensively will continue to work.
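The shaping rule described in this PR can be sketched as a pure function. The sketch is illustrative only and does not reproduce the real `shapePaperclipWorkspaceEnvForExecution` signature: `shapeWorkspaceEnvSketch`, `WorkspaceEnvInput`, `WorkspaceHint`, and `ExecutionTarget` are hypothetical, and exact string equality stands in for whatever path normalisation the helper actually performs.

```typescript
// Hypothetical shapes for illustration.
interface WorkspaceHint {
  workspaceId: string;
  cwd?: string;
}

interface WorkspaceEnvInput {
  workspaceCwd: string;
  worktreePath: string | null;
  hints: WorkspaceHint[];
}

interface ExecutionTarget {
  kind: "local" | "remote";
  remoteCwd?: string;
}

function shapeWorkspaceEnvSketch(
  input: WorkspaceEnvInput,
  target: ExecutionTarget,
): WorkspaceEnvInput {
  // Local targets keep their environment unchanged.
  if (target.kind !== "remote" || target.remoteCwd === undefined) return input;
  const remoteCwd = target.remoteCwd;
  return {
    // Replace the host-local cwd with the realized remote cwd.
    workspaceCwd: remoteCwd,
    // A worktree path has no meaning on the remote host.
    worktreePath: null,
    hints: input.hints.map((hint): WorkspaceHint => {
      if (hint.cwd === undefined) return hint;
      // The synced workspace's hint is rewritten to the remote cwd.
      if (hint.cwd === input.workspaceCwd) return { ...hint, cwd: remoteCwd };
      // Any other local cwd was never synced, so it is stripped.
      return { workspaceId: hint.workspaceId };
    }),
  };
}
```

Keeping the rule in one pure helper is what lets every adapter share it and lets the behaviour be unit-tested without an SSH fixture.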
## Model Used
- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR
## Checklist
- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge
|
||
|
|
b02e67cea5
|
fix(ci): diff PR workflow paths from merge base (#4903)
## Thinking Path
> - Paperclip’s PR workflow is part of the control-plane safety surface because it decides whether a branch is allowed to merge.
> - This issue started in that workflow: the lockfile and manifest policy checks were diffing `base.sha..head.sha`, which incorrectly treated unrelated `master` commits as if they belonged to the PR branch.
> - The right fix there is to diff from the PR merge base (`base...head`) so policy checks only evaluate files introduced by the branch itself.
> - Once that workflow fix was in place, `/checkpr` exposed a second blocker on the PR merge ref: `verify` was failing in newer `master`-side tests that were not part of the original branch diff.
> - The actionable repeated failure came from the ACPX local adapter test suite, where a test hard-coded the managed Codex home under `instances/default` even though the stable Vitest runner sets a non-default `PAPERCLIP_INSTANCE_ID`.
> - This pull request now includes both the original CI diff-scope fix and the targeted ACPX test fix so the PR’s actual checks align with current base-branch execution.
> - The benefit is that the original false-positive lockfile failure is removed, and the merge-ref verify path is hardened against the instance-id isolation used in CI.
## What Changed
- Updated `.github/workflows/pr.yml` so the lockfile policy and manifest policy steps diff `pull_request.base.sha...pull_request.head.sha` from the merge base instead of using a two-dot base/head diff.
- Added an inline workflow comment explaining why the three-dot diff is required for PR-scoped file detection.
- Updated `packages/adapters/acpx-local/src/server/execute.test.ts` so the managed Codex home assertion uses a test-specific `PAPERCLIP_INSTANCE_ID` instead of hard-coding `default`.
- Restored `PAPERCLIP_INSTANCE_ID` after that ACPX test finishes so the test remains isolated and does not leak process env changes.
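The difference between the two diff forms can be shown with a small wrapper around `git diff --name-only`. With two dots, git compares the two commits directly, so files changed on `master` since the branch point show up; with three dots, git compares the head against the merge base of the two commits. The helper names below (`prDiffArgs`, `changedFilesInPr`) are hypothetical and the workflow itself runs in YAML/shell, so this is only a sketch of the same idea.

```typescript
import { execFileSync } from "node:child_process";

// Hypothetical helper: builds the git arguments for a PR-scoped file list.
// `base...head` (three dots) diffs head against merge-base(base, head).
function prDiffArgs(baseSha: string, headSha: string): string[] {
  return ["diff", "--name-only", `${baseSha}...${headSha}`];
}

// Hypothetical helper: runs git in `cwd` and returns the changed paths.
function changedFilesInPr(
  baseSha: string,
  headSha: string,
  cwd: string = process.cwd(),
): string[] {
  const output = execFileSync("git", prDiffArgs(baseSha, headSha), {
    cwd,
    encoding: "utf8",
  });
  return output.split("\n").filter((line) => line.length > 0);
}
```

A policy check could then test whether `pnpm-lock.yaml` appears in the returned list, which only happens when the branch itself touched the lockfile.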
## Verification

- Reproduced the original false positive locally by comparing PR heads `#4901` and `#4902` with the old `base..head` logic; both incorrectly included `pnpm-lock.yaml` from unrelated `master` commits.
- Verified the new `base...head` logic reduces those PRs to only their actual changed files and excludes `pnpm-lock.yaml`.
- Verified a real manifest-changing PR (`#4893`) still reports `package.json` changes under the new logic.
- Ran `pnpm -r typecheck` successfully.
- Ran `pnpm vitest run packages/adapters/acpx-local/src/server/execute.test.ts` successfully after the ACPX test fix.
- Ran `pnpm vitest run packages/db/src/backup-lib.test.ts` successfully against the merge-ref-related DB failure path observed during `/checkpr`.
- Pushed commit `9520a976` and allowed PR `#4903` checks to rerun on the updated branch.

## Risks

- Low risk: the workflow change only affects how PR policy checks determine the changed file set.
- Low risk: the ACPX change is test-only and aligns the test with the instance-isolation behavior already used by `scripts/run-vitest-stable.mjs` in CI.
- The remaining operational risk is limited to other unrelated merge-ref-only failures that were not reproduced in the targeted local verification above.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected; check the roadmap first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, `gpt-5-codex`, via the Codex local adapter in Paperclip.
- Tool-using coding model with shell execution, git, GitHub CLI, and repository inspection in a local worktree.
- Context included the current repo, the Paperclip task thread, PR check output, and the isolated execution workspace.
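
The set-then-restore handling of `PAPERCLIP_INSTANCE_ID` in the ACPX test fix could look roughly like the sketch below. Both `withEnv` and `instanceDir` are hypothetical helpers written for illustration, and the `instances/<id>` layout with a `default` fallback is an assumption drawn only from the PR text; the actual test in `execute.test.ts` may be structured differently.

```typescript
// Hypothetical helper: run `fn` with an env var set, then restore the prior value.
function withEnv<T>(name: string, value: string, fn: () => T): T {
  const previous = process.env[name];
  process.env[name] = value;
  try {
    return fn();
  } finally {
    // Restore exactly: unset if it was unset, otherwise put the old value back.
    if (previous === undefined) {
      delete process.env[name];
    } else {
      process.env[name] = previous;
    }
  }
}

// Hypothetical path builder; the real adapter's directory layout may differ.
function instanceDir(): string {
  return `instances/${process.env.PAPERCLIP_INSTANCE_ID ?? "default"}`;
}

const previousId = process.env.PAPERCLIP_INSTANCE_ID;
const scopedDir = withEnv("PAPERCLIP_INSTANCE_ID", "test-instance", instanceDir);

console.log(scopedDir); // instances/test-instance
console.log(process.env.PAPERCLIP_INSTANCE_ID === previousId); // true
```

Restoring in a `finally` block means the variable is reset even if the assertion inside `fn` throws, so a failing test does not leak the override into later tests.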
## Checklist

- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
4272c1604d
|
Add ACPX local adapter runtime (#4893)
## Thinking Path

> - Paperclip orchestrates AI-agent companies through a control plane that can start, supervise, and recover agent runs.
> - Local adapters are the bridge between Paperclip issues and concrete agent runtimes such as Claude, Codex, and other ACP-compatible tools.
> - The roadmap calls out broader “bring your own agent” and claw-style agent support, and ACPX gives Paperclip one path to normalize multiple ACP agents behind a single adapter.
> - The branch needed to become one reviewable PR against current `paperclipai/paperclip:master`, without carrying stale base conflicts or generated lockfile churn.
> - This pull request adds an experimental built-in `acpx_local` adapter, integrates it through the server/CLI/UI adapter surfaces, and adds regression coverage for runtime execution, skill sync, stream parsing, diagnostics, and log redaction.
> - The benefit is that Paperclip can run Claude/Codex/custom ACP agents through ACPX while keeping operator configuration, skills, logging, and transcript rendering inside the existing adapter model.

## What Changed

- Added `@paperclipai/adapter-acpx-local` with server execution, config schema, ACPX session handling, CLI formatting, UI config helpers, and stdout parsing.
- Registered `acpx_local` across CLI, server, shared constants, UI adapter metadata, adapter capabilities, and agent creation/editing surfaces.
- Added ACPX runtime execution support with persistent sessions, local-agent JWT environment handling, skill snapshots, runtime skill materialization, and isolation/security regressions.
- Added ACPX adapter diagnostics and marked the adapter experimental in the UI.
- Added command/env secret redaction for resolved command metadata in adapter-utils, server event storage, and the Agent Detail invocation UI.
- Added Storybook coverage for ACPX config, transcript rendering, and skill states, plus PR screenshots under `docs/pr-screenshots/pap-2944/`.
- Rebased the branch onto current `public-gh/master`; `pnpm-lock.yaml` is intentionally not included and there are no migration/schema changes.

## Verification

- `pnpm exec vitest run packages/adapters/acpx-local/src/server/execute.test.ts packages/adapters/acpx-local/src/server/test.test.ts packages/adapters/acpx-local/src/cli/format-event.test.ts packages/adapters/acpx-local/src/ui/parse-stdout.test.ts packages/adapter-utils/src/server-utils.test.ts server/src/__tests__/redaction.test.ts server/src/__tests__/acpx-local-execute.test.ts server/src/__tests__/acpx-local-skill-sync.test.ts server/src/__tests__/acpx-local-adapter-environment.test.ts server/src/__tests__/adapter-routes.test.ts server/src/__tests__/agent-skills-routes.test.ts ui/src/adapters/metadata.test.ts`: 12 files, 87 tests passed.
- `pnpm --filter @paperclipai/adapter-acpx-local typecheck`: passed.
- `pnpm --filter @paperclipai/server typecheck`: passed.
- `pnpm --filter @paperclipai/ui typecheck`: passed.
- Confirmed PR diff does not include `pnpm-lock.yaml`, database schema files, or migrations.

Screenshots: see `docs/pr-screenshots/pap-2944/`.

## Risks

- Medium risk: this introduces a new built-in adapter package and touches runtime execution, adapter registration, agent config, skills, and transcript rendering.
- ACPX and ACP agent behavior can vary by installed tool versions; the adapter is marked experimental to set operator expectations.
- `pnpm-lock.yaml` is excluded per repository PR policy, so dependency lock refresh must be handled by the repo’s automation or maintainers.
- No database migration risk: no schema or migration files changed.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected; check the roadmap first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with repository tool use, shell execution, git operations, and local verification. Exact hosted context window was not exposed in this environment.

## Checklist

- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing> |