paperclip

mirror of https://github.com/alkimake/paperclip.git synced 2026-06-14 01:50:39 +09:00

Author	SHA1	Message	Date
Dotta	a3de1d764d	Add cheap model profiles for local adapters (#4881 ) ## Thinking Path > - Paperclip is a control plane for autonomous AI companies, where adapters are the boundary between the board, agents, and execution runtimes. > - Local adapters currently expose a primary runtime configuration, but operators often need a cheaper model lane for routine or low-risk work. > - That cheap lane has to stay adapter-owned: runtime profile settings should not mutate the primary adapter config or bypass existing auth/secret mediation. > - Issue creation also needs an ergonomic way to request primary, cheap, or custom model behavior for a selected assignee. > - This pull request adds a first-class `cheap` model profile contract across adapter capabilities, heartbeat config resolution, agent configuration, and issue creation. > - The benefit is cheaper task execution can be configured and requested explicitly while preserving adapter boundaries, secret handling, and audit visibility. ## What Changed - Added adapter model-profile capability metadata and a `cheap` profile contract for supported local adapters. - Applied `runtimeConfig.modelProfiles.cheap.adapterConfig` during heartbeat config resolution, including requested/applied/fallback run metadata. - Added agent configuration UI for cheap model profile settings without writing those settings into primary `adapterConfig`. - Added New Issue assignee model lane controls for Primary / Cheap / Custom and request payload handling. - Added run ledger profile badges and Storybook stories for the new cheap-lane UI states. - Added tests for validators, heartbeat model profile application, permission/secret mediation, UI payload helpers, and run ledger rendering. - Added committed UI verification screenshots under `docs/pr-screenshots/pap-2837/`. - Addressed Greptile review feedback around cheap-profile defaults, shared profile types, and fallback test data. 
## Verification Local: - `pnpm exec vitest run packages/shared/src/validators/issue.test.ts server/src/__tests__/adapter-registry.test.ts server/src/__tests__/agent-permissions-routes.test.ts server/src/__tests__/heartbeat-model-profile.test.ts ui/src/components/IssueRunLedger.test.tsx ui/src/lib/agent-config-patch.test.ts ui/src/lib/issue-assignee-overrides.test.ts ui/src/lib/new-agent-runtime-config.test.ts` — passed, 8 files / 103 tests. - `pnpm exec vitest run ui/src/lib/new-agent-runtime-config.test.ts ui/src/components/IssueRunLedger.test.tsx` — passed after Greptile/rebase follow-up, 2 files / 17 tests. - `pnpm --filter @paperclipai/ui typecheck` — passed after Greptile/rebase follow-up. - `pnpm -r typecheck` — passed. - `pnpm build` — passed. - `pnpm test:run` — did not complete successfully in this local worktree: it stopped in pre-existing `@paperclipai/adapter-utils` sandbox/SSH fixture suites outside this PR diff. Failures were 5s local timeouts plus `git init -b` unsupported by this machine's Git 2.21.0. The branch-specific targeted suites above passed. - Branch was fetched/rebased onto `public-gh/master`; `git rev-list --left-right --count public-gh/master...HEAD` reports `0 9`. Remote PR checks on latest head `e30bf399146451c86cee98ed528d51d33fa5af5a`: - `policy` — passed. - `verify` — passed. - `e2e` — passed. - `Greptile Review` — passed, confidence score 5/5; Greptile review threads resolved. - `security/snyk (cryppadotta)` — passed. 
Screenshots: - [New issue cheap lane desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-cheap-desktop.png) - [New issue custom lane desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-custom-desktop.png) - [New issue unsupported adapter desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-unsupported-desktop.png) - [Run ledger model profile badges desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/runledger-profile-badges-desktop.png) - Mobile variants are also in `docs/pr-screenshots/pap-2837/`. ## Risks - Medium: heartbeat config mediation now merges runtime model profiles into adapter configs, so adapter secret normalization and host-command restrictions must keep covering nested config paths. - Medium: the UI adds another issue creation choice; unsupported adapters must keep hiding the cheap lane and preserve primary behavior. - Low migration risk: no database migration is included. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used OpenAI Codex coding agent using GPT-5-class reasoning with repo tool use and command execution. Exact served model/context window was not exposed by the runtime. 
## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [ ] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-30 15:32:04 -05:00
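The cheap-lane resolution described in #4881 could look roughly like the sketch below. It is an illustration under stated assumptions, not the repository's implementation: `resolveAdapterConfig`, the type names, the `adapterSupportsCheap` flag, and the shallow merge are all hypothetical. Only the `runtimeConfig.modelProfiles.cheap.adapterConfig` path and the requested/applied/fallback metadata idea come from the PR text.

```typescript
// Hypothetical types. Only the modelProfiles.cheap.adapterConfig path is taken from the PR description.
type AdapterConfig = Record<string, unknown>;
type ModelProfileKey = "primary" | "cheap";

interface RuntimeConfig {
  modelProfiles?: { cheap?: { adapterConfig?: AdapterConfig } };
}

interface ResolvedProfile {
  requested: ModelProfileKey; // what the issue or run asked for
  applied: ModelProfileKey;   // what was actually used
  fallback: boolean;          // true when the request could not be honored
  adapterConfig: AdapterConfig;
}

// Assumption: a shallow merge of the cheap overrides over a copy of the primary config.
// The primary config object is never mutated.
function resolveAdapterConfig(
  primary: AdapterConfig,
  runtimeConfig: RuntimeConfig,
  requested: ModelProfileKey,
  adapterSupportsCheap: boolean,
): ResolvedProfile {
  const cheapOverrides = runtimeConfig.modelProfiles?.cheap?.adapterConfig;
  if (requested === "cheap" && adapterSupportsCheap && cheapOverrides) {
    return {
      requested,
      applied: "cheap",
      fallback: false,
      adapterConfig: { ...primary, ...cheapOverrides },
    };
  }
  // Unsupported adapter or missing cheap profile: keep primary behavior and record the fallback.
  return {
    requested,
    applied: "primary",
    fallback: requested !== "primary",
    adapterConfig: { ...primary },
  };
}
```

Returning a fresh object in both branches keeps the profile settings out of the primary `adapterConfig`, which is the boundary the PR describes.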
Dotta	1fe1067361	Polish board settings and skills workflow (#4863 ) ## Thinking Path > - Paperclip's board UI and bundled skills are the operator layer for configuring agents, routines, issue workflows, and local troubleshooting loops. > - The prior rollup mixed this operator polish with database backups, backend reliability, thread scale, and cost/workflow primitives. > - This pull request isolates the remaining board QoL, settings, issue-detail integration, adapter config cleanup, and skills smoke tooling. > - It includes some integration-level overlap with the thread and workflow slices so this branch can run from `origin/master` while still preserving the full original work. > - Preferred merge order is the narrower primitives first, then this integration PR last. > - The benefit is that reviewers can inspect the user-facing board/settings/skills layer separately from backend infrastructure changes. ## What Changed - Added board/settings polish for agents, routines, company settings, project workspace detail, and issue detail controls. - Added agent/routine UI regression tests and New Issue dialog coverage. - Integrated issue-detail activity/cost/interaction surfaces and leaf work pause/resume controls. - Cleaned bundled adapter UI config defaults and onboarding copy. - Added terminal-bench loop and work-stoppage diagnosis skills plus a smoke test script. - Updated attachment type handling and Paperclip skill/API guidance. ## Verification - `pnpm install --frozen-lockfile` - `pnpm exec vitest run ui/src/pages/Agents.test.tsx ui/src/pages/Routines.test.tsx ui/src/components/NewIssueDialog.test.tsx ui/src/pages/IssueDetail.test.tsx server/src/__tests__/costs-service.test.ts server/src/__tests__/issue-thread-interaction-routes.test.ts server/src/__tests__/issue-thread-interactions-service.test.ts` - Result: 7 test files passed, 54 tests passed. - `pnpm run smoke:terminal-bench-loop-skill` - Result: JSON output included `"ok": true` and `"cleanup": true`. 
- UI screenshots not included because verification is focused component/page coverage for the changed board surfaces. ## Risks - This is the integration-heavy PR in the split and intentionally overlaps some component/API primitives with the issue-thread and workflow PRs so it can run from `origin/master`. - Preferred merge order: #4859, #4860, #4861, #4862, then this PR last. If earlier branches merge first, this PR may need a straightforward conflict refresh in shared UI files. - The terminal-bench smoke script creates temporary mock issues and relies on cleanup; the verified run returned `cleanup: true`. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5.5, code execution and GitHub CLI tool use, medium reasoning effort. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-30 15:28:11 -05:00
Devin Foley	a4ac6ff133	Add sandbox callback bridge for remote environment API access (#4801 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, which are isolated from the host network > - Sandboxed agents need to call back to the Paperclip API to report progress, post comments, and update issue status > - But sandbox environments cannot reach the Paperclip server directly because they run in isolated network namespaces > - This PR adds a callback bridge that proxies API requests from the sandbox to the Paperclip server, running as a local HTTP server on the host that forwards authenticated requests > - The bridge is started automatically when an adapter launches a sandbox execution, and torn down when the run completes > - The benefit is sandboxed agents can interact with the Paperclip API without requiring network-level access to the host, enabling E2B and similar providers to work end-to-end ## What Changed - Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a lightweight HTTP bridge server that accepts requests from sandbox environments and proxies them to the Paperclip API with authentication - Added request validation and security policy: the bridge only forwards requests to the configured API URL, validates content types, enforces size limits, and rejects non-API paths - Wired the bridge into all remote adapter execute paths (claude, codex, cursor, gemini, pi) — the bridge starts before the agent process and the bridge URL is passed via environment variables - Updated `environment-execution-target.ts` to prefer the explicit API URL from environment lease metadata for sandbox callback routing - Fixed Claude sandbox runtime setup to work with the bridge configuration - Added comprehensive test coverage for bridge request handling, policy enforcement, and sandbox execution integration - Fixed browser bundling — the bridge module is excluded from the frontend 
bundle via the adapter-utils index export ## Verification - `pnpm test` — all existing and new tests pass, including bridge unit tests and sandbox execution integration tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run an agent task, verify the agent can post comments and update issue status through the bridge ## Risks - Medium. This is a new network-facing component (HTTP server on localhost). The security policy restricts forwarding to the configured API URL only and validates all requests, but any proxy introduces attack surface. The bridge binds to localhost only and is scoped to the lifetime of a single agent run. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-29 16:37:34 -07:00
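The request policy described in #4801 (reject non-API paths, validate content types, enforce size limits) can be sketched as a pure decision function. Everything here is an assumption made for illustration: `checkBridgeRequest`, the `BridgePolicy` fields, and the chosen HTTP status codes are hypothetical and are not taken from `sandbox-callback-bridge.ts`.

```typescript
// Hypothetical shapes for illustration only.
interface BridgeRequest {
  path: string;
  contentType?: string;
  bodyBytes: number;
}

interface BridgePolicy {
  apiPathPrefix: string;         // e.g. "/api/" (assumed)
  allowedContentTypes: string[]; // lowercase media types
  maxBodyBytes: number;
}

type PolicyDecision =
  | { allowed: true }
  | { allowed: false; status: number; reason: string };

function checkBridgeRequest(req: BridgeRequest, policy: BridgePolicy): PolicyDecision {
  // Reject anything outside the API prefix, including ".." segments that could escape it.
  if (!req.path.startsWith(policy.apiPathPrefix) || req.path.split("/").includes("..")) {
    return { allowed: false, status: 404, reason: "non-API path" };
  }
  if (req.bodyBytes > policy.maxBodyBytes) {
    return { allowed: false, status: 413, reason: "body too large" };
  }
  // Only requests that carry a body need a content type check.
  if (req.bodyBytes > 0) {
    const [mediaType = ""] = (req.contentType ?? "").split(";");
    if (!policy.allowedContentTypes.includes(mediaType.trim().toLowerCase())) {
      return { allowed: false, status: 415, reason: "unsupported content type" };
    }
  }
  return { allowed: true };
}
```

Keeping the policy separate from the HTTP server makes it easy to unit-test the rejection cases without opening a socket.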
Devin Foley	f9cf1d2f6a	Add cursor sandbox support and fix SSH workspace sync (#4803 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, or on remote hosts via SSH > - The cursor adapter needs to resolve `cursor-agent` inside sandbox environments where it's installed in `~/.local/bin` > - But when using the default `agent` command on a sandbox target, the adapter didn't know to look in `~/.local/bin/cursor-agent`, causing "command not found" failures > - Additionally, repeated SSH runs failed because `git checkout` during workspace sync conflicted with leftover `.paperclip-runtime` files from previous runs > - This PR adds sandbox-aware command resolution for cursor and fixes the SSH workspace sync conflict > - The benefit is cursor works in E2B sandboxes out of the box, and repeated SSH runs don't fail on workspace sync ## What Changed - `cursor-local`: Added `prepareCursorSandboxCommand` — on sandbox targets, reads the remote `$HOME`, prepends `~/.local/bin` to PATH, and prefers `~/.local/bin/cursor-agent` when the default command is requested; tightened the sandbox command probe to validate the binary exists before launching; preserves explicit custom command overrides - `adapter-utils/ssh.ts`: Added `--force` to git checkout in SSH workspace sync to handle `.paperclip-runtime` untracked file conflicts from previous runs ## Verification - `pnpm test` — all existing and new tests pass, including cursor sandbox probe, sandbox execution, and custom command override tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run a cursor-local task, verify it resolves cursor-agent from the sandbox install path ## Risks - Low-medium. The `--force` flag on git checkout could discard uncommitted changes in the remote workspace, but the workspace is managed by Paperclip and should not contain user edits. ## Model Used Codex GPT 5.4 high via Paperclip. 
## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-29 16:12:06 -07:00
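The sandbox command resolution described in #4803 might be sketched as below. This is not the body of `prepareCursorSandboxCommand`: the function name `resolveSandboxCursorCommand`, its parameters, and its return shape are hypothetical, and the binary-existence probe mentioned in the PR is omitted because it needs a remote shell. The default `agent` command, the `~/.local/bin` prefix, and the `cursor-agent` binary name come from the PR text.

```typescript
// The default command name "agent" is taken from the PR description.
const DEFAULT_CURSOR_COMMAND = "agent";

interface SandboxCommand {
  command: string;
  path: string; // PATH value to export in the sandbox
}

// Hypothetical helper: remoteHome and remotePath are assumed to have been read from the sandbox already.
function resolveSandboxCursorCommand(
  requested: string,
  remoteHome: string,
  remotePath: string,
): SandboxCommand {
  const localBin = `${remoteHome.replace(/\/+$/, "")}/.local/bin`;
  // Prepend ~/.local/bin and drop empty or duplicate entries.
  const rest = remotePath.split(":").filter((entry) => entry.length > 0 && entry !== localBin);
  const path = [localBin, ...rest].join(":");
  // Explicit custom commands are preserved; only the default is redirected.
  const command = requested === DEFAULT_CURSOR_COMMAND ? `${localBin}/cursor-agent` : requested;
  return { command, path };
}
```

A caller would still need to confirm that the resolved binary exists before launching, as the PR notes.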
Devin Foley	9b99d30330	Add dedicated environment settings page and test-in-environment (#4798 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents run inside environments (local, SSH, E2B sandbox) > - Operators need to configure and manage these environments > - But environment settings were buried inside the general company settings page, making them hard to find > - Additionally, when testing an agent from the configuration form, the test always ran locally regardless of which environment was selected > - This PR moves environments into a dedicated top-level company settings section and wires the "Test Environment" button to run inside the selected environment > - The benefit is operators can find and manage environments more easily, and the test button now validates the actual environment the agent will use ## What Changed - Added a dedicated `CompanyEnvironments` settings page with its own route and sidebar entry - Updated `CompanySettingsSidebar` and `CompanySettingsNav` to include the new environments section - Modified the agent test route (`POST /agents/:id/test`) to accept an optional `environmentId` parameter - Updated all adapter `test.ts` handlers to resolve and use the specified execution target environment - Added `resolveTestExecutionTarget` to `execution-target.ts` for remote environment test resolution with cwd fallback - Moved the "Test Environment" button and its feedback display into the `NewAgent` page footer for better UX flow ## Verification - `pnpm test` — all existing and new tests pass - `pnpm typecheck` — clean - Manual: navigate to Company Settings, confirm "Environments" appears as a top-level section - Manual: configure an agent with a non-local environment, click "Test Environment", confirm the test runs inside that environment ## Risks - Low risk. UI-only routing change for the settings page. 
The test-in-environment change adds an optional parameter with a local fallback, so existing behavior is preserved when no environment is specified. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-29 15:56:13 -07:00
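The optional `environmentId` with local fallback from #4798 can be illustrated with a small sketch. The names `resolveTestTargetSketch`, `ExecutionTarget`, and `EnvironmentRecord` are hypothetical, and both the error on an unknown ID and the reading of "cwd fallback" as "use the local cwd when the environment has none" are assumptions. The actual `resolveTestExecutionTarget` may behave differently.

```typescript
// Hypothetical types for illustration only.
type ExecutionTarget =
  | { kind: "local"; cwd: string }
  | { kind: "remote"; environmentId: string; cwd: string };

interface EnvironmentRecord {
  id: string;
  defaultCwd?: string;
}

function resolveTestTargetSketch(
  environmentId: string | undefined,
  environments: EnvironmentRecord[],
  localCwd: string,
): ExecutionTarget {
  // No environment specified: preserve the previous behavior and test locally.
  if (!environmentId) {
    return { kind: "local", cwd: localCwd };
  }
  const env = environments.find((candidate) => candidate.id === environmentId);
  if (!env) {
    // Assumption: an unknown ID is an error rather than a silent local fallback.
    throw new Error(`Unknown environment: ${environmentId}`);
  }
  return { kind: "remote", environmentId: env.id, cwd: env.defaultCwd ?? localCwd };
}
```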
Devin Foley	d47ffa87f0	Fix CEO AGENT_HOME paths and centralize workspace env propagation (#4551 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The local adapter layer is responsible for turning Paperclip runtime context into the environment seen by the child agent process. > - The CEO onboarding bundle tells the agent where to read and write its persistent memory and fact files. > - That bundle was using `./memory/...` and `./life/...`, which only works when the process cwd happens to equal the agent home directory. > - At the same time, six local adapters each duplicated the same workspace-env propagation logic, including `AGENT_HOME`, which makes this contract easy to drift. > - This pull request fixes the CEO instructions to use `$AGENT_HOME/...` and centralizes workspace-env propagation in one shared helper with shared tests. > - The benefit is a real bug fix for agent memory paths plus a single tested contract that makes future built-in adapter work less likely to forget `AGENT_HOME`. ## What Changed - Updated `server/src/onboarding-assets/ceo/HEARTBEAT.md` to use `$AGENT_HOME/memory/...` and `$AGENT_HOME/life/...` instead of cwd-relative `./memory/...` and `./life/...`. - Added `applyPaperclipWorkspaceEnv(...)` in `packages/adapter-utils/src/server-utils.ts` to centralize `PAPERCLIP_WORKSPACE_*` and `AGENT_HOME` propagation. - Added shared helper coverage in `packages/adapter-utils/src/server-utils.test.ts` for both populated and skip-empty cases. - Switched the built-in local adapters (`claude_local`, `codex_local`, `cursor_local`, `gemini_local`, `opencode_local`, `pi_local`) over to the shared helper instead of inline env assignment blocks. 
## Verification - `pnpm install` - `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/codex-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - Result: 7 test files passed, 31 tests passed, 0 failures. ## Risks - Low risk. - The only behavioral surface is the shared env propagation refactor across six adapters; if the helper diverged from prior semantics, an adapter could miss a workspace env var. - The shared helper test plus the affected adapter execute tests reduce that risk, and the helper preserves the prior "set only non-empty strings" behavior. ## Model Used - OpenAI Codex via Paperclip `codex_local` agent runtime; tool-assisted coding workflow with shell execution, file patching, git operations, and API interaction. The exact backend model identifier and context window are not surfaced by this local runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-26 13:57:35 -07:00
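The "set only non-empty strings" contract that #4551 says the shared helper preserves can be sketched as follows. This is not the signature of `applyPaperclipWorkspaceEnv(...)`: the function `applyWorkspaceEnvSketch`, the `WorkspaceContext` shape, and the specific `PAPERCLIP_WORKSPACE_CWD` and `PAPERCLIP_WORKSPACE_BRANCH` variable names are hypothetical. Only the `PAPERCLIP_WORKSPACE_*` prefix and `AGENT_HOME` appear in the PR text.

```typescript
type Env = Record<string, string>;

// Hypothetical context shape; the real helper's inputs are not shown in the PR description.
interface WorkspaceContext {
  cwd?: string | null;
  branch?: string | null;
  agentHome?: string | null;
}

// Assign the variable only when the value is a non-empty string.
function setIfNonEmpty(env: Env, key: string, value: string | null | undefined): void {
  if (typeof value === "string" && value.length > 0) {
    env[key] = value;
  }
}

function applyWorkspaceEnvSketch(env: Env, ctx: WorkspaceContext): Env {
  setIfNonEmpty(env, "PAPERCLIP_WORKSPACE_CWD", ctx.cwd);       // hypothetical variable name
  setIfNonEmpty(env, "PAPERCLIP_WORKSPACE_BRANCH", ctx.branch); // hypothetical variable name
  setIfNonEmpty(env, "AGENT_HOME", ctx.agentHome);
  return env;
}
```

Routing every adapter through one function like this is what lets a single test cover both the populated and the skip-empty cases.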
Dotta	9a8d219949	[codex] Stabilize tests and local maintenance assets (#4423 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - A fast-moving control plane needs stable local tests and repeatable local maintenance tools so contributors can safely split and review work > - Several route suites needed stronger isolation, Codex manual model selection needed a faster-mode option, and local browser cleanup missed Playwright's headless shell binary > - Storybook static output also needed to be preserved as a generated review artifact from the working branch > - This pull request groups the test/local-dev maintenance pieces so they can be reviewed separately from product runtime changes > - The benefit is more predictable contributor verification and cleaner local maintenance without mixing these changes into feature PRs ## What Changed - Added stable Vitest runner support and serialized route/authz test isolation. - Fixed workspace runtime authz route mocks and stabilized Claude/company-import related assertions. - Allowed Codex fast mode for manually selected models. - Broadened the agent browser cleanup script to detect `chrome-headless-shell` as well as Chrome for Testing. - Preserved generated Storybook static output from the source branch. ## Verification - `pnpm exec vitest run src/__tests__/workspace-runtime-routes-authz.test.ts src/__tests__/claude-local-execute.test.ts --config vitest.config.ts` from `server/` passed: 2 files, 19 tests. - `pnpm exec vitest run src/server/codex-args.test.ts --config vitest.config.ts` from `packages/adapters/codex-local/` passed: 1 file, 3 tests. - `bash -n scripts/kill-agent-browsers.sh && scripts/kill-agent-browsers.sh --dry` passed; dry-run detected `chrome-headless-shell` processes without killing them. - `test -f ui/storybook-static/index.html && test -f ui/storybook-static/assets/forms-editors.stories-Dry7qwx2.js` passed. 
- `git diff --check public-gh/master..pap-2228-test-local-maintenance -- . ':(exclude)ui/storybook-static'` passed. - `pnpm exec vitest run cli/src/__tests__/company-import-export-e2e.test.ts --config cli/vitest.config.ts` did not complete in the isolated split worktree because `paperclipai run` exited during build prep with `TS2688: Cannot find type definition file for 'react'`; this appears to be caused by the worktree dependency symlink setup, not the code under test. - Confirmed this PR does not include `pnpm-lock.yaml`. ## Risks - Medium risk: the stable Vitest runner changes how route/authz tests are scheduled. - Generated `ui/storybook-static` files are large and contain minified third-party output; `git diff --check` reports whitespace inside those generated assets, so reviewers may choose to drop or regenerate that artifact before merge. - No database migrations. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip API, and GitHub CLI tool use in the local Paperclip workspace. 
## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Note: the screenshot checklist item does not apply to source UI behavior; the included Storybook static output is a generated artifact preserved from the source branch. --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-24 15:11:42 -05:00
Dotta	8f1cd0474f	[codex] Improve transient recovery and Codex model refresh (#4383 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Adapter execution and retry classification decide whether agent work pauses, retries, or recovers automatically > - Transient provider failures need to be classified precisely so Paperclip does not convert retryable upstream conditions into false hard failures > - At the same time, operators need an up-to-date model list for Codex-backed agents and prompts should nudge agents toward targeted verification instead of repo-wide sweeps > - This pull request tightens transient recovery classification for Claude and Codex, updates the agent prompt guidance, and adds Codex model refresh support end-to-end > - The benefit is better automatic retry behavior plus fresher operator-facing model configuration ## What Changed - added Codex usage-limit retry-window parsing and Claude extra-usage transient classification - normalized the heartbeat transient-recovery contract across adapter executions and heartbeat scheduling - documented that deferred comment wakes only reopen completed issues for human/comment-reopen interactions, while system follow-ups leave closed work closed - updated adapter-utils prompt guidance to prefer targeted verification - added Codex model refresh support in the server route, registry, shared types, and agent config form - added adapter/server tests covering the new parsing, retry scheduling, and model-refresh behavior ## Verification - `pnpm exec vitest run --project @paperclipai/adapter-utils packages/adapter-utils/src/server-utils.test.ts` - `pnpm exec vitest run --project @paperclipai/adapter-claude-local packages/adapters/claude-local/src/server/parse.test.ts` - `pnpm exec vitest run --project @paperclipai/adapter-codex-local packages/adapters/codex-local/src/server/parse.test.ts` - `pnpm exec vitest run --project @paperclipai/server 
server/src/__tests__/adapter-model-refresh-routes.test.ts server/src/__tests__/adapter-models.test.ts server/src/__tests__/claude-local-execute.test.ts server/src/__tests__/codex-local-execute.test.ts server/src/__tests__/heartbeat-process-recovery.test.ts server/src/__tests__/heartbeat-retry-scheduling.test.ts` ## Risks - Moderate behavior risk: retry classification affects whether runs auto-recover or block, so mistakes here could either suppress needed retries or over-retry real failures - Low workflow risk: deferred comment wake reopening is intentionally scoped to human/comment-reopen interactions so system follow-ups do not revive completed issues unexpectedly > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex GPT-5-based coding agent with tool use and code execution in the Codex CLI environment ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-24 09:40:40 -05:00
Devin Foley	e4995bbb1c	Add SSH environment support (#4358 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. 
- Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. 
## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-23 19:15:22 -07:00
Russell Dempsey	854fa81757	fix(pi-local): prepend installed skill bin/ dirs to child PATH (#4331 ) ## Thinking Path > - Paperclip orchestrates AI agents; each agent runs under an adapter that spawns a model CLI as a child process. > - The pi-local adapter (`packages/adapters/pi-local`) spawns `pi` and inherits the child's shell environment — including `PATH`, which determines what the child's bash tool can execute by name. > - Paperclip skills ship executable helpers under `<skill>/bin/` (e.g. `paperclip-get-issue`) and Reviewer/QA-style `AGENTS.md` files invoke them by name via the agent's bash tool. > - Pi-local builds its runtime env with `ensurePathInEnv({ ...process.env, ...env })` only — it never adds the installed skills' `bin/` dirs to PATH. The pi CLI's `--skill` arg loads each skill's SKILL.md but does not augment PATH. > - Consequence: every bash invocation of a skill helper fails with `exit 127: command not found`. The agent then spends its heartbeat guessing (re-reading SKILL.md, trying `find`, inventing command paths) and either times out or gives up. > - This PR prepends each injected skill's `bin/` directory to the child PATH immediately before runtimeEnv is constructed. > - The benefit: pi_local agents whose AGENTS.md uses any `paperclip-*` skill helper can actually run those helpers. ## What Changed - `packages/adapters/pi-local/src/server/execute.ts`: compute `skillBinDirs` from the already-resolved `piSkillEntries`, dedupe against the existing PATH, prepend them to whichever of `PATH` / `Path` the merged env uses, then build `runtimeEnv`. No new helpers, no adapter-utils changes. ## Verification Manual repro before the fix: 1. Create a pi_local agent wired to a paperclip skill (e.g. paperclip-control). 2. Wake the agent on an in_review issue with an AGENTS.md that starts with `paperclip-get-issue "$PAPERCLIP_TASK_ID"`. 3. 
Session file: `{ "role": "toolResult", "isError": true, "content": [{ "text": "/bin/bash: paperclip-get-issue: command not found\n\nCommand exited with code 127" }] }`. After the fix: same wake; `paperclip-get-issue` resolves and returns the issue JSON; agent proceeds. Local commands: ``` pnpm --filter @paperclipai/adapter-pi-local typecheck # clean pnpm --filter @paperclipai/adapter-pi-local build # clean pnpm --filter @paperclipai/server exec vitest run \ src/__tests__/pi-local-execute.test.ts \ src/__tests__/pi-local-adapter-environment.test.ts \ src/__tests__/pi-local-skill-sync.test.ts # 5/5 passing ``` No new tests: the existing `pi-local-skill-sync.test.ts` covers skill symlink injection (upstream of the PATH step), and `pi-local-execute.test.ts` covers the spawn path; this change only augments env on the same spawn path. ## Risks Low. Pure PATH augmentation on the child env. Edge cases: - Zero skills installed → no PATH change (guarded by `skillBinDirs.length > 0`). - Duplicate bin dirs already on PATH → deduped; no pollution on re-runs. - Windows `Path` casing → falls back correctly when merged env uses `Path` instead of `PATH`. - Skill dir without `bin/` subdir → joined path simply won't resolve; harmless. No behavioral change for pi_local agents that don't use skill-provided commands. ## Model Used - Claude, `claude-opus-4-7` (1M context), extended thinking enabled, tool use enabled. Walked pi-local/cursor-local/claude-local and adapter-utils to isolate the gap, wrote the inlined fix, and ran typecheck/build/test locally. 
## Checklist - [x] Thinking path from project context to this change - [x] Model used specified - [x] Checked ROADMAP.md — no overlap - [x] Tests run locally, passing - [x] Tests added — new case in `server/src/__tests__/pi-local-execute.test.ts`; verified it fails when the fix is reverted - [ ] UI screenshots — N/A (backend adapter change) - [x] Docs updated — N/A (internal adapter, no user-facing docs) - [x] Risks documented - [x] Will address reviewer comments before merge	2026-04-23 10:15:10 -05:00
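The PATH augmentation described in the pi-local fix above can be sketched roughly as follows. This is a minimal illustration, not the code in `execute.ts`: the helper name `prependSkillBinDirs` and its parameters are hypothetical (the commit says the logic is inlined with no new helpers), and the skill directories are assumed to be already-resolved absolute paths.

```typescript
import * as path from "node:path";

type Env = Record<string, string | undefined>;

// Hypothetical helper for illustration; the actual change is inlined in execute.ts.
function prependSkillBinDirs(
  env: Env,
  skillDirs: string[],
  delimiter: string = path.delimiter,
): Env {
  // Reuse whichever key the merged env already carries (Windows commonly uses "Path").
  const pathKey = "PATH" in env ? "PATH" : "Path" in env ? "Path" : "PATH";
  const existing = (env[pathKey] ?? "").split(delimiter).filter(Boolean);

  // Dedupe against the current PATH and against earlier skills in the same list.
  const seen = new Set(existing);
  const skillBinDirs: string[] = [];
  for (const dir of skillDirs) {
    const binDir = path.join(dir, "bin");
    if (!seen.has(binDir)) {
      seen.add(binDir);
      skillBinDirs.push(binDir);
    }
  }

  // Zero new directories means no PATH change at all.
  if (skillBinDirs.length === 0) return env;
  return { ...env, [pathKey]: [...skillBinDirs, ...existing].join(delimiter) };
}
```

A skill directory without a `bin/` subdirectory still produces a PATH entry in this sketch; as the risk notes above say, such an entry simply does not resolve.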
Dotta	a957394420	[codex] Add structured issue-thread interactions (#4244 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Operators supervise that work through issues, comments, approvals, and the board UI. > - Some agent proposals need structured board/user decisions, not hidden markdown conventions or heavyweight governed approvals. > - Issue-thread interactions already provide a natural thread-native surface for proposed tasks and questions. > - This pull request extends that surface with request confirmations, richer interaction cards, and agent/plugin/MCP helpers. > - The benefit is that plan approvals and yes/no decisions become explicit, auditable, and resumable without losing the single-issue workflow. ## What Changed - Added persisted issue-thread interactions for suggested tasks, structured questions, and request confirmations. - Added board UI cards for interaction review, selection, question answers, and accept/reject confirmation flows. - Added MCP and plugin SDK helpers for creating interaction cards from agents/plugins. - Updated agent wake instructions, onboarding assets, Paperclip skill docs, and public docs to prefer structured confirmations for issue-scoped decisions. - Rebased the branch onto `public-gh/master` and renumbered branch migrations to `0063` and `0064`; the idempotency migration uses `ADD COLUMN IF NOT EXISTS` for old branch users. 
## Verification - `git diff --check public-gh/master..HEAD` - `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts packages/mcp-server/src/tools.test.ts packages/shared/src/issue-thread-interactions.test.ts ui/src/lib/issue-thread-interactions.test.ts ui/src/lib/issue-chat-messages.test.ts ui/src/components/IssueThreadInteractionCard.test.tsx ui/src/components/IssueChatThread.test.tsx server/src/__tests__/issue-thread-interaction-routes.test.ts server/src/__tests__/issue-thread-interactions-service.test.ts server/src/services/issue-thread-interactions.test.ts` -> 9 files / 79 tests passed - `pnpm -r typecheck` -> passed, including `packages/db` migration numbering check ## Risks - Medium: this adds a new issue-thread interaction model across db/shared/server/ui/plugin surfaces. - Migration risk is reduced by placing this branch after current master migrations (`0063`, `0064`) and making the idempotency column add idempotent for users who applied the old branch numbering. - UI interaction behavior is covered by component tests, but this PR does not include browser screenshots. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-class coding agent runtime. Exact model ID and context window are not exposed in this Paperclip run; tool use and local shell/code execution were enabled. 
## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-21 20:15:11 -05:00
Dotta	09d0678840	[codex] Harden heartbeat scheduling and runtime controls (#4223 ) ## Thinking Path > - Paperclip orchestrates AI agents through issue checkout, heartbeat runs, routines, and auditable control-plane state > - The runtime path has to recover from lost local processes, transient adapter failures, blocked dependencies, and routine coalescing without stranding work > - The existing branch carried several reliability fixes across heartbeat scheduling, issue runtime controls, routine dispatch, and operator-facing run state > - These changes belong together because they share backend contracts, migrations, and runtime status semantics > - This pull request groups the control-plane/runtime slice so it can merge independently from board UI polish and adapter sandbox work > - The benefit is safer heartbeat recovery, clearer runtime controls, and more predictable recurring execution behavior ## What Changed - Adds bounded heartbeat retry scheduling, scheduled retry state, and Codex transient failure recovery handling. - Tightens heartbeat process recovery, blocker wake behavior, issue comment wake handling, routine dispatch coalescing, and activity/dashboard bounds. - Adds runtime-control MCP tools and Paperclip skill docs for issue workspace runtime management. - Adds migrations `0061_lively_thor_girl.sql` and `0062_routine_run_dispatch_fingerprint.sql`. - Surfaces retry state in run ledger/agent UI and keeps related shared types synchronized. ## Verification - `pnpm exec vitest run server/src/__tests__/heartbeat-retry-scheduling.test.ts server/src/__tests__/heartbeat-process-recovery.test.ts server/src/__tests__/routines-service.test.ts` - `pnpm exec vitest run src/tools.test.ts` from `packages/mcp-server` ## Risks - Medium risk: this touches heartbeat recovery and routine dispatch, which are central execution paths. - Migration order matters if split branches land out of order: merge this PR before branches that assume the new runtime/routine fields. 
- Runtime retry behavior should be watched in CI and in local operator smoke tests because it changes how transient failures are resumed. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use enabled. Exact hosted model build and context window are not exposed in this Paperclip heartbeat environment. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-21 12:24:11 -05:00
Dotta	c7c1ca0c78	[codex] Clean up terminal-result adapter process groups (#4129 ) ## Thinking Path > - Paperclip runs local adapter processes for agents and streams their output into heartbeat runs > - Some adapters can emit a terminal result before all descendant processes have exited > - If those descendants keep running, a heartbeat can appear complete while the process group remains alive > - Claude local runs need a bounded cleanup path after terminal JSON output is observed and the child exits > - This pull request adds terminal-result cleanup support to adapter process utilities and wires it into the Claude local adapter > - The benefit is fewer stranded adapter process groups after successful terminal results ## What Changed - Added terminal-result cleanup options to `runChildProcess`. - Tracked child exit plus terminal output before signaling lingering process groups. - Added Claude local adapter configuration for terminal result cleanup grace time. - Added process cleanup tests covering terminal-output cleanup and noisy non-terminal runs. ## Verification - `pnpm install --frozen-lockfile --ignore-scripts` - `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts` - Result: 9 tests passed. ## Risks - Medium risk: this changes adapter child-process cleanup behavior. - The cleanup only arms after terminal result detection and child exit, and it is covered by process-group tests. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex coding agent based on GPT-5, tool-enabled local shell and GitHub workflow, exact runtime context window not exposed in this session. 
## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots, or documented why it is not applicable - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-20 10:38:57 -05:00
Dotta	236d11d36f	[codex] Add run liveness continuations (#4083 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Heartbeat runs are the control-plane record of each agent execution window. > - Long-running local agents can exhaust context or stop while still holding useful next-step state. > - Operators need that stop reason, next action, and continuation path to be durable and visible. > - This pull request adds run liveness metadata, continuation summaries, and UI surfaces for issue run ledgers. > - The benefit is that interrupted or long-running work can resume with clearer context instead of losing the agent's last useful handoff. ## What Changed - Added heartbeat-run liveness fields, continuation attempt tracking, and an idempotent `0058` migration. - Added server services and tests for run liveness, continuation summaries, stop metadata, and activity backfill. - Wired local and HTTP adapters to surface continuation/liveness context through shared adapter utilities. - Added shared constants, validators, and heartbeat types for liveness continuation state. - Added issue-detail UI surfaces for continuation handoffs and the run ledger, with component tests. - Updated agent runtime docs, heartbeat protocol docs, prompt guidance, onboarding assets, and skills instructions to explain continuation behavior. - Addressed Greptile feedback by scoping document evidence by run, excluding system continuation-summary documents from liveness evidence, importing shared liveness types, surfacing hidden ledger run counts, documenting bounded retry behavior, and moving run-ledger liveness backfill off the request path. 
## Verification - `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts server/src/__tests__/run-continuations.test.ts server/src/__tests__/run-liveness.test.ts server/src/__tests__/activity-service.test.ts server/src/__tests__/documents-service.test.ts server/src/__tests__/issue-continuation-summary.test.ts server/src/services/heartbeat-stop-metadata.test.ts ui/src/components/IssueRunLedger.test.tsx ui/src/components/IssueContinuationHandoff.test.tsx ui/src/components/IssueDocumentsSection.test.tsx` - `pnpm --filter @paperclipai/db build` - `pnpm exec vitest run server/src/__tests__/activity-service.test.ts ui/src/components/IssueRunLedger.test.tsx` - `pnpm --filter @paperclipai/ui typecheck` - `pnpm --filter @paperclipai/server typecheck` - `pnpm exec vitest run server/src/__tests__/activity-service.test.ts server/src/__tests__/run-continuations.test.ts ui/src/components/IssueRunLedger.test.tsx` - `pnpm exec vitest run server/src/__tests__/heartbeat-process-recovery.test.ts -t "treats a plan document update"` - `pnpm exec vitest run server/src/__tests__/activity-service.test.ts server/src/__tests__/heartbeat-process-recovery.test.ts -t "activity service\|treats a plan document update"` - Remote PR checks on head `e53b1a1d`: `verify`, `e2e`, `policy`, and Snyk all passed. - Confirmed `public-gh/master` is an ancestor of this branch after fetching `public-gh master`. - Confirmed `pnpm-lock.yaml` is not included in the branch diff. - Confirmed migration `0058_wealthy_starbolt.sql` is ordered after `0057` and uses `IF NOT EXISTS` guards for repeat application. - Greptile inline review threads are resolved. ## Risks - Medium risk: this touches heartbeat execution, liveness recovery, activity rendering, issue routes, shared contracts, docs, and UI. - Migration risk is mitigated by additive columns/indexes and idempotent guards. 
- Run-ledger liveness backfill is now asynchronous, so the first ledger response can briefly show historical missing liveness until the background backfill completes. - UI screenshot coverage is not included in this packaging pass; validation is currently through focused component tests. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5.4, local tool-use coding agent with terminal, git, GitHub connector, GitHub CLI, and Paperclip API access. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Screenshot note: no before/after screenshots were captured in this PR packaging pass; the UI changes are covered by focused component tests listed above. --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-20 06:01:49 -05:00
Dewaldt Huysamen	f701c3e78c	feat(claude-local): add Opus 4.7 to adapter model dropdown (#3828 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each adapter advertises a model list that powers the agent config UI dropdown > - The `claude_local` adapter's dropdown is sourced from the hard-coded `models` array in `packages/adapters/claude-local/src/index.ts` > - Anthropic recently released Opus 4.7, the newest current-generation Opus model > - Without a list entry, users cannot discover or select Opus 4.7 from the dropdown (they can still type it manually, since the field is creatable, but discoverability is poor) > - This pull request adds `claude-opus-4-7` to the `claude_local` model list so new agents can be configured with the latest model by default > - The benefit is out-of-the-box access to the newest Opus model, consistent with how every other current-generation Claude model is already listed ## What Changed - Added `{ id: "claude-opus-4-7", label: "Claude Opus 4.7" }` as the first entry of the `models` array in `packages/adapters/claude-local/src/index.ts`. Newest-first ordering matches the convention already used for 4.6. ## Verification - `pnpm --filter @paperclipai/adapter-claude-local typecheck` → passes. - `pnpm --filter @paperclipai/server exec vitest run src/__tests__/adapter-models.test.ts src/__tests__/claude-local-adapter.test.ts` → 12/12 passing (both directly-related files). - No existing test pins the `claude_local` models array (see `server/src/__tests__/adapter-models.test.ts`), so appending a new entry is non-breaking. - Manual check of UI consumer: `AgentConfigForm.tsx` fetches the list via `agentsApi.adapterModels()` and renders it in a creatable popover — no hard-coded expectations anywhere in the UI layer. - Screenshots: single new option appears at the top of the Claude Code (local) model dropdown; existing options unchanged. ## Risks - Low risk. 
Purely additive: one new entry in a list consumed by a UI dropdown. No behavior change for existing agents, no schema change, no migration, no env var. - `BEDROCK_MODELS` in `packages/adapters/claude-local/src/server/models.ts` is intentionally not touched — the exact region-qualified Bedrock id for Opus 4.7 is not yet confirmed, and shipping a guessed id could produce a broken option for Bedrock users. Tracked as a follow-up on the linked issue. ## Model Used - None — human-authored. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable (no tests needed: existing suite already covers the list-consumer paths) - [x] If this change affects the UI, I have included before/after screenshots (dropdown gains one new top entry; all other entries unchanged) - [x] I have updated relevant documentation to reflect my changes (no doc update needed: `docs/adapters/claude-local.md` uses `claude-opus-4-6` only as an example, still valid) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Closes #3827	2026-04-16 13:18:30 -05:00
Knife.D	f6ce976544	fix: Anthropic subscription quota always shows 100% used (#3589 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The Costs > Providers tab displays live subscription quota from each adapter (Claude, Codex) > - The Claude adapter fetches utilization from the Anthropic OAuth usage API and converts it to a 0-100 percent via `toPercent()` > - The API changed to return utilization as 0-100 percentages (e.g. `34.0` = 34%), but `toPercent()` assumed 0-1 fractions and multiplied by 100 > - After `Math.min(100, ...)` clamping, every quota window displayed as 100% used regardless of actual usage > - Additionally, `extra_usage.used_credits` and `monthly_limit` are returned in cents but were formatted as dollars, showing $6,793 instead of $67.93 > - This PR applies the same `< 1` heuristic already proven in the Codex adapter and fixes the cents-to-dollars conversion > - The benefit is accurate quota display matching what users see on claude.ai/settings/usage ## What Changed - `toPercent()`: apply `< 1` heuristic to handle both legacy 0-1 fractions and current 0-100 percentage API responses (consistent with Codex adapter's `normalizeCodexUsedPercent()`) - `formatExtraUsageLabel()`: divide `used_credits` and `monthly_limit` by 100 to convert cents to dollars before formatting - Updated all `toPercent` and `fetchClaudeQuota` tests to use current API format (0-100 range) - Added backward-compatibility test for legacy 0-1 fraction values - Added test for enabled extra usage with utilization and cents-to-dollars conversion ## Verification - `toPercent(34.0)` → `34` (was `100`) - `toPercent(91.0)` → `91` (was `100`) - `toPercent(0.5)` → `50` (legacy format still works) - Extra usage `used_credits: 6793, monthly_limit: 14000` → `$67.93 / $140.00` (was `$6,793.00 / $14,000.00`) - Verified on a live instance with Claude Max subscription — Costs > Providers tab now shows correct percentages matching claude.ai/settings/usage ## Risks 
Low risk. The `< 1` heuristic is already battle-tested in the Codex adapter. The only edge case is a true utilization of exactly `1.0` which maps to `1%` instead of `100%` — this is consistent with the Codex adapter behavior and is an acceptable trade-off since 1% and 100% are distinguishable in practice (100% would be returned as `100.0` by the API). ## Model Used Claude Opus 4.6 (1M context) via Claude Code CLI — tool use, code analysis, and code generation ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Closes #2188 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 06:44:26 -05:00
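The `< 1` heuristic and the cents-to-dollars conversion from the quota fix above can be illustrated with a short sketch. It is written under assumptions: the rounding and the lower clamp at 0 are guesses, and `formatCentsPairAsDollars` is a hypothetical stand-in for `formatExtraUsageLabel()`, whose real signature is not shown in the commit message.

```typescript
// Sketch of the heuristic only; not the adapter's actual source.
function toPercent(utilization: number): number {
  // Values below 1 are treated as legacy 0-1 fractions; everything else is
  // assumed to already be a 0-100 percentage.
  const percent = utilization < 1 ? utilization * 100 : utilization;
  // Rounding and the clamp to [0, 100] are assumptions made for this sketch.
  return Math.max(0, Math.min(100, Math.round(percent)));
}

// Hypothetical stand-in for formatExtraUsageLabel(): both inputs are in cents.
function formatCentsPairAsDollars(usedCents: number, limitCents: number): string {
  const usd = new Intl.NumberFormat("en-US", { style: "currency", currency: "USD" });
  return `${usd.format(usedCents / 100)} / ${usd.format(limitCents / 100)}`;
}
```

With these definitions, `toPercent(34.0)` gives `34`, `toPercent(0.5)` gives `50`, and `toPercent(1.0)` gives `1`, which is the documented edge case where a true fraction of exactly 1.0 is read as 1%.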
Dotta	a7dc88941b	fix(codex-local): avoid fast mode in env probe	2026-04-11 08:33:18 -05:00
Dotta	2d8f97feb0	feat(codex-local): add fast mode support	2026-04-11 08:21:55 -05:00
dotta	2a84e53c1b	Introduce bind presets for deployment setup Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-11 07:09:07 -05:00
Dotta	c566a9236c	fix: harden heartbeat and adapter runtime workflows	2026-04-10 22:26:21 -05:00
Aron Prins	724893ad5b	fix claude instruction sibling path hint	2026-04-10 14:22:48 +02:00
dotta	9eaf72ab31	Fix Codex tool-use transcript completion	2026-04-09 06:16:05 -05:00
dotta	0ff262ca0f	fix: preserve claude instructions on resume fallback	2026-04-08 06:57:21 -05:00
lempkey	e3804f792d	fix: gate instructions file I/O and commandNotes on fresh sessions only On resumed sessions, skipping --append-system-prompt-file (the original fix) left two secondary issues: - commandNotes still claimed the flag was injected, producing misleading onMeta logs on every resumed heartbeat - The instructions file was still read from disk and a combined temp file written on every resume, even though effectiveInstructionsFilePath was never consumed Hoist canResumeSession before the I/O block and gate both the disk operations and commandNotes construction on !canResumeSession / !sessionId. Adds three regression tests: commandNotes is populated on fresh sessions, empty on resume; and no agent-instructions.md is written on resume.	2026-04-08 06:57:21 -05:00
lempkey	3cfbc350a0	fix: skip --append-system-prompt-file on resumed claude sessions On resumed sessions the agent instructions are already present in the session cache. Unconditionally passing --append-system-prompt-file re-injects 5-10K redundant tokens per heartbeat and may be rejected by the Claude CLI when combined with --resume. Guard the flag behind `!resumeSessionId` so it is only appended on fresh session starts. Fixes: #2848	2026-04-08 06:57:21 -05:00
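The guard described in the two resume fixes above amounts to building the CLI arguments conditionally. The following is a minimal sketch under assumptions: `buildClaudeArgs` and its option names are hypothetical, and only the two flags named in the commit messages are shown.

```typescript
interface ClaudeArgOptions {
  resumeSessionId?: string | null; // set when an existing session is being resumed
  instructionsFilePath?: string | null; // combined agent instructions file, if any
}

// Hypothetical helper; the real adapter assembles its arguments in execute.ts.
function buildClaudeArgs(opts: ClaudeArgOptions): string[] {
  const args: string[] = [];
  if (opts.resumeSessionId) {
    args.push("--resume", opts.resumeSessionId);
  }
  // Only fresh sessions receive the instructions file; resumed sessions
  // already carry the instructions in their session cache.
  if (!opts.resumeSessionId && opts.instructionsFilePath) {
    args.push("--append-system-prompt-file", opts.instructionsFilePath);
  }
  return args;
}
```

The follow-up commit applies the same condition one step earlier, so that the instructions file is neither read nor rewritten on resume and `commandNotes` does not claim the flag was injected.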
Dotta	50a36beec5	Merge pull request #3033 from kimnamu/feat/bedrock-model-selection fix(claude-local): respect model selection for Bedrock users	2026-04-07 21:48:29 -05:00
Dotta	391afa627f	Merge pull request #2143 from shoaib050326/codex/issue-2131-openclaw-session-key fix(openclaw-gateway): prefix session keys with configured agent id	2026-04-07 16:53:18 -05:00
Dotta	26ebe3b002	Merge pull request #2662 from wbelt/fix/configurable-claimed-api-key-path fix(openclaw-gateway): make claimedApiKeyPath configurable per agent	2026-04-07 09:31:14 -05:00
kimnamu	60744d8a91	fix: address Greptile P2 — reuse DIRECT_MODELS import, global region prefix match - Import models from index.ts instead of duplicating the array - Use regex ^\w+\.anthropic\. to match all Bedrock region prefixes (us, eu, ap, and any future regions) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 23:24:37 +09:00
kimnamu	07987d75ad	feat(claude-local): add Bedrock model selection support Previously, --model was completely skipped for Bedrock users, so the model dropdown selection was silently ignored and the CLI always used its default model. Selecting Haiku would still run Opus. - Add listClaudeModels() that returns Bedrock-native model IDs (us.anthropic.) when Bedrock env is detected - Register listModels on claude_local adapter so the UI dropdown shows Bedrock models instead of Anthropic API names - Allow --model to pass through when the ID is a Bedrock-native identifier (us.anthropic. or ARN) - Add isBedrockModelId() helper shared by execute.ts and test.ts Follows up on #2793 which added basic Bedrock auth detection. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 23:16:57 +09:00
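The Bedrock model-ID check from the two commits above can be sketched as below. The regex is the one quoted in the follow-up commit; the ARN prefix test is an assumption about what "ARN" means there, and the example model IDs in the usage note are placeholders rather than confirmed Bedrock identifiers.

```typescript
// Matches region-prefixed Bedrock IDs ("us.anthropic.", "eu.anthropic.", "ap.anthropic.", ...).
const BEDROCK_REGION_PREFIX = /^\w+\.anthropic\./;

// Sketch of isBedrockModelId(); the ARN branch is an assumed implementation detail.
function isBedrockModelId(model: string): boolean {
  return BEDROCK_REGION_PREFIX.test(model) || model.startsWith("arn:aws:bedrock:");
}

// Only Bedrock-native IDs are forwarded as --model when Bedrock auth is active.
function modelArgsForBedrock(model: string): string[] {
  return isBedrockModelId(model) ? ["--model", model] : [];
}
```

For example, a placeholder ID such as `us.anthropic.example-model` passes through, while an Anthropic API name such as `claude-opus-4-6` yields no `--model` argument and leaves the CLI on its configured default.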
Brandon Woo	15bd2ef349	fix: recognize missing-rollout Codex resume error as stale session The Codex CLI can return "no rollout found for thread id ..." when resuming a heartbeat thread whose rollout has been garbage-collected. Extend isCodexUnknownSessionError() to match this wording so the existing single-retry path in execute.ts activates correctly. Add parse.test.ts covering the new pattern, existing stale-session wordings, parseCodexJsonl, and a negative case. Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-07 10:45:38 +09:00
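The stale-session detection above can be sketched as a message matcher plus a single retry without the session. This is an illustration under assumptions: only the "no rollout found for thread id" wording comes from the commit message, the other stale-session wordings handled by `isCodexUnknownSessionError()` are not reproduced, and `runWithStaleSessionRetry` and `Attempt` are hypothetical names.

```typescript
const MISSING_ROLLOUT = /no rollout found for thread id/i;

// Covers only the wording quoted in the commit; the real matcher knows more patterns.
function isMissingRolloutError(message: string): boolean {
  return MISSING_ROLLOUT.test(message);
}

type Attempt = { ok: boolean; errorMessage?: string };

// Hypothetical single-retry wrapper: if resuming fails because the session is
// stale, run exactly once more with a fresh session.
async function runWithStaleSessionRetry(
  sessionId: string | null,
  attempt: (sessionId: string | null) => Promise<Attempt>,
): Promise<Attempt> {
  const first = await attempt(sessionId);
  if (first.ok || sessionId === null) return first;
  if (!isMissingRolloutError(first.errorMessage ?? "")) return first;
  return attempt(null);
}
```

Keeping the retry bounded to one attempt means a genuinely broken run still surfaces its error instead of looping.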
Dotta	08fea10ce1	Merge pull request #2772 from paperclipai/PAPA-46-why-did-this-issue-succeed-without-following-my-instructions fix: enable agent re-checkout of in_review tasks on comment feedback	2026-04-06 18:57:33 -05:00
Dawid Piaskowski	b74d94ba1e	Treat Pi quota exhaustion as a failed run (#2305) ## Thinking Path Paperclip orchestrates AI agent runs and reports their success or failure. The Pi adapter spawns a local Pi process and interprets its JSONL output to determine the run outcome. When Pi hits a quota limit (429 RESOURCE_EXHAUSTED), it retries internally and emits an `auto_retry_end` event with `success: false` — but still exits with code 0. The current adapter trusts the exit code, so Paperclip marks the run as succeeded even though it produced no useful work. This PR teaches the parser to detect quota exhaustion and synthesize a failure. Closes #2234 ## Changes - Parse `auto_retry_end` events with `success: false` into `result.errors` - Parse standalone `error` events into `result.errors` - Synthesize exit code 1 when Pi exits 0 but parsed errors exist - Use the parsed error as `errorMessage` so the failure reason is visible in the UI ## Verification ```bash pnpm vitest run pi-local-execute pnpm vitest run --reporter=verbose 2>&1 | grep pi-local ``` - `parse.test.ts`: covers failed retry, successful retry (no error), standalone error events, and empty error messages - `pi-local-execute.test.ts`: end-to-end test with a fake Pi binary that emits `auto_retry_end` + exits 0, asserts the run is marked failed ## Risks - Low: Only affects runs where Pi exits 0 with a parsed error — no change to normal successful or already-failing runs - If Pi emits `auto_retry_end { success: false }` but the run actually produced valid output, this would incorrectly mark it as failed. This seems unlikely given the semantics of the event.
## Model Used - Claude Opus 4.6 (Anthropic) — assisted with test additions and PR template ## Checklist - [x] Thinking path documented - [x] Model specified - [x] Tests pass locally - [x] Test coverage for new parse branches (success path, error events, empty messages) - [x] No UI changes - [x] Risk analysis included --------- Co-authored-by: Dawid Piaskowski <dawid@MacBook-Pro.local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-06 14:29:41 -07:00
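The parsing rule from the Pi quota fix above can be sketched as follows. This is a rough illustration, not the adapter's parser: the event field names `type` and `message`, the fallback text for empty messages, and the helper names are assumptions; only the `auto_retry_end` / `success: false` / `error` semantics and the synthesized exit code come from the PR description.

```typescript
type PiParseResult = { errors: string[] };

// Assumed event shape: one JSON object per line with "type", optional
// "success", and optional "message" fields.
function parsePiErrors(jsonl: string): PiParseResult {
  const errors: string[] = [];
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    let event: unknown;
    try {
      event = JSON.parse(trimmed);
    } catch (_err) {
      continue; // skip non-JSON noise in the output stream
    }
    if (typeof event !== "object" || event === null) continue;
    const e = event as Record<string, unknown>;
    // Fallback text for empty messages is an assumption of this sketch.
    const message = typeof e.message === "string" && e.message ? e.message : "unknown Pi error";
    const failedRetry = e.type === "auto_retry_end" && e.success === false;
    if (failedRetry || e.type === "error") errors.push(message);
  }
  return { errors };
}

// Pi may exit 0 after a failed retry; treat parsed errors as a failure.
function resolveExitCode(processExitCode: number, parsed: PiParseResult): number {
  return processExitCode === 0 && parsed.errors.length > 0 ? 1 : processExitCode;
}
```

A successful retry (`success: true`) adds nothing to `errors`, so normal runs keep their original exit code.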
Lucas Kim	b6e40fec54	feat: add AWS Bedrock auth support on "claude-local" (#2793) Closes #2412 Related: #2681, #498, #128 ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The Claude Code adapter spawns the `claude` CLI to run agent tasks > - The adapter detects auth mode by checking for `ANTHROPIC_API_KEY` — recognizing only "api" and "subscription" modes > - But users running Claude Code via AWS Bedrock (`CLAUDE_CODE_USE_BEDROCK=1`) fall through to the "subscription" path > - This causes a misleading "ANTHROPIC_API_KEY is not set; subscription-based auth can be used" message in the environment check > - Additionally, the hello probe passes `--model claude-opus-4-6` which is not a valid Bedrock model identifier, causing `400 The provided model identifier is invalid` and a probe failure > - This pull request adds Bedrock auth detection, skips the Anthropic-style `--model` flag for Bedrock, and returns the correct billing type > - The benefit is that Bedrock users get a working environment check and correct cost tracking out of the box --- ## Pain Point Many enterprise teams use Claude Code through AWS Bedrock rather than Anthropic's direct API — for compliance, billing consolidation, or VPC requirements. Currently, these users hit a hard wall during onboarding:

| Problem | Impact |
|---|---|
| ❌ Adapter environment check always fails | Users cannot create their first agent — blocked at step 1 |
| ❌ `--model claude-opus-4-6` is invalid on Bedrock (requires `us.anthropic.` format) | Hello probe exits with code 1: `400 The provided model identifier is invalid` |
| ❌ Auth shown as _"subscription-based"_ | Misleading — Bedrock is neither subscription nor API-key auth |
| ❌ Quota polling hits Anthropic OAuth endpoint | Fails silently for Bedrock users who have no Anthropic subscription |

> Bottom line: Paperclip is completely unusable for Bedrock users out of the box. 
## Why Bedrock Matters AWS Bedrock is a major deployment path for Claude in enterprise environments: - Enterprise compliance — data stays within the customer's AWS account and VPC - Unified billing — Claude usage appears on the existing AWS invoice, no separate Anthropic billing - IAM integration — access controlled through AWS IAM roles and policies - Regional deployment — models run in the customer's preferred AWS region Supporting Bedrock unlocks Paperclip for organizations that cannot use Anthropic's direct API due to procurement, security, or regulatory constraints. --- ## What Changed - `execute.ts`: Added `isBedrockAuth()` helper that checks `CLAUDE_CODE_USE_BEDROCK` and `ANTHROPIC_BEDROCK_BASE_URL` env vars. `resolveClaudeBillingType()` now returns `"metered_api"` for Bedrock. Biller set to `"aws_bedrock"`. Skips `--model` flag when Bedrock is active (Anthropic-style model IDs are invalid on Bedrock; the CLI uses its own configured model). - `test.ts`: Environment check now detects Bedrock env vars (from adapter config or server env) and shows `"AWS Bedrock auth detected. Claude will use Bedrock for inference."` instead of the misleading subscription message. Also skips `--model` in the hello probe for Bedrock. - `quota.ts`: Early return with `{ ok: true, windows: [] }` when Bedrock is active — Bedrock usage is billed through AWS, not Anthropic's subscription quota system. - `ui/src/lib/utils.ts`: Added `"aws_bedrock"` → `"AWS Bedrock"` to `providerDisplayName()` and `quotaSourceDisplayName()`. ## Verification 1. `pnpm -r typecheck` — all packages pass 2. Unit tests added and passing (6/6) 3. Environment check with Bedrock env vars:

| | Before | After |
|---|---|---|
| Status | 🔴 Failed | ✅ Passed |
| Auth message | `ANTHROPIC_API_KEY is not set; subscription-based auth can be used if Claude is logged in.` | `AWS Bedrock auth detected. Claude will use Bedrock for inference.` |
| Hello probe | `ERROR · Claude hello probe failed.` (exit code 1 — `--model claude-opus-4-6` is invalid on Bedrock) | `INFO · Claude hello probe succeeded.` |
| Screenshot | <img height="500" alt="Screenshot 2026-04-05 at 8 25 27 AM" src="https://github.com/user-attachments/assets/476431f6-6139-425a-8abc-97875d653657" /> | <img height="500" alt="Screenshot 2026-04-05 at 8 31 58 AM" src="https://github.com/user-attachments/assets/d388ce87-c5e6-4574-b8d2-fd8b86135299" /> |

4. Existing API key / subscription paths are completely untouched unless Bedrock env vars are present ## Risks - Low risk. All changes are additive — existing "api" and "subscription" code paths are only entered when Bedrock env vars are absent. - When Bedrock is active, the `--model` flag is skipped, so the Paperclip model dropdown selection is ignored in favor of the Claude CLI's own model config. This is intentional since Bedrock requires different model identifiers. ## Model Used - Claude Opus 4.6 (`claude-opus-4-6`, 1M context window) via Claude Code CLI ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-04-06 13:15:18 -07:00
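The Bedrock detection described in the commit above (b6e40fec54) might look roughly like the sketch below. Only the env var names (`CLAUDE_CODE_USE_BEDROCK`, `ANTHROPIC_BEDROCK_BASE_URL`, `ANTHROPIC_API_KEY`), the helper name `isBedrockAuth`, and the rule "skip `--model` on Bedrock" come from the commit message. Which values count as "enabled" (`1` or `true` here), and the helper names `detectAuthMode` and `buildModelArgs`, are assumptions for illustration and may differ from the real `execute.ts`.

```typescript
type Env = Record<string, string | undefined>;
type AuthMode = "bedrock" | "api" | "subscription";

// Assumption: "1"/"true" enables Bedrock, or any non-empty Bedrock base URL does.
function isBedrockAuth(env: Env): boolean {
  const flag = (env["CLAUDE_CODE_USE_BEDROCK"] ?? "").trim().toLowerCase();
  const baseUrl = (env["ANTHROPIC_BEDROCK_BASE_URL"] ?? "").trim();
  return flag === "1" || flag === "true" || baseUrl !== "";
}

// Hypothetical helper: Bedrock is checked first so it no longer falls
// through to the "subscription" path when no API key is set.
function detectAuthMode(env: Env): AuthMode {
  if (isBedrockAuth(env)) return "bedrock";
  return (env["ANTHROPIC_API_KEY"] ?? "").trim() !== "" ? "api" : "subscription";
}

// Hypothetical helper: Anthropic-style model IDs are invalid on Bedrock,
// so the --model flag is omitted and the CLI's own model config applies.
function buildModelArgs(env: Env, model: string): string[] {
  return isBedrockAuth(env) ? [] : ["--model", model];
}
```

Passing the environment in as a plain record, rather than reading `process.env` directly, keeps the sketch easy to test against both adapter config and server env.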
Wes Belt	c171ff901c	Merge branch 'master' into fix/configurable-claimed-api-key-path	2026-04-06 06:17:42 -04:00
dotta	8ae4c0e765	Clean up opencode rebase and stabilize runtime test Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-04 18:15:28 -05:00
dotta	b9b2bf3b5b	Trim resumed comment wake prompts	2026-04-04 18:14:19 -05:00
dotta	91e040a696	Batch inline comment wake payloads Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-04 18:14:19 -05:00
Devin Foley	cd2be692e9	Fix in-review task recheckout guidance Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-04 11:20:29 -07:00
HenkDz	14d59da316	feat(adapters): external adapter plugin system with dynamic UI parser - Plugin loader: install/reload/remove/reinstall external adapters from npm packages or local directories - Plugin store persisted at ~/.paperclip/adapter-plugins.json - Self-healing UI parser resolution with version caching - UI: Adapter Manager page, dynamic loader, display registry with humanized names for unknown adapter types - Dev watch: exclude adapter-plugins dir from tsx watcher to prevent mid-request server restarts during reinstall - All consumer fallbacks use getAdapterLabel() for consistent display - AdapterTypeDropdown uses controlled open state for proper close behavior - Remove hermes-local from built-in UI (externalized to plugin) - Add docs for external adapters and UI parser contract	2026-04-03 21:11:20 +01:00
Wes Belt	1ce800c158	docs: add claimedApiKeyPath to agentConfigurationDoc Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 14:15:36 -04:00
Wes Belt	8e42c6cdac	fix(openclaw-gateway): make claimedApiKeyPath configurable per agent The openclaw_gateway adapter hardcodes the Paperclip API key path to ~/.openclaw/workspace/paperclip-claimed-api-key.json in buildWakeText(). In multi-agent OpenClaw deployments, each agent has its own workspace with its own key file. The hardcoded path forces all agents to share one key, breaking agent identity isolation. Add a claimedApiKeyPath field to the adapter config (with UI input) that allows operators to set a per-agent path. Falls back to the current default when unset — zero behavior change for existing deployments. Fixes #930	2026-04-03 11:25:58 -04:00
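The fallback described in the commit above (8e42c6cdac) can be illustrated with a short sketch. The field name `claimedApiKeyPath` and the default path are taken from the commit message. The interface name, the constant name, the function name, and the choice to treat a blank string as unset are assumptions for the example rather than the adapter's actual code.

```typescript
// Default taken from the commit message; the constant name is hypothetical.
const DEFAULT_CLAIMED_API_KEY_PATH =
  "~/.openclaw/workspace/paperclip-claimed-api-key.json";

// Hypothetical slice of the adapter config; only `claimedApiKeyPath` is from the source.
interface GatewayAdapterConfig {
  claimedApiKeyPath?: string | null;
}

// Return the per-agent path when set, otherwise the shared default.
// Assumption: a blank or whitespace-only value counts as unset.
function resolveClaimedApiKeyPath(config: GatewayAdapterConfig): string {
  const configured = (config.claimedApiKeyPath ?? "").trim();
  return configured !== "" ? configured : DEFAULT_CLAIMED_API_KEY_PATH;
}
```

With this shape, existing deployments that never set the field keep resolving to the default path, which matches the "zero behavior change" claim in the commit message.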
Dotta	19aaa54ae4	Merge branch 'master' into add-gpt-5-4-xhigh-effort	2026-03-31 06:19:26 -05:00
Dotta	ccb5cce4ac	Merge pull request #2204 from paperclipai/pap-1007-operator-polish fix: apply operator polish across comments, invites, routines, and health	2026-03-30 14:48:24 -05:00
Dotta	5575399af1	Merge pull request #2048 from remdev/fix/codex-rpc-client-spawn-error fix(codex) rpc client spawn error	2026-03-30 14:24:33 -05:00
dotta	a3e125f796	Clarify Claude transcript event categories Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-03-30 14:13:52 -05:00
Shoaib Ansari	8e2148e99d	fix openclaw gateway session key routing	2026-03-30 12:13:39 +05:30
dotta	b3d61a7561	Clarify manual workspace runtime behavior Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-03-29 10:55:45 -05:00
dotta	cadfcd1bc6	Log resolved adapter command in run metadata Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-03-29 10:55:26 -05:00
Mikhail Batukhtin	dc3aa8f31f	test(codex-local): isolate quota spawn test from host CODEX_HOME After the mocked RPC spawn fails, getQuotaWindows() still calls readCodexToken(). Use an empty mkdtemp directory for CODEX_HOME for the duration of the test so we never read ~/.codex/auth.json or call WHAM.	2026-03-29 15:15:37 +03:00


247 commits