paperclip

mirror of https://github.com/alkimake/paperclip.git synced 2026-06-17 03:10:38 +09:00

Author	SHA1	Message	Date
Devin Foley	9578dc3da7	Wire per-adapter sandbox install commands through test and execute paths (#5280 ) > Stacked PR. Sits on top of the e2b sandbox chain — #5278 (stdin staging) and #5279 (honest-resolvability + login-profiles). The cumulative diff against `master` includes both of those PRs' content; the files touched by this PR's commit are the new `maybeRunSandboxInstallCommand` helper in `packages/adapter-utils/src/execution-target.ts` and the per-adapter `index.ts`/`server/test.ts`/`server/execute.ts` wiring under `packages/adapters/{claude,codex,cursor,gemini,opencode,pi}-local/`. The honest resolvability check from #5279 is what gives this PR's install command a meaningful "did it actually land on PATH" follow-up. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Sandbox execution targets are ephemeral — each fresh lease starts from a template image that may or may not have the agent CLIs preinstalled > - When a CLI isn't preinstalled, the resolvability probe fails at `command -v` and the hello probe never runs > - There's no shared mechanism for "before you probe or provision, install the CLI on this sandbox" > - This pull request adds a `SANDBOX_INSTALL_COMMAND` constant per adapter and a `maybeRunSandboxInstallCommand` helper that runs it via the existing sandbox login shell, captures structured output, and never throws (so the resolvability + hello probe still run after); each adapter's `test()` and `execute()` share the constant so the two callsites can't drift > - The benefit is a fresh sandbox lease without a preinstalled CLI now installs it once via `sh -lc` before the resolvability probe and before managed-runtime provisioning, with a uniform `<adapter>_install_command_run` check on the test report ## What Changed - `packages/adapter-utils/src/execution-target.ts`: add `AdapterSandboxInstallCommandCheck` and `maybeRunSandboxInstallCommand` (runs the install via existing sandbox shell, captures exit/stdout/stderr, returns a structured info/warn check, never throws) - Add `SANDBOX_INSTALL_COMMAND` to each adapter's `index.ts` so `test()` and `execute()` share a single source of truth - Wire each of the 6 affected adapter `testEnvironment()`s to call `maybeRunSandboxInstallCommand` before `ensureAdapterExecutionTargetCommandResolvable` - Pass `installCommand: SANDBOX_INSTALL_COMMAND` through `prepareAdapterExecutionTargetRuntime` in each adapter's `execute()` - Per-adapter install commands use npm globals where possible so binaries land on a PATH segment the template already exports: - claude → `npm install -g @anthropic-ai/claude-code` - codex → `npm install -g @openai/codex` - cursor → `curl https://cursor.com/install -fsS \| bash` - gemini → `npm install -g @google/gemini-cli` - opencode → `npm install -g opencode-ai` - pi → `npm install -g @mariozechner/pi-coding-agent` SSH and local targets ignore `installCommand` (SSH runtime takes no such param; local short-circuits before runtime prep), so this is a no-op for non-sandbox environments. ## Verification - `pnpm typecheck` clean - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils` and per-adapter projects pass - Manual sandbox matrix (claude, codex, cursor, gemini, opencode, pi) — each goes `install_command_run → resolvable → hello_probe_passed` (Codex and Pi land on `hello_probe_auth_required`, which is the configured-credentials problem, not an install issue) - SSH no-regression: SSH Claude still passes; the helper short-circuits on non-sandbox targets ## Risks Medium — adds a network/CPU cost (npm install / curl) on every fresh sandbox lease. Cost is bounded (one-time per lease, typically tens of seconds for npm globals), and the helper never throws so a failing install still lets the report run resolvability and hello probes. If a sandbox image already has the CLI, the install is an idempotent reinstall. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-05-05 08:29:28 -07:00
Dotta	a3de1d764d	Add cheap model profiles for local adapters (#4881 ) ## Thinking Path > - Paperclip is a control plane for autonomous AI companies, where adapters are the boundary between the board, agents, and execution runtimes. > - Local adapters currently expose a primary runtime configuration, but operators often need a cheaper model lane for routine or low-risk work. > - That cheap lane has to stay adapter-owned: runtime profile settings should not mutate the primary adapter config or bypass existing auth/secret mediation. > - Issue creation also needs an ergonomic way to request primary, cheap, or custom model behavior for a selected assignee. > - This pull request adds a first-class `cheap` model profile contract across adapter capabilities, heartbeat config resolution, agent configuration, and issue creation. > - The benefit is cheaper task execution can be configured and requested explicitly while preserving adapter boundaries, secret handling, and audit visibility. ## What Changed - Added adapter model-profile capability metadata and a `cheap` profile contract for supported local adapters. - Applied `runtimeConfig.modelProfiles.cheap.adapterConfig` during heartbeat config resolution, including requested/applied/fallback run metadata. - Added agent configuration UI for cheap model profile settings without writing those settings into primary `adapterConfig`. - Added New Issue assignee model lane controls for Primary / Cheap / Custom and request payload handling. - Added run ledger profile badges and Storybook stories for the new cheap-lane UI states. - Added tests for validators, heartbeat model profile application, permission/secret mediation, UI payload helpers, and run ledger rendering. - Added committed UI verification screenshots under `docs/pr-screenshots/pap-2837/`. - Addressed Greptile review feedback around cheap-profile defaults, shared profile types, and fallback test data. ## Verification Local: - `pnpm exec vitest run packages/shared/src/validators/issue.test.ts server/src/__tests__/adapter-registry.test.ts server/src/__tests__/agent-permissions-routes.test.ts server/src/__tests__/heartbeat-model-profile.test.ts ui/src/components/IssueRunLedger.test.tsx ui/src/lib/agent-config-patch.test.ts ui/src/lib/issue-assignee-overrides.test.ts ui/src/lib/new-agent-runtime-config.test.ts` — passed, 8 files / 103 tests. - `pnpm exec vitest run ui/src/lib/new-agent-runtime-config.test.ts ui/src/components/IssueRunLedger.test.tsx` — passed after Greptile/rebase follow-up, 2 files / 17 tests. - `pnpm --filter @paperclipai/ui typecheck` — passed after Greptile/rebase follow-up. - `pnpm -r typecheck` — passed. - `pnpm build` — passed. - `pnpm test:run` — did not complete successfully in this local worktree: it stopped in pre-existing `@paperclipai/adapter-utils` sandbox/SSH fixture suites outside this PR diff. Failures were 5s local timeouts plus `git init -b` unsupported by this machine's Git 2.21.0. The branch-specific targeted suites above passed. - Branch was fetched/rebased onto `public-gh/master`; `git rev-list --left-right --count public-gh/master...HEAD` reports `0 9`. Remote PR checks on latest head `e30bf399146451c86cee98ed528d51d33fa5af5a`: - `policy` — passed. - `verify` — passed. - `e2e` — passed. - `Greptile Review` — passed, confidence score 5/5; Greptile review threads resolved. - `security/snyk (cryppadotta)` — passed. Screenshots: - [New issue cheap lane desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-cheap-desktop.png) - [New issue custom lane desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-custom-desktop.png) - [New issue unsupported adapter desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-unsupported-desktop.png) - [Run ledger model profile badges desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/runledger-profile-badges-desktop.png) - Mobile variants are also in `docs/pr-screenshots/pap-2837/`. ## Risks - Medium: heartbeat config mediation now merges runtime model profiles into adapter configs, so adapter secret normalization and host-command restrictions must keep covering nested config paths. - Medium: the UI adds another issue creation choice; unsupported adapters must keep hiding the cheap lane and preserve primary behavior. - Low migration risk: no database migration is included. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used OpenAI Codex coding agent using GPT-5-class reasoning with repo tool use and command execution. Exact served model/context window was not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [ ] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-30 15:32:04 -05:00
Dotta	9a8d219949	[codex] Stabilize tests and local maintenance assets (#4423 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - A fast-moving control plane needs stable local tests and repeatable local maintenance tools so contributors can safely split and review work > - Several route suites needed stronger isolation, Codex manual model selection needed a faster-mode option, and local browser cleanup missed Playwright's headless shell binary > - Storybook static output also needed to be preserved as a generated review artifact from the working branch > - This pull request groups the test/local-dev maintenance pieces so they can be reviewed separately from product runtime changes > - The benefit is more predictable contributor verification and cleaner local maintenance without mixing these changes into feature PRs ## What Changed - Added stable Vitest runner support and serialized route/authz test isolation. - Fixed workspace runtime authz route mocks and stabilized Claude/company-import related assertions. - Allowed Codex fast mode for manually selected models. - Broadened the agent browser cleanup script to detect `chrome-headless-shell` as well as Chrome for Testing. - Preserved generated Storybook static output from the source branch. ## Verification - `pnpm exec vitest run src/__tests__/workspace-runtime-routes-authz.test.ts src/__tests__/claude-local-execute.test.ts --config vitest.config.ts` from `server/` passed: 2 files, 19 tests. - `pnpm exec vitest run src/server/codex-args.test.ts --config vitest.config.ts` from `packages/adapters/codex-local/` passed: 1 file, 3 tests. - `bash -n scripts/kill-agent-browsers.sh && scripts/kill-agent-browsers.sh --dry` passed; dry-run detected `chrome-headless-shell` processes without killing them. - `test -f ui/storybook-static/index.html && test -f ui/storybook-static/assets/forms-editors.stories-Dry7qwx2.js` passed. - `git diff --check public-gh/master..pap-2228-test-local-maintenance -- . ':(exclude)ui/storybook-static'` passed. - `pnpm exec vitest run cli/src/__tests__/company-import-export-e2e.test.ts --config cli/vitest.config.ts` did not complete in the isolated split worktree because `paperclipai run` exited during build prep with `TS2688: Cannot find type definition file for 'react'`; this appears to be caused by the worktree dependency symlink setup, not the code under test. - Confirmed this PR does not include `pnpm-lock.yaml`. ## Risks - Medium risk: the stable Vitest runner changes how route/authz tests are scheduled. - Generated `ui/storybook-static` files are large and contain minified third-party output; `git diff --check` reports whitespace inside those generated assets, so reviewers may choose to drop or regenerate that artifact before merge. - No database migrations. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip API, and GitHub CLI tool use in the local Paperclip workspace. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Note: screenshot checklist item is not applicable to source UI behavior; the included Storybook static output is generated artifact preservation from the source branch. --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-24 15:11:42 -05:00
Dotta	2d8f97feb0	feat(codex-local): add fast mode support	2026-04-11 08:21:55 -05:00
Dotta	19aaa54ae4	Merge branch 'master' into add-gpt-5-4-xhigh-effort	2026-03-31 06:19:26 -05:00
dotta	b3d61a7561	Clarify manual workspace runtime behavior Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-03-29 10:55:45 -05:00
Devin Foley	80766e589c	Clarify docs: skills go to the effective CODEX_HOME, not ~/.codex The previous documentation parenthetical "(defaulting to ~/.codex/skills/)" was misleading because Paperclip almost always sets CODEX_HOME to a per-company managed home. Update index.ts docs, skills.ts detail string, and execute.ts inline comment to make the runtime path unambiguous. Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-03-25 20:46:05 -07:00
Devin Foley	f6ac6e47c4	Clarify docs: skills go to $CODEX_HOME/skills/, defaulting to ~/.codex Addresses Greptile P2 review comment. Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-03-25 16:05:57 -07:00
Devin Foley	eeec52ad74	Fix Codex skill injection to use ~/.codex/skills/ instead of cwd The Codex adapter was the only one injecting skills into <cwd>/.agents/skills/, polluting the project's git repo. All other adapters (Gemini, Cursor, etc.) use a home-based directory. This changes the Codex adapter to inject into ~/.codex/skills/ (resolved via resolveSharedCodexHomeDir) to match the established pattern. Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-03-25 15:55:51 -07:00
dotta	19154d0fec	Clarify Codex instruction sources Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-03-23 16:57:33 -05:00
dotta	d53714a145	fix: manage codex home per company by default	2026-03-20 14:44:27 -05:00
dotta	b4e06c63e2	Refine codex runtime skills and portability assets	2026-03-19 07:15:36 -05:00
Dotta	3120c72372	Add worktree-aware workspace runtime support	2026-03-10 10:58:38 -05:00
Kevin Mok	314288ff82	Add gpt-5.4 fallback and xhigh effort options	2026-03-05 18:59:42 -06:00
Arthur R Longbottom	7086ad00ae	feat(codex): add gpt-5.4 to codex_local model list Add the newly released gpt-5.4 model to the codex_local adapter's available models list. にゃ～ 🐱	2026-03-05 16:13:29 -08:00
Dotta	0810101fda	Set codex-local creation defaults for model and sandbox bypass	2026-03-03 12:41:50 -06:00
Dotta	8351f7f1bd	Auto-create missing cwd for claude_local and codex_local	2026-03-03 12:29:32 -06:00
Dotta	83be94361c	feat(core): merge backup core changes with post-split functionality	2026-03-02 16:43:59 -06:00
Forgotten	614974f81c	feat(adapter): add instructionsFilePath to config types and UI builders Wire up instructionsFilePath in CreateConfigValues, adapter docs, and UI config builders for both Claude and Codex adapters. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-26 16:34:29 -06:00
Forgotten	9f049aa4f3	feat: resolve agent workspace from session/project/fallback Heartbeat service resolves cwd from task session, project primary workspace, or agent home directory (~/.paperclip/instances/.../workspaces/). Adapters receive workspace context and forward it as env vars and session params. cwd is now optional in adapter config. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-25 08:38:58 -06:00
Forgotten	2c3c2cf724	feat: adapter model discovery, reasoning effort, and improved codex formatting Add dynamic OpenAI model list fetching for codex adapter with caching, async listModels interface, reasoning effort support for both claude and codex adapters, optional timeouts (default to unlimited), wakeCommentId context propagation, and richer codex stdout event parsing/formatting. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-20 10:32:07 -06:00
Forgotten	6e335b3fd0	Improve codex-local adapter: skill injection, stdin piping, and error parsing Codex adapter now auto-injects Paperclip skills into ~/.codex/skills, pipes prompts via stdin instead of passing as CLI args, filters out noisy rollout stderr warnings, and extracts error/turn.failed messages from JSONL output. Also broadens stale session detection for rollout path errors. Claude-local adapter gets the same template vars (agentId, companyId, runId) that codex-local already had. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 14:39:37 -06:00
Forgotten	0d73e1b407	Add adapter config docs, approval context env vars, and paperclip-create-agent skill Export agentConfigurationDoc from all adapters for LLM reflection. Inject PAPERCLIP_APPROVAL_ID, PAPERCLIP_APPROVAL_STATUS, and PAPERCLIP_LINKED_ISSUE_IDS into local adapter environments. Update SKILL.md with approval handling procedure, issue-approval linking, and config revision documentation. Add new paperclip-create-agent skill for CEO-driven agent hiring workflows. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 13:02:53 -06:00
Forgotten	631c859b89	Move adapter implementations into shared workspace packages Extract claude-local and codex-local adapter code from cli/server/ui into packages/adapters/ and packages/adapter-utils/. CLI, server, and UI now import shared adapter logic instead of duplicating it. Removes ~1100 lines of duplicated code across packages. Register new packages in pnpm workspace. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 14:23:16 -06:00

24 commits