Commit graph

10 commits

Author SHA1 Message Date
Devin Foley
9578dc3da7
Wire per-adapter sandbox install commands through test and execute paths (#5280)
> **Stacked PR.** Sits on top of the e2b sandbox chain — #5278 (stdin
staging) and #5279 (honest-resolvability + login-profiles). The
cumulative diff against `master` includes both of those PRs' content;
the files touched by *this* PR's commit are the new
`maybeRunSandboxInstallCommand` helper in
`packages/adapter-utils/src/execution-target.ts` and the per-adapter
`index.ts`/`server/test.ts`/`server/execute.ts` wiring under
`packages/adapters/{claude,codex,cursor,gemini,opencode,pi}-local/`. The
honest resolvability check from #5279 is what gives this PR's install
command a meaningful "did it actually land on PATH" follow-up.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Sandbox execution targets are ephemeral — each fresh lease starts
from a template image that may or may not have the agent CLIs
preinstalled
> - When a CLI isn't preinstalled, the resolvability probe fails at
`command -v` and the hello probe never runs
> - There's no shared mechanism for "before you probe or provision,
install the CLI on this sandbox"
> - This pull request adds a `SANDBOX_INSTALL_COMMAND` constant per
adapter and a `maybeRunSandboxInstallCommand` helper that runs it via
the existing sandbox login shell, captures structured output, and never
throws (so the resolvability + hello probe still run after); each
adapter's `test()` and `execute()` share the constant so the two
callsites can't drift
> - The benefit is a fresh sandbox lease without a preinstalled CLI now
installs it once via `sh -lc` before the resolvability probe and before
managed-runtime provisioning, with a uniform
`<adapter>_install_command_run` check on the test report

## What Changed

- `packages/adapter-utils/src/execution-target.ts`: add
`AdapterSandboxInstallCommandCheck` and `maybeRunSandboxInstallCommand`
(runs the install via existing sandbox shell, captures
exit/stdout/stderr, returns a structured info/warn check, never throws)
- Add `SANDBOX_INSTALL_COMMAND` to each adapter's `index.ts` so `test()`
and `execute()` share a single source of truth
- Wire each of the 6 affected adapter `testEnvironment()`s to call
`maybeRunSandboxInstallCommand` before
`ensureAdapterExecutionTargetCommandResolvable`
- Pass `installCommand: SANDBOX_INSTALL_COMMAND` through
`prepareAdapterExecutionTargetRuntime` in each adapter's `execute()`
- Per-adapter install commands use npm globals where possible so
binaries land on a PATH segment the template already exports:
  - claude → `npm install -g @anthropic-ai/claude-code`
  - codex → `npm install -g @openai/codex`
  - cursor → `curl https://cursor.com/install -fsS | bash`
  - gemini → `npm install -g @google/gemini-cli`
  - opencode → `npm install -g opencode-ai`
  - pi → `npm install -g @mariozechner/pi-coding-agent`

SSH and local targets ignore `installCommand` (SSH runtime takes no such
param; local short-circuits before runtime prep), so this is a no-op for
non-sandbox environments.

## Verification

- `pnpm typecheck` clean
- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils`
and per-adapter projects pass
- Manual sandbox matrix (claude, codex, cursor, gemini, opencode, pi) —
each goes `install_command_run → resolvable → hello_probe_passed` (Codex
and Pi land on `hello_probe_auth_required`, which is the
configured-credentials problem, not an install issue)
- SSH no-regression: SSH Claude still passes; the helper short-circuits
on non-sandbox targets

## Risks

Medium — adds a network/CPU cost (npm install / curl) on every fresh
sandbox lease. Cost is bounded (one-time per lease, typically tens of
seconds for npm globals), and the helper never throws so a failing
install still lets the report run resolvability and hello probes. If a
sandbox image already has the CLI, the install is an idempotent
reinstall.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 08:29:28 -07:00
Devin Foley
f9cf1d2f6a
Add cursor sandbox support and fix SSH workspace sync (#4803)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, or on remote
hosts via SSH
> - The cursor adapter needs to resolve `cursor-agent` inside sandbox
environments where it's installed in `~/.local/bin`
> - But when using the default `agent` command on a sandbox target, the
adapter didn't know to look in `~/.local/bin/cursor-agent`, causing
"command not found" failures
> - Additionally, repeated SSH runs failed because `git checkout` during
workspace sync conflicted with leftover `.paperclip-runtime` files from
previous runs
> - This PR adds sandbox-aware command resolution for cursor and fixes
the SSH workspace sync conflict
> - The benefit is cursor works in E2B sandboxes out of the box, and
repeated SSH runs don't fail on workspace sync

## What Changed

- `cursor-local`: Added `prepareCursorSandboxCommand` — on sandbox
targets, reads the remote `$HOME`, prepends `~/.local/bin` to PATH, and
prefers `~/.local/bin/cursor-agent` when the default command is
requested; tightened the sandbox command probe to validate the binary
exists before launching; preserves explicit custom command overrides
- `adapter-utils/ssh.ts`: Added `--force` to git checkout in SSH
workspace sync to handle `.paperclip-runtime` untracked file conflicts
from previous runs

## Verification

- `pnpm test` — all existing and new tests pass, including cursor
sandbox probe, sandbox execution, and custom command override tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run a cursor-local task, verify
it resolves cursor-agent from the sandbox install path

## Risks

- Low-medium. The `--force` flag on git checkout could discard
uncommitted changes in the remote workspace, but the workspace is
managed by Paperclip and should not contain user edits.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:12:06 -07:00
Devin Foley
9b99d30330
Add dedicated environment settings page and test-in-environment (#4798)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run inside environments (local, SSH, E2B sandbox)
> - Operators need to configure and manage these environments
> - But environment settings were buried inside the general company
settings page, making them hard to find
> - Additionally, when testing an agent from the configuration form, the
test always ran locally regardless of which environment was selected
> - This PR moves environments into a dedicated top-level company
settings section and wires the "Test Environment" button to run inside
the selected environment
> - The benefit is operators can find and manage environments more
easily, and the test button now validates the actual environment the
agent will use

## What Changed

- Added a dedicated `CompanyEnvironments` settings page with its own
route and sidebar entry
- Updated `CompanySettingsSidebar` and `CompanySettingsNav` to include
the new environments section
- Modified the agent test route (`POST /agents/:id/test`) to accept an
optional `environmentId` parameter
- Updated all adapter `test.ts` handlers to resolve and use the
specified execution target environment
- Added `resolveTestExecutionTarget` to `execution-target.ts` for remote
environment test resolution with cwd fallback
- Moved the "Test Environment" button and its feedback display into the
`NewAgent` page footer for better UX flow

## Verification

- `pnpm test` — all existing and new tests pass
- `pnpm typecheck` — clean
- Manual: navigate to Company Settings, confirm "Environments" appears
as a top-level section
- Manual: configure an agent with a non-local environment, click "Test
Environment", confirm the test runs inside that environment

## Risks

- Low risk. UI-only routing change for the settings page. The
test-in-environment change adds an optional parameter with a local
fallback, so existing behavior is preserved when no environment is
specified.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 15:56:13 -07:00
kimnamu
07987d75ad feat(claude-local): add Bedrock model selection support
Previously, --model was completely skipped for Bedrock users, so the
model dropdown selection was silently ignored and the CLI always used
its default model.  Selecting Haiku would still run Opus.

- Add listClaudeModels() that returns Bedrock-native model IDs
  (us.anthropic.*) when Bedrock env is detected
- Register listModels on claude_local adapter so the UI dropdown
  shows Bedrock models instead of Anthropic API names
- Allow --model to pass through when the ID is a Bedrock-native
  identifier (us.anthropic.* or ARN)
- Add isBedrockModelId() helper shared by execute.ts and test.ts

Follows up on #2793 which added basic Bedrock auth detection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:16:57 +09:00
Lucas Kim
b6e40fec54
feat: add AWS Bedrock auth support on "claude-local" (#2793)
Closes #2412
Related: #2681, #498, #128

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The Claude Code adapter spawns the `claude` CLI to run agent tasks
> - The adapter detects auth mode by checking for `ANTHROPIC_API_KEY` —
recognizing only "api" and "subscription" modes
> - But users running Claude Code via **AWS Bedrock**
(`CLAUDE_CODE_USE_BEDROCK=1`) fall through to the "subscription" path
> - This causes a misleading "ANTHROPIC_API_KEY is not set;
subscription-based auth can be used" message in the environment check
> - Additionally, the hello probe passes `--model claude-opus-4-6` which
is **not a valid Bedrock model identifier**, causing `400 The provided
model identifier is invalid` and a probe failure
> - This pull request adds Bedrock auth detection, skips the
Anthropic-style `--model` flag for Bedrock, and returns the correct
billing type
> - The benefit is that Bedrock users get a working environment check
and correct cost tracking out of the box

---

## Pain Point

Many enterprise teams use **Claude Code through AWS Bedrock** rather
than Anthropic's direct API — for compliance, billing consolidation, or
VPC requirements. Currently, these users hit a **hard wall during
onboarding**:

| Problem | Impact |
|---|---|
|  Adapter environment check **always fails** | Users cannot create
their first agent — blocked at step 1 |
|  `--model claude-opus-4-6` is **invalid on Bedrock** (requires
`us.anthropic.*` format) | Hello probe exits with code 1: `400 The
provided model identifier is invalid` |
|  Auth shown as _"subscription-based"_ | Misleading — Bedrock is
neither subscription nor API-key auth |
|  Quota polling hits Anthropic OAuth endpoint | Fails silently for
Bedrock users who have no Anthropic subscription |

> **Bottom line**: Paperclip is completely unusable for Bedrock users
out of the box.

## Why Bedrock Matters

AWS Bedrock is a major deployment path for Claude in enterprise
environments:

- **Enterprise compliance** — data stays within the customer's AWS
account and VPC
- **Unified billing** — Claude usage appears on the existing AWS
invoice, no separate Anthropic billing
- **IAM integration** — access controlled through AWS IAM roles and
policies
- **Regional deployment** — models run in the customer's preferred AWS
region

Supporting Bedrock unlocks Paperclip for organizations that **cannot**
use Anthropic's direct API due to procurement, security, or regulatory
constraints.

---

## What Changed

- **`execute.ts`**: Added `isBedrockAuth()` helper that checks
`CLAUDE_CODE_USE_BEDROCK` and `ANTHROPIC_BEDROCK_BASE_URL` env vars.
`resolveClaudeBillingType()` now returns `"metered_api"` for Bedrock.
Biller set to `"aws_bedrock"`. Skips `--model` flag when Bedrock is
active (Anthropic-style model IDs are invalid on Bedrock; the CLI uses
its own configured model).
- **`test.ts`**: Environment check now detects Bedrock env vars (from
adapter config or server env) and shows `"AWS Bedrock auth detected.
Claude will use Bedrock for inference."` instead of the misleading
subscription message. Also skips `--model` in the hello probe for
Bedrock.
- **`quota.ts`**: Early return with `{ ok: true, windows: [] }` when
Bedrock is active — Bedrock usage is billed through AWS, not Anthropic's
subscription quota system.
- **`ui/src/lib/utils.ts`**: Added `"aws_bedrock"` → `"AWS Bedrock"` to
`providerDisplayName()` and `quotaSourceDisplayName()`.

## Verification

1. `pnpm -r typecheck` — all packages pass
2. Unit tests added and passing (6/6)
3. Environment check with Bedrock env vars:

| | Before | After |
|---|---|---|
| **Status** | 🔴 Failed |  Passed |
| **Auth message** | `ANTHROPIC_API_KEY is not set; subscription-based
auth can be used if Claude is logged in.` | `AWS Bedrock auth detected.
Claude will use Bedrock for inference.` |
| **Hello probe** | `ERROR · Claude hello probe failed.` (exit code 1 —
`--model claude-opus-4-6` is invalid on Bedrock) | `INFO · Claude hello
probe succeeded.` |
| **Screenshot** | <img height="500" alt="Screenshot 2026-04-05 at 8 25
27 AM"
src="https://github.com/user-attachments/assets/476431f6-6139-425a-8abc-97875d653657"
/> | <img height="500" alt="Screenshot 2026-04-05 at 8 31 58 AM"
src="https://github.com/user-attachments/assets/d388ce87-c5e6-4574-b8d2-fd8b86135299"
/> |

4. Existing API key / subscription paths are completely untouched unless
Bedrock env vars are present

## Risks

- **Low risk.** All changes are additive — existing "api" and
"subscription" code paths are only entered when Bedrock env vars are
absent.
- When Bedrock is active, the `--model` flag is skipped, so the
Paperclip model dropdown selection is ignored in favor of the Claude
CLI's own model config. This is intentional since Bedrock requires
different model identifiers.

## Model Used

- Claude Opus 4.6 (`claude-opus-4-6`, 1M context window) via Claude Code
CLI

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-04-06 13:15:18 -07:00
HenkDz
14d59da316 feat(adapters): external adapter plugin system with dynamic UI parser
- Plugin loader: install/reload/remove/reinstall external adapters
  from npm packages or local directories
- Plugin store persisted at ~/.paperclip/adapter-plugins.json
- Self-healing UI parser resolution with version caching
- UI: Adapter Manager page, dynamic loader, display registry
  with humanized names for unknown adapter types
- Dev watch: exclude adapter-plugins dir from tsx watcher
  to prevent mid-request server restarts during reinstall
- All consumer fallbacks use getAdapterLabel() for consistent display
- AdapterTypeDropdown uses controlled open state for proper close behavior
- Remove hermes-local from built-in UI (externalized to plugin)
- Add docs for external adapters and UI parser contract
2026-04-03 21:11:20 +01:00
Dotta
9513bef7e7 Improve onboarding adapter diagnostics with live hello probes 2026-03-03 12:51:42 -06:00
Dotta
8351f7f1bd Auto-create missing cwd for claude_local and codex_local 2026-03-03 12:29:32 -06:00
Dotta
f60c1001ec refactor: rename packages to @paperclipai and CLI binary to paperclipai
Rename all workspace packages from @paperclip/* to @paperclipai/* and
the CLI binary from `paperclip` to `paperclipai` in preparation for
npm publishing. Bump CLI version to 0.1.0 and add package metadata
(description, keywords, license, repository, files). Update all
imports, documentation, user-facing messages, and tests accordingly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 08:45:26 -06:00
Forgotten
f80a802592 Add adapter environment testing infrastructure
Introduce testEnvironment() on ServerAdapterModule with structured
pass/warn/fail diagnostics for all four adapter types (claude_local,
codex_local, process, http). Adds POST test-environment endpoint,
shared types/validators, adapter test implementations, and UI API
client. Includes asset type foundations used by related features.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 12:50:23 -06:00