[codex] Add source-scoped recovery actions (#5599)

## Thinking Path

> - Paperclip is a control plane for autonomous AI companies, where work
must end with a clear disposition rather than ambiguous agent liveness.
> - Recovery currently detects stalled or missing-next-step issues, but
source issue recovery can become split across child recovery issues,
blockers, and comments.
> - That makes it harder for operators and agents to see who owns
recovery and what exact action is needed on the original issue.
> - Source-scoped recovery actions give the original issue a first-class
active recovery state with owner, evidence, wake policy, and resolution
outcome.
> - This pull request adds the recovery-action data model, backend
reconciliation and resolution APIs, and board UI indicators/actions.
> - The benefit is clearer stalled-work recovery without losing source
issue context or relying on comments as the liveness path.

## What Changed

- Added the `issue_recovery_actions` schema, shared
types/constants/validators, and an idempotent
`0084_issue_recovery_actions` migration ordered after current `master`
migrations.
- Updated stranded/missing-disposition recovery to create source-scoped
recovery actions, wake the recovery owner on the source issue, and avoid
locking the source issue for recovery-action wakes.
- Added API support for reading active recovery actions on issue
detail/list surfaces and resolving them with restored, blocked,
cancelled, or false-positive outcomes.
- Require blocked recovery resolutions to have an unresolved first-class
blocker, and removed the UI shortcut that could mark recovery blocked
without a blocker selection path.
- Surfaced recovery indicators/actions in the issue UI, blocker notices,
active run panels, issue rows, and Storybook coverage.
- Updated docs and focused tests for recovery semantics, ownership,
races, stale comments, and UI behavior.

## Verification

- `pnpm exec vitest run
server/src/__tests__/issue-recovery-actions.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
ui/src/components/IssueRecoveryActionCard.test.tsx
ui/src/components/IssueBlockedNotice.test.tsx ui/src/api/issues.test.ts`
— 5 files, 72 tests passed.
- `pnpm --filter @paperclipai/shared typecheck` — passed.
- `pnpm --filter @paperclipai/db typecheck` — passed, including
migration numbering check.
- `pnpm --filter @paperclipai/server typecheck` — passed.
- `pnpm --filter @paperclipai/ui typecheck` — passed.
- Follow-up verification after blocker-resolution guard: `pnpm exec
vitest run server/src/__tests__/issue-recovery-actions.test.ts
ui/src/components/IssueRecoveryActionCard.test.tsx
ui/src/api/issues.test.ts` — 3 files, 27 tests passed.
- Follow-up `pnpm --filter @paperclipai/server typecheck` — passed.
- Follow-up `pnpm --filter @paperclipai/ui typecheck` — passed.
- UI states are available in
`ui/storybook/stories/source-issue-recovery.stories.tsx`; screenshot
capture helper is `scripts/screenshot-recovery-card.cjs`.

## Risks

- Medium: recovery behavior changes from child recovery issue ownership
toward source-scoped actions, so operators may see stalled-work state in
new places.
- Migration risk is mitigated by using the next migration slot after
`master` and making the table/constraints/index creation idempotent for
anyone who previously applied the old branch-local
`0082_dizzy_master_mold` migration.
- Existing child recovery issue paths are still guarded for
already-created recovery issues, but new source-scoped flows should be
watched in CI and Greptile review.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use enabled for shell, Git,
GitHub, and local test execution. Context window not exposed by the
runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
This commit is contained in:
Dotta 2026-05-12 09:37:15 -05:00 committed by GitHub
parent c445e59256
commit 0808b388ee
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
57 changed files with 3947 additions and 224 deletions

View file

@ -37,7 +37,7 @@ These decisions close open questions from `SPEC.md` for V1.
| Visibility | Full visibility to board and all agents in same company |
| Communication | Tasks + comments only (no separate chat system) |
| Task ownership | Single assignee; atomic checkout required for `in_progress` transition |
| Recovery | Liveness/watchdog recovery preserves explicit ownership: retry lost execution continuity where safe, otherwise create visible recovery issues or require human escalation (see `doc/execution-semantics.md`) |
| Recovery | Liveness/watchdog recovery preserves explicit ownership: retry lost execution continuity where safe, otherwise open visible source-scoped recovery actions by default, use issue-backed recovery only for independent repair work, or require human escalation (see `doc/execution-semantics.md`) |
| Agent adapters | Built-in `process`, `http`, local CLI/session adapters, and OpenClaw gateway support; external adapters can also be loaded through the adapter plugin flow |
| Plugin framework | Local/self-hosted early plugin runtime is in scope; cloud marketplace and packaged public distribution remain out of scope |
| Auth | Mode-dependent human auth (`local_trusted` implicit board in current code; authenticated mode uses sessions), API keys for agents |
@ -434,9 +434,10 @@ Side effects:
V1 non-terminal liveness rule:
- agent-owned `todo`, `in_progress`, `in_review`, and `blocked` issues must have a live execution path, an explicit waiting path, or an explicit recovery path
- `in_review` is healthy only when a typed execution participant, pending issue-thread interaction or approval, user owner, active run, queued wake, or explicit recovery issue owns the next action
- `in_review` is healthy only when a typed execution participant, pending issue-thread interaction or approval, user owner, active run, queued wake, or explicit recovery action owns the next action
- a blocked chain is covered only when each unresolved leaf issue is live or explicitly waiting
- when Paperclip cannot safely infer the next action, it surfaces the problem through visible blocked/recovery work instead of silently completing or reassigning work
- explicit recovery actions are the liveness primitive; source-scoped actions are the default form, issue-backed recovery is a fallback for independent repair work or safety boundaries, and comments alone are evidence rather than a healthy liveness path
Detailed ownership, execution, blocker, active-run watchdog, crash-recovery, and non-terminal liveness semantics are documented in `doc/execution-semantics.md`.