mirror of
https://github.com/alkimake/paperclip.git
synced 2026-06-16 02:40:39 +09:00
---
name: diagnose-why-work-stopped
description: >
  How to handle "why did this work stop / why is this looping?" assignments.
  Forensics first on the named tree, surface the exact stop-point, frame the
  fix as a general product rule that respects three invariants (productive
  work continues, only real blockers stop work, no infinite loops), and
  deliver a plan, with no code changes, gated by board/CTO approval before
  child issues are created. Use whenever the issue title or body asks for
  forensics on a stalled, looping, or "went too deep" tree.
---

# Diagnose Why Work Stopped

A repeatable procedure for the recurring class of issues where the user (or a manager) points at a stalled / looping / over-recovered issue tree and asks "why did this stop / why is this looping / how do we make sure this doesn't happen again?"

This skill is **diagnostic + product-design**, not engineering. The output is a written root cause and an approved plan. No code changes leave this skill.

Canonical execution model: read `doc/execution-semantics.md` before diagnosing or proposing a new liveness/recovery rule. Use that document as the source of truth for status, action-path, post-run disposition, bounded continuation, productivity review, pause-hold, watchdog, and explicit recovery semantics. If the investigation finds a true product-rule gap, the plan should say whether `doc/execution-semantics.md` needs a matching update.

## When to use

Trigger on an assignment whose title or body matches any of:

- "why did this work stop", "why did this stall", "why did this just stop"
- "infinite loop", "looping", "spinning", "going too deep", "recovery went too deep"
- "liveness - what happened here", "this tree stopped working", "stuck"
- "approach it from a product perspective", "general product principle / rule"
- An attached link to a specific stalled / looping / over-recovered issue tree

Also use when the user asks for forensics, root cause, or a write-up *before* any product change.
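The trigger list above can be approximated with a plain phrase match. The following is a minimal sketch, not part of Paperclip: `TRIGGER_PHRASES`, `matched_triggers`, and `should_use_skill` are hypothetical names, the phrases are copied from the list, and the attached-link trigger is not modeled.

```python
# Hypothetical helper: phrases are taken from the trigger list above.
TRIGGER_PHRASES = (
    "why did this work stop",
    "why did this stall",
    "why did this just stop",
    "infinite loop",
    "looping",
    "spinning",
    "going too deep",
    "recovery went too deep",
    "this tree stopped working",
    "stuck",
    "approach it from a product perspective",
)


def matched_triggers(title: str, body: str) -> list[str]:
    """Return every trigger phrase found in the title or body (case-insensitive)."""
    text = f"{title}\n{body}".lower()
    return [phrase for phrase in TRIGGER_PHRASES if phrase in text]


def should_use_skill(title: str, body: str) -> bool:
    """True when at least one trigger phrase appears in the assignment."""
    return bool(matched_triggers(title, body))
```

Substring matching is deliberately loose; a short phrase such as "stuck" can match unrelated text, so treat a hit as a prompt to read the assignment, not as a final routing decision.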

## When NOT to use

- The assignment asks you to ship a code change directly. Use normal engineering flow.
- The assignment is a normal bug report against a specific feature. Use normal investigation.
- You are the original implementer being asked to fix your own bug. Use normal debugging.

## Three invariants you must preserve

Every diagnosis and every proposed rule must hold these three invariants together. The user has restated them on at least four issues; treat them as load-bearing:

1. **Productive work continues.** Agents that have a clear next action must keep working without needing the user to wake them. ([PAP-2674](/PAP/issues/PAP-2674), [PAP-2708](/PAP/issues/PAP-2708))
2. **Only real blockers stop work.** Stops happen when something genuinely cannot proceed (missing approval, missing dependency, human owner). Pseudo-stops (`in_review` with no action path, cancelled leaves, malformed metadata) must be detected and routed, not left silent. ([PAP-2335](/PAP/issues/PAP-2335), [PAP-2674](/PAP/issues/PAP-2674))
3. **No infinite loops.** Stranded-work recovery and continuation loops must be bounded and distinguishable from genuinely productive continuation. ([PAP-2602](/PAP/issues/PAP-2602), [PAP-2486](/PAP/issues/PAP-2486))

If a proposed rule violates any of the three, drop it or rework it. State explicitly in the plan how each invariant is held.
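One way to make "state explicitly how each invariant is held" hard to skip is to record the explanation per invariant and flag any that is blank. This is an illustrative sketch only; `ProposedRule` and `unaddressed_invariants` are hypothetical names, not Paperclip types.

```python
from dataclasses import dataclass, field

# The three invariants from the section above, used as lookup keys.
INVARIANTS = (
    "productive work continues",
    "only real blockers stop work",
    "no infinite loops",
)


@dataclass
class ProposedRule:
    contract: str
    # Maps an invariant name to the sentence explaining how the rule holds it.
    how_held: dict[str, str] = field(default_factory=dict)


def unaddressed_invariants(rule: ProposedRule) -> list[str]:
    """Return the invariants the rule has no written explanation for."""
    return [name for name in INVARIANTS if not rule.how_held.get(name, "").strip()]
```

A plan is ready for review only when `unaddressed_invariants` returns an empty list; the check says nothing about whether the explanations are convincing.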

## Procedure

### 0. Read the current execution contract

Before walking the tree, read `doc/execution-semantics.md` and keep its terms intact:

- live path / waiting path / recovery path
- post-run disposition: terminal, explicitly live, explicitly waiting, invalid
- bounded `run_liveness_continuation`
- productivity review vs liveness recovery
- active subtree pause holds
- silent active-run watchdog

Do not invent a new rule until you can state how it differs from the current execution semantics document.

### 1. Forensics on the named tree, before anything else

Do this in the same heartbeat. Do not propose a rule until you have a concrete stop point.

- Open the linked issue (and its blocker chain, parents, recovery siblings, recent runs).
- Walk the tree node-by-node and find the exact issue + state combination that stops the world. Common shapes seen in the company so far:
  - `in_review` with no typed execution participant, no active run, no pending interaction, no recovery issue ([PAP-2335](/PAP/issues/PAP-2335), [PAP-2674](/PAP/issues/PAP-2674)).
  - `in_progress` after a successful run with no future action path queued ([PAP-2674](/PAP/issues/PAP-2674)).
  - Blocker chain whose leaf is `cancelled` / malformed / cross-company-inaccessible ([PAP-2602](/PAP/issues/PAP-2602)).
  - `issue.continuation_recovery` waking the same issue >N times after successful runs ([PAP-2602](/PAP/issues/PAP-2602)).
  - Stranded-work recovery treating its own recovery issues as more recoverable source work ([PAP-2486](/PAP/issues/PAP-2486)).
- Quote the evidence: run ids, comment timestamps, status transitions. "Inferred" is acceptable only when an API boundary blocks direct evidence; say so explicitly and mark the claim provisional ([PAP-2631](/PAP/issues/PAP-2631)).

Respect the API boundary. If the linked issue is in another company and your agent token returns 403, do not bypass scoping. Either request a board-approved diagnostic path or proceed from inferred PAP-side evidence and label it.
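The tree walk can be sketched as a depth-first pass that tests each node against the first three stop shapes listed above. Everything here is an assumption for illustration: the `Issue` record and its boolean fields are a hypothetical simplification, not the Paperclip data model, and the malformed, cross-company, and repeated-wake shapes are not modeled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Issue:
    # Hypothetical, simplified view of an issue for forensics.
    id: str
    status: str
    has_participant: bool = False
    has_active_run: bool = False
    has_pending_interaction: bool = False
    has_recovery_issue: bool = False
    has_queued_action: bool = False
    blocked_by: list["Issue"] = field(default_factory=list)


def stop_shape(issue: Issue) -> Optional[str]:
    """Name the stop shape this issue matches, or None if it matches none."""
    covered = (
        issue.has_active_run
        or issue.has_pending_interaction
        or issue.has_recovery_issue
    )
    if issue.status == "in_review" and not issue.has_participant and not covered:
        return "in_review with no action path"
    if (
        issue.status == "in_progress"
        and not issue.has_active_run
        and not issue.has_queued_action
    ):
        return "in_progress with no future action path"
    for blocker in issue.blocked_by:
        if blocker.status == "cancelled":
            return f"blocked by cancelled leaf {blocker.id}"
    return None


def find_stop_points(root: Issue) -> list[tuple[str, str]]:
    """Walk the blocker chain depth-first and collect (issue id, shape) pairs."""
    found: list[tuple[str, str]] = []
    seen: set[str] = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node.id in seen:
            continue  # guard against cyclic blocker chains
        seen.add(node.id)
        shape = stop_shape(node)
        if shape is not None:
            found.append((node.id, shape))
        stack.extend(node.blocked_by)
    return found
```

A match only tells you where to look; the write-up still needs the quoted run ids, timestamps, and status transitions behind each hit.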

### 2. Survey recent related work

Before proposing a new product rule, read what already shipped this week in the same area. The user has explicitly called this out ([PAP-2602](/PAP/issues/PAP-2602)): "review our recent work on liveness that we shipped in the last couple of days." A new rule that contradicts code merged 48 hours ago is rework, not improvement.

Quick survey:
- Recent merged PRs in the affected area.
- Recent done issues whose title mentions liveness, recovery, productivity, continuation, or the affected subsystem.
- Any active plan documents on parent issues. The fix may belong as a revision to an existing plan, not as a new top-level proposal.

State in the forensics: "I reviewed X, Y, Z. The new gap is …"

### 3. Classify each non-progressing issue in the tree

For every issue in the affected tree that is not `done` / `cancelled` / actively running, decide:

- **Truly needs human or board intervention:** name the owner and the action.
- **Agent-actionable but not currently routed:** name the rule that would have routed it, and the agent that should have been woken.
- **Already covered:** point at the active run, queued wake, recovery issue, or pending interaction.

This is the table the user has asked for repeatedly ([PAP-2335](/PAP/issues/PAP-2335)). Without it the plan is abstract.
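The classification can be captured as one row per issue and rendered as a Markdown table for the plan. This is a sketch under assumptions: `Finding`, `classify`, and `to_table` are hypothetical names, the precedence (covered, then human, then agent) is one reasonable ordering rather than a stated rule, and the issue ids in the usage below are placeholders.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    # Hypothetical record filled in by hand during forensics.
    issue_id: str
    status: str
    covered_by: str = ""    # active run, queued wake, recovery issue, or pending interaction
    human_owner: str = ""   # set only when a human or the board must act
    human_action: str = ""
    missing_rule: str = ""  # rule that would have routed the issue
    agent: str = ""         # agent that should have been woken


def classify(f: Finding) -> tuple[str, str]:
    """Return (class label, detail) for one non-progressing issue."""
    if f.covered_by:
        return "already-covered", f.covered_by
    if f.human_owner:
        return "human-needed", f"{f.human_owner}: {f.human_action}"
    return "agent-actionable", f"{f.missing_rule} -> wake {f.agent}"


def to_table(findings: list[Finding]) -> str:
    """Render the classification as a Markdown table, skipping terminal issues."""
    rows = ["| Issue | Status | Class | Detail |", "| --- | --- | --- | --- |"]
    for f in findings:
        if f.status in ("done", "cancelled"):
            continue
        label, detail = classify(f)
        rows.append(f"| {f.issue_id} | {f.status} | {label} | {detail} |")
    return "\n".join(rows)
```

For example, `to_table([Finding("EX-1", "in_review", human_owner="board", human_action="approve plan")])` yields a header plus one `human-needed` row that names both the owner and the action.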

### 4. Frame as a general product rule

The user does not want a one-off patch on the named tree. They want the rule. Three checks:

- The rule is **stated as a contract**, not as an if/else patch. Example contract: "every agent-owned non-terminal issue must finish each heartbeat with a terminal state, an explicit waiting path, or an explicit live path" ([PAP-2674](/PAP/issues/PAP-2674)).
- The rule is reconciled against `doc/execution-semantics.md`. Prefer citing and applying the existing contract; propose a document change only when the current doc is incomplete or contradicted by accepted/implemented behavior.
- The rule **explicitly preserves the three invariants** above. Show the work.

If the rule would have blocked a recent productive run from succeeding, drop or narrow it.

### 5. Plan, do not code

Write the plan into the issue's `plan` document. Cover:

- Forensics summary (root cause + evidence).
- The general product rule, stated as a contract.
- Whether the existing `doc/execution-semantics.md` contract already covers the case, or what exact documentation update is needed.
- Phased subtasks: typically `Phase 0` resolves the named live tree (carefully, not destructively), `Phase 1` codifies the contract in docs, then implementation phases for detection, recovery, UI surfacing, security review, QA, and CTO review.
- Explicit assignees per phase; favor team specialty (CodexCoder for server, ClaudeCoder for FE, UXDesigner for visible state, SecurityEngineer for ownership/permissions, QA for validation).
- Blocking dependencies wired with `blockedByIssueIds`, parallel branches identified.

Do not create the child issues yet. Do not push code.
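To identify parallel branches before writing the plan, the `blockedByIssueIds` wiring can be treated as a dependency graph and grouped into waves. The sketch below uses Python's standard `graphlib`; the phase names are hypothetical placeholders, and `parallel_waves` is an illustrative helper, not a Paperclip API.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each phase maps to the phases it is blocked by,
# mirroring what would later go into blockedByIssueIds.
PLAN = {
    "phase-0-live-tree": [],
    "phase-1-docs": [],
    "phase-2-detection": ["phase-1-docs"],
    "phase-3-ui": ["phase-1-docs"],
    "phase-4-qa": ["phase-0-live-tree", "phase-2-detection", "phase-3-ui"],
}


def parallel_waves(blocked_by: dict[str, list[str]]) -> list[list[str]]:
    """Group phases into waves; phases in the same wave can run in parallel."""
    sorter = TopologicalSorter(blocked_by)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies loop
    waves: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

With the placeholder plan, `parallel_waves(PLAN)` groups Phase 0 and Phase 1 together, then detection and UI, then QA last. A `CycleError` at this stage means the plan itself would deadlock, which is worth catching before any child issue exists.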

### 6. Request approval, then decompose

- Open a `request_confirmation` interaction targeting the latest plan revision. Use the idempotency key `confirmation:{issueId}:plan:{revisionId}`.
- Wait for board/CTO acceptance. If the user posts a new comment that supersedes the plan, the prior confirmation is invalidated; open a fresh confirmation tied to the new revision ([PAP-2602](/PAP/issues/PAP-2602) cycled three revisions; that is fine).
- Only after acceptance: create the phased child issues with the right assignees and dependencies, then block this parent on the final QA / CTO review issue so the parent only wakes when the chain finishes.
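The idempotency key format above ties each confirmation to one plan revision. The sketch below shows how that key makes repeated requests harmless and how a newer revision supersedes an older one. `ConfirmationTracker` is a hypothetical in-memory stand-in, not the Paperclip interaction API, and it simplifies invalidation by triggering it when the next revision's confirmation is requested.

```python
def confirmation_key(issue_id: str, revision_id: str) -> str:
    """Build the idempotency key in the format given in the step above."""
    return f"confirmation:{issue_id}:plan:{revision_id}"


class ConfirmationTracker:
    """Hypothetical in-memory record of confirmation requests and their state."""

    def __init__(self) -> None:
        self._state: dict[str, str] = {}

    def request(self, issue_id: str, revision_id: str) -> str:
        key = confirmation_key(issue_id, revision_id)
        prefix = f"confirmation:{issue_id}:plan:"
        # A confirmation for an older revision of the same issue is superseded.
        for existing, state in self._state.items():
            if existing.startswith(prefix) and existing != key and state == "open":
                self._state[existing] = "invalidated"
        # Requesting the same revision twice leaves the existing entry untouched.
        self._state.setdefault(key, "open")
        return key

    def state(self, key: str) -> str:
        return self._state[key]

    def count(self) -> int:
        return len(self._state)
```

Because the revision id is part of the key, retrying a request in a later heartbeat cannot create a duplicate confirmation for the same plan revision.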

### 7. Phase 0 hygiene on the named tree

Phase 0 cleans up the live tree without papering over evidence:

- Move stalled `in_review` leaves with no participant to `todo` with a precise next action and named owner ([PAP-2335](/PAP/issues/PAP-2335)).
- Detach cancelled/dead blockers from chains they were holding hostage; do not silently mark issues `done` to clear backlog.
- Leave a comment on the original named issue summarizing what changed and why; never hide the recovery chain history.

### 8. Final close-out

When the phase chain is complete, post a board-level summary comment on the parent issue: what changed, what the new contract is, what the rollout step is (e.g. "restart the control-plane to pick up the new response shape"), and the live state of the originally-named tree. Then close the parent.

## Pitfalls

- **Coding before approval.** The user has said "make a plan first" on every recent diagnostic issue. Producing code in the forensic phase wastes the round-trip.
- **Enforcing one invariant at the cost of another.** Bound continuation too tightly and productive work stalls; loosen recovery and infinite loops return. Always check all three.
- **Skipping the recent-work survey.** Proposing a contract that contradicts what shipped 24 hours ago is the easiest way to get the plan rejected.
- **Letting `in_review` mean done.** A leaf assigned to another agent with no participant or active run is not progress; treat it as a stop.
- **Bypassing company scoping.** Cross-company forensics needs a board-approved diagnostic path, not a database read.
- **Recursive recovery.** Stranded-work recovery that recovers its own recovery issues is the canonical infinite loop ([PAP-2486](/PAP/issues/PAP-2486)). Detect it and refuse to deepen.
- **Hiding the chain.** Don't silently delete or hide the symptomatic recovery issues; the operator needs the audit trail.
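The "detect it and refuse to deepen" guard for recursive recovery can be sketched as a depth check on the recovery chain. The `recovery_of` mapping (issue id to the id of the issue it recovers, or `None` for ordinary work) is a hypothetical representation chosen for illustration; how Paperclip actually links recovery issues to their source is not stated here.

```python
from typing import Optional


def recovery_depth(issue_id: str, recovery_of: dict[str, Optional[str]]) -> int:
    """Count how many recovery hops separate this issue from ordinary source work."""
    depth = 0
    seen: set[str] = set()
    current = issue_id
    while recovery_of.get(current) is not None:
        if current in seen:
            raise ValueError(f"recovery chain loops back to {current}")
        seen.add(current)
        current = recovery_of[current]
        depth += 1
    return depth


def may_open_recovery(source_id: str, recovery_of: dict[str, Optional[str]]) -> bool:
    """Allow recovery only for ordinary work, never for another recovery issue."""
    return recovery_depth(source_id, recovery_of) == 0
```

Refusing at depth zero is the strictest possible bound. A product rule might permit a small fixed depth instead, but any such bound should be checked against all three invariants before it goes into a plan.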

## Verification checklist (before posting the plan)

- [ ] The exact stop point in the named tree is identified with run ids / comment ids.
- [ ] Recent shipped work in the same area was surveyed and is referenced.
- [ ] Every non-progressing issue is classified human-needed / agent-actionable / already-covered.
- [ ] The proposed rule is stated as a contract, not a patch.
- [ ] All three invariants are explicitly preserved.
- [ ] No code change has landed in this heartbeat.
- [ ] A `request_confirmation` against the latest plan revision is open.
- [ ] Phase 0 of the plan addresses the live named tree without destroying evidence.
- [ ] Implementation phases name specialty-appropriate assignees and `blockedByIssueIds` dependencies.