paperclip/ui/src/components/IssueRunLedger.test.tsx


[codex] Add run liveness continuations (#4083)

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Heartbeat runs are the control-plane record of each agent execution window.
> - Long-running local agents can exhaust context or stop while still holding useful next-step state.
> - Operators need that stop reason, next action, and continuation path to be durable and visible.
> - This pull request adds run liveness metadata, continuation summaries, and UI surfaces for issue run ledgers.
> - The benefit is that interrupted or long-running work can resume with clearer context instead of losing the agent's last useful handoff.

## What Changed

- Added heartbeat-run liveness fields, continuation attempt tracking, and an idempotent `0058` migration.
- Added server services and tests for run liveness, continuation summaries, stop metadata, and activity backfill.
- Wired local and HTTP adapters to surface continuation/liveness context through shared adapter utilities.
- Added shared constants, validators, and heartbeat types for liveness continuation state.
- Added issue-detail UI surfaces for continuation handoffs and the run ledger, with component tests.
- Updated agent runtime docs, heartbeat protocol docs, prompt guidance, onboarding assets, and skills instructions to explain continuation behavior.
- Addressed Greptile feedback by scoping document evidence by run, excluding system continuation-summary documents from liveness evidence, importing shared liveness types, surfacing hidden ledger run counts, documenting bounded retry behavior, and moving run-ledger liveness backfill off the request path.

## Verification

- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts server/src/__tests__/run-continuations.test.ts server/src/__tests__/run-liveness.test.ts server/src/__tests__/activity-service.test.ts server/src/__tests__/documents-service.test.ts server/src/__tests__/issue-continuation-summary.test.ts server/src/services/heartbeat-stop-metadata.test.ts ui/src/components/IssueRunLedger.test.tsx ui/src/components/IssueContinuationHandoff.test.tsx ui/src/components/IssueDocumentsSection.test.tsx`
- `pnpm --filter @paperclipai/db build`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts ui/src/components/IssueRunLedger.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts server/src/__tests__/run-continuations.test.ts ui/src/components/IssueRunLedger.test.tsx`
- `pnpm exec vitest run server/src/__tests__/heartbeat-process-recovery.test.ts -t "treats a plan document update"`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts server/src/__tests__/heartbeat-process-recovery.test.ts -t "activity service|treats a plan document update"`
- Remote PR checks on head `e53b1a1d`: `verify`, `e2e`, `policy`, and Snyk all passed.
- Confirmed `public-gh/master` is an ancestor of this branch after fetching `public-gh master`.
- Confirmed `pnpm-lock.yaml` is not included in the branch diff.
- Confirmed migration `0058_wealthy_starbolt.sql` is ordered after `0057` and uses `IF NOT EXISTS` guards for repeat application.
- Greptile inline review threads are resolved.

## Risks

- Medium risk: this touches heartbeat execution, liveness recovery, activity rendering, issue routes, shared contracts, docs, and UI.
- Migration risk is mitigated by additive columns/indexes and idempotent guards.
- Run-ledger liveness backfill is now asynchronous, so the first ledger response can briefly show historical missing liveness until the background backfill completes.
- UI screenshot coverage is not included in this packaging pass; validation is currently through focused component tests.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected; check the roadmap first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.4, local tool-use coding agent with terminal, git, GitHub connector, GitHub CLI, and Paperclip API access.

## Checklist

- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge

Screenshot note: no before/after screenshots were captured in this PR packaging pass; the UI changes are covered by focused component tests listed above.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 06:01:49 -05:00
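
The "surfacing hidden ledger run counts" item in the commit message can be pictured with a small helper. This is only a sketch under stated assumptions: `clipRuns`, `hiddenRunsLabel`, the `LedgerRun` shape, and the limit of 8 are hypothetical. The limit is inferred from the component test in this file, which renders 10 runs and expects "2 older runs not shown"; the real component may sort and clip differently.

```typescript
// Hypothetical shape; the real RunForIssue type carries many more fields.
interface LedgerRun {
  runId: string;
  createdAt: string; // ISO-8601 UTC timestamp
}

interface ClippedRuns<T> {
  visible: T[];
  hiddenCount: number;
}

// Assumed limit, inferred from the test expectation (10 runs -> 2 hidden).
const ASSUMED_LEDGER_LIMIT = 8;

function clipRuns<T extends LedgerRun>(
  runs: T[],
  limit: number = ASSUMED_LEDGER_LIMIT,
): ClippedRuns<T> {
  // Newest first. Same-format ISO-8601 UTC strings sort chronologically as text.
  const sorted = [...runs].sort((a, b) => b.createdAt.localeCompare(a.createdAt));
  return {
    visible: sorted.slice(0, limit),
    hiddenCount: Math.max(0, sorted.length - limit),
  };
}

// Returns null when nothing is hidden so the caller can skip rendering the note.
function hiddenRunsLabel(hiddenCount: number): string | null {
  if (hiddenCount <= 0) return null;
  return `${hiddenCount} older ${hiddenCount === 1 ? "run" : "runs"} not shown`;
}
```

Keeping the clipping and the label as pure functions would let the count be asserted without rendering, although the test below checks it through the rendered text instead.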
// @vitest-environment jsdom
import { act } from "react";
import type { ComponentProps, ReactNode } from "react";
import { createRoot, type Root } from "react-dom/client";
import type { Issue, RunLivenessState } from "@paperclipai/shared";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import type { RunForIssue } from "../api/activity";
import { IssueRunLedgerContent } from "./IssueRunLedger";

// Stub the router Link with a plain anchor so the ledger renders without the app router.
vi.mock("@/lib/router", () => ({
  Link: ({ children, to, ...props }: { children: ReactNode; to: string } & ComponentProps<"a">) => (
    <a href={to} {...props}>{children}</a>
  ),
}));

// Opt in to React's act() environment so updates wrapped in act() do not warn.
// eslint-disable-next-line @typescript-eslint/no-explicit-any
(globalThis as any).IS_REACT_ACT_ENVIRONMENT = true;

let container: HTMLDivElement;
let root: Root;

beforeEach(() => {
  // Pin the clock so any time-dependent rendering is deterministic.
  vi.useFakeTimers();
  vi.setSystemTime(new Date("2026-04-18T20:00:00.000Z"));
  container = document.createElement("div");
  document.body.appendChild(container);
  root = createRoot(container);
});

afterEach(() => {
  act(() => root.unmount());
  container.remove();
  vi.useRealTimers();
});

// Render synchronously inside act() so assertions see the committed DOM.
function render(ui: ReactNode) {
  act(() => {
    root.render(ui);
  });
}

// Fixture: a finished, successful run with liveness data; tests override fields as needed.
function createRun(overrides: Partial<RunForIssue> = {}): RunForIssue {
  return {
    runId: "run-00000000",
    status: "succeeded",
    agentId: "agent-1",
    adapterType: "codex_local",
    startedAt: "2026-04-18T19:58:00.000Z",
    finishedAt: "2026-04-18T19:59:00.000Z",
    createdAt: "2026-04-18T19:58:00.000Z",
    invocationSource: "assignment",
    usageJson: null,
    resultJson: null,
    livenessState: "advanced",
    livenessReason: "Run produced concrete action evidence: 2 activity event(s)",
    continuationAttempt: 0,
    lastUsefulActionAt: "2026-04-18T19:59:00.000Z",
    nextAction: null,
    ...overrides,
  };
}

// Fixture: an unassigned "todo" issue; child-issue tests override id, identifier, title, and status.
function createIssue(overrides: Partial<Issue> = {}): Issue {
  return {
    id: "issue-1",
    companyId: "company-1",
    projectId: null,
    projectWorkspaceId: null,
    goalId: null,
    parentId: null,
    title: "Child issue",
    description: null,
    status: "todo",
    priority: "medium",
    assigneeAgentId: null,
    assigneeUserId: null,
    checkoutRunId: null,
    executionRunId: null,
    executionAgentNameKey: null,
    executionLockedAt: null,
    createdByAgentId: null,
    createdByUserId: null,
    issueNumber: null,
    identifier: "PAP-1",
    requestDepth: 0,
    billingCode: null,
    assigneeAdapterOverrides: null,
    executionWorkspaceId: null,
    executionWorkspacePreference: null,
    executionWorkspaceSettings: null,
    startedAt: null,
    completedAt: null,
    cancelledAt: null,
    hiddenAt: null,
    createdAt: new Date("2026-04-18T19:00:00.000Z"),
    updatedAt: new Date("2026-04-18T19:00:00.000Z"),
    ...overrides,
  };
}

// Render the ledger with defaults: no runs, an in-progress issue, no children, one known agent.
function renderLedger(props: Partial<ComponentProps<typeof IssueRunLedgerContent>> = {}) {
  render(
    <IssueRunLedgerContent
      runs={props.runs ?? []}
      liveRuns={props.liveRuns}
      activeRun={props.activeRun}
      issueStatus={props.issueStatus ?? "in_progress"}
      childIssues={props.childIssues ?? []}
      agentMap={props.agentMap ?? new Map([["agent-1", { name: "CodexCoder" }]])}
    />,
  );
}

describe("IssueRunLedger", () => {
  it("renders every liveness state with exhausted continuation context", () => {
    const states: RunLivenessState[] = [
      "advanced",
      "plan_only",
      "empty_response",
      "blocked",
      "failed",
      "completed",
      "needs_followup",
    ];
    renderLedger({
      runs: states.map((state, index) =>
        createRun({
          runId: `run-${index}0000000`,
          createdAt: `2026-04-18T19:5${index}:00.000Z`,
          livenessState: state,
          livenessReason: state === "needs_followup"
            ? "Run produced useful output but no concrete action evidence; continuation attempts exhausted"
            : `state ${state}`,
          continuationAttempt: state === "needs_followup" ? 3 : 0,
        }),
      ),
    });

    expect(container.textContent).toContain("Advanced");
    expect(container.textContent).toContain("Plan only");
    expect(container.textContent).toContain("Empty response");
    expect(container.textContent).toContain("Blocked");
    expect(container.textContent).toContain("Failed");
    expect(container.textContent).toContain("Completed");
    expect(container.textContent).toContain("Needs follow-up");
    expect(container.textContent).toContain("Exhausted");
    expect(container.textContent).toContain("Continuation attempt 3");
  });

  it("renders historical runs without liveness metadata as unavailable", () => {
    renderLedger({
      runs: [
        createRun({
          livenessState: null,
          livenessReason: null,
          continuationAttempt: undefined,
          lastUsefulActionAt: null,
          nextAction: null,
          resultJson: null,
        }),
      ],
    });

    expect(container.textContent).toContain("No liveness data");
    expect(container.textContent).toContain("Stop Unavailable");
    expect(container.textContent).toContain("Last useful action Unavailable");
  });

  it("shows live runs as pending final checks without missing-data language", () => {
    renderLedger({
      runs: [
        createRun({
          status: "running",
          finishedAt: null,
          livenessState: null,
          livenessReason: null,
          continuationAttempt: 0,
          lastUsefulActionAt: null,
          nextAction: null,
          resultJson: null,
        }),
      ],
    });

    expect(container.textContent).toContain("Running now by CodexCoder");
    expect(container.textContent).toContain("Checks after finish");
    expect(container.textContent).toContain("Last useful action No action recorded yet");
    expect(container.textContent).toContain("Stop Still running");
    expect(container.textContent).not.toContain("Liveness pending");
    expect(container.textContent).not.toContain("initial attempt");
  });

  it("shows timeout, cancel, and budget stop reasons without raw logs", () => {
    renderLedger({
      runs: [
        createRun({
          runId: "run-timeout",
          resultJson: { stopReason: "timeout", timeoutFired: true, effectiveTimeoutSec: 30 },
        }),
        createRun({
          runId: "run-cancel",
          resultJson: { stopReason: "cancelled" },
          createdAt: "2026-04-18T19:57:00.000Z",
        }),
        createRun({
          runId: "run-budget",
          resultJson: { stopReason: "budget_paused" },
          createdAt: "2026-04-18T19:56:00.000Z",
        }),
      ],
    });

    expect(container.textContent).toContain("timeout (30s timeout)");
    expect(container.textContent).toContain("cancelled");
    expect(container.textContent).toContain("budget paused");
  });

  it("surfaces active and completed child issue summaries", () => {
    renderLedger({
      childIssues: [
        createIssue({ id: "child-1", identifier: "PAP-2", title: "Implement worker handoff", status: "in_progress" }),
        createIssue({ id: "child-2", identifier: "PAP-3", title: "Verify final report", status: "done" }),
        createIssue({ id: "child-3", identifier: "PAP-4", title: "Cancelled experiment", status: "cancelled" }),
      ],
    });

    expect(container.textContent).toContain("Child work");
    expect(container.textContent).toContain("1 active, 1 done, 1 cancelled");
    expect(container.textContent).toContain("PAP-2");
    expect(container.textContent).toContain("Implement worker handoff");

    renderLedger({
      childIssues: [
        createIssue({ id: "child-2", identifier: "PAP-3", title: "Verify final report", status: "done" }),
        createIssue({ id: "child-3", identifier: "PAP-4", title: "Cancelled experiment", status: "cancelled" }),
      ],
    });

    expect(container.textContent).toContain("all 2 terminal (1 done, 1 cancelled)");
  });

  it("uses wrapping-friendly markup for long next action text", () => {
    renderLedger({
      runs: [
        createRun({
          nextAction: "Continue investigating this intentionally-long-next-action-token-that-needs-to-wrap-cleanly-on-mobile-and-desktop-without-overlapping-controls.",
        }),
      ],
    });

    const nextAction = [...container.querySelectorAll("span")]
      .find((node) => node.textContent?.includes("intentionally-long-next-action-token"));
    expect(nextAction?.className).toContain("break-words");
    expect(container.textContent).toContain("Next action:");
  });

  it("shows when older runs are clipped from the ledger", () => {
    renderLedger({
      runs: Array.from({ length: 10 }, (_, index) =>
        createRun({
          runId: `run-${index.toString().padStart(8, "0")}`,
          createdAt: `2026-04-18T19:${String(index).padStart(2, "0")}:00.000Z`,
        }),
      ),
    });

    expect(container.textContent).toContain("2 older runs not shown");
  });
});
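
The stop-reason and child-work expectations in the tests above can be read as two small pure formatting functions. The sketch below is not the component's implementation: `formatStopReason`, `summarizeChildIssues`, the `StopInfo` shape, the "Unavailable" fallback, and the choice of "done" and "cancelled" as the only terminal statuses are assumptions, chosen only to reproduce the strings the tests assert on.

```typescript
// Hypothetical subset of a run's resultJson, limited to the fields the tests use.
interface StopInfo {
  stopReason?: string;
  timeoutFired?: boolean;
  effectiveTimeoutSec?: number;
}

// Turn a machine stop reason into display text, e.g. "budget_paused" -> "budget paused".
// Assumption: a missing stop reason is shown as "Unavailable".
function formatStopReason(result: StopInfo | null): string {
  if (!result || !result.stopReason) return "Unavailable";
  const label = result.stopReason.replace(/_/g, " ");
  if (result.timeoutFired && typeof result.effectiveTimeoutSec === "number") {
    return `${label} (${result.effectiveTimeoutSec}s timeout)`;
  }
  return label;
}

// Summarize child issue statuses.
// Assumption: only "done" and "cancelled" are terminal; everything else is "active".
function summarizeChildIssues(statuses: string[]): string {
  if (statuses.length === 0) return "";
  const done = statuses.filter((status) => status === "done").length;
  const cancelled = statuses.filter((status) => status === "cancelled").length;
  const active = statuses.length - done - cancelled;

  const terminalParts: string[] = [];
  if (done > 0) terminalParts.push(`${done} done`);
  if (cancelled > 0) terminalParts.push(`${cancelled} cancelled`);

  if (active === 0) {
    return `all ${statuses.length} terminal (${terminalParts.join(", ")})`;
  }
  return [`${active} active`, ...terminalParts].join(", ");
}
```

For example, under these assumptions `summarizeChildIssues(["in_progress", "done", "cancelled"])` yields "1 active, 1 done, 1 cancelled", and dropping the in-progress child yields "all 2 terminal (1 done, 1 cancelled)", matching the two renders in the child-issue test.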