[codex] Add run liveness continuations (#4083)

## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Heartbeat runs are the control-plane record of each agent execution window. > - Long-running local agents can exhaust context or stop while still holding useful next-step state. > - Operators need that stop reason, next action, and continuation path to be durable and visible. > - This pull request adds run liveness metadata, continuation summaries, and UI surfaces for issue run ledgers. > - The benefit is that interrupted or long-running work can resume with clearer context instead of losing the agent's last useful handoff. ## What Changed - Added heartbeat-run liveness fields, continuation attempt tracking, and an idempotent `0058` migration. - Added server services and tests for run liveness, continuation summaries, stop metadata, and activity backfill. - Wired local and HTTP adapters to surface continuation/liveness context through shared adapter utilities. - Added shared constants, validators, and heartbeat types for liveness continuation state. - Added issue-detail UI surfaces for continuation handoffs and the run ledger, with component tests. - Updated agent runtime docs, heartbeat protocol docs, prompt guidance, onboarding assets, and skills instructions to explain continuation behavior. - Addressed Greptile feedback by scoping document evidence by run, excluding system continuation-summary documents from liveness evidence, importing shared liveness types, surfacing hidden ledger run counts, documenting bounded retry behavior, and moving run-ledger liveness backfill off the request path. ## Verification - `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts server/src/__tests__/run-continuations.test.ts server/src/__tests__/run-liveness.test.ts server/src/__tests__/activity-service.test.ts server/src/__tests__/documents-service.test.ts server/src/__tests__/issue-continuation-summary.test.ts server/src/services/heartbeat-stop-metadata.test.ts ui/src/components/IssueRunLedger.test.tsx ui/src/components/IssueContinuationHandoff.test.tsx ui/src/components/IssueDocumentsSection.test.tsx` - `pnpm --filter @paperclipai/db build` - `pnpm exec vitest run server/src/__tests__/activity-service.test.ts ui/src/components/IssueRunLedger.test.tsx` - `pnpm --filter @paperclipai/ui typecheck` - `pnpm --filter @paperclipai/server typecheck` - `pnpm exec vitest run server/src/__tests__/activity-service.test.ts server/src/__tests__/run-continuations.test.ts ui/src/components/IssueRunLedger.test.tsx` - `pnpm exec vitest run server/src/__tests__/heartbeat-process-recovery.test.ts -t "treats a plan document update"` - `pnpm exec vitest run server/src/__tests__/activity-service.test.ts server/src/__tests__/heartbeat-process-recovery.test.ts -t "activity service|treats a plan document update"` - Remote PR checks on head `e53b1a1d`: `verify`, `e2e`, `policy`, and Snyk all passed. - Confirmed `public-gh/master` is an ancestor of this branch after fetching `public-gh master`. - Confirmed `pnpm-lock.yaml` is not included in the branch diff. - Confirmed migration `0058_wealthy_starbolt.sql` is ordered after `0057` and uses `IF NOT EXISTS` guards for repeat application. - Greptile inline review threads are resolved. ## Risks - Medium risk: this touches heartbeat execution, liveness recovery, activity rendering, issue routes, shared contracts, docs, and UI. - Migration risk is mitigated by additive columns/indexes and idempotent guards. - Run-ledger liveness backfill is now asynchronous, so the first ledger response can briefly show historical missing liveness until the background backfill completes. - UI screenshot coverage is not included in this packaging pass; validation is currently through focused component tests. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5.4, local tool-use coding agent with terminal, git, GitHub connector, GitHub CLI, and Paperclip API access. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Screenshot note: no before/after screenshots were captured in this PR packaging pass; the UI changes are covered by focused component tests listed above. --------- Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-06-18 03:30:39 +09:00 · 2026-04-20 06:01:49 -05:00 · 2026-04-20 06:01:49 -05:00 · 236d11d36f
commit 236d11d36f
parent b9a80dcf22
71 changed files with 18254 additions and 85 deletions
--- a/ui/src/components/IssueRunLedger.test.tsx
+++ b/ui/src/components/IssueRunLedger.test.tsx
@ -0,0 +1,271 @@
+// @vitest-environment jsdom
+
+import { act } from "react";
+import type { ComponentProps, ReactNode } from "react";
+import { createRoot, type Root } from "react-dom/client";
+import type { Issue, RunLivenessState } from "@paperclipai/shared";
+import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
+import type { RunForIssue } from "../api/activity";
+import { IssueRunLedgerContent } from "./IssueRunLedger";
+
+vi.mock("@/lib/router", () => ({
+  Link: ({ children, to, ...props }: { children: ReactNode; to: string } & ComponentProps<"a">) => (
+    <a href={to} {...props}>{children}</a>
+  ),
+}));
+
+// eslint-disable-next-line @typescript-eslint/no-explicit-any
+(globalThis as any).IS_REACT_ACT_ENVIRONMENT = true;
+
+let container: HTMLDivElement;
+let root: Root;
+
+beforeEach(() => {
+  vi.useFakeTimers();
+  vi.setSystemTime(new Date("2026-04-18T20:00:00.000Z"));
+  container = document.createElement("div");
+  document.body.appendChild(container);
+  root = createRoot(container);
+});
+
+afterEach(() => {
+  act(() => root.unmount());
+  container.remove();
+  vi.useRealTimers();
+});
+
+function render(ui: ReactNode) {
+  act(() => {
+    root.render(ui);
+  });
+}
+
+function createRun(overrides: Partial<RunForIssue> = {}): RunForIssue {
+  return {
+    runId: "run-00000000",
+    status: "succeeded",
+    agentId: "agent-1",
+    adapterType: "codex_local",
+    startedAt: "2026-04-18T19:58:00.000Z",
+    finishedAt: "2026-04-18T19:59:00.000Z",
+    createdAt: "2026-04-18T19:58:00.000Z",
+    invocationSource: "assignment",
+    usageJson: null,
+    resultJson: null,
+    livenessState: "advanced",
+    livenessReason: "Run produced concrete action evidence: 2 activity event(s)",
+    continuationAttempt: 0,
+    lastUsefulActionAt: "2026-04-18T19:59:00.000Z",
+    nextAction: null,
+    ...overrides,
+  };
+}
+
+function createIssue(overrides: Partial<Issue> = {}): Issue {
+  return {
+    id: "issue-1",
+    companyId: "company-1",
+    projectId: null,
+    projectWorkspaceId: null,
+    goalId: null,
+    parentId: null,
+    title: "Child issue",
+    description: null,
+    status: "todo",
+    priority: "medium",
+    assigneeAgentId: null,
+    assigneeUserId: null,
+    checkoutRunId: null,
+    executionRunId: null,
+    executionAgentNameKey: null,
+    executionLockedAt: null,
+    createdByAgentId: null,
+    createdByUserId: null,
+    issueNumber: null,
+    identifier: "PAP-1",
+    requestDepth: 0,
+    billingCode: null,
+    assigneeAdapterOverrides: null,
+    executionWorkspaceId: null,
+    executionWorkspacePreference: null,
+    executionWorkspaceSettings: null,
+    startedAt: null,
+    completedAt: null,
+    cancelledAt: null,
+    hiddenAt: null,
+    createdAt: new Date("2026-04-18T19:00:00.000Z"),
+    updatedAt: new Date("2026-04-18T19:00:00.000Z"),
+    ...overrides,
+  };
+}
+
+function renderLedger(props: Partial<ComponentProps<typeof IssueRunLedgerContent>> = {}) {
+  render(
+    <IssueRunLedgerContent
+      runs={props.runs ?? []}
+      liveRuns={props.liveRuns}
+      activeRun={props.activeRun}
+      issueStatus={props.issueStatus ?? "in_progress"}
+      childIssues={props.childIssues ?? []}
+      agentMap={props.agentMap ?? new Map([["agent-1", { name: "CodexCoder" }]])}
+    />,
+  );
+}
+
+describe("IssueRunLedger", () => {
+  it("renders every liveness state with exhausted continuation context", () => {
+    const states: RunLivenessState[] = [
+      "advanced",
+      "plan_only",
+      "empty_response",
+      "blocked",
+      "failed",
+      "completed",
+      "needs_followup",
+    ];
+
+    renderLedger({
+      runs: states.map((state, index) =>
+        createRun({
+          runId: `run-${index}0000000`,
+          createdAt: `2026-04-18T19:5${index}:00.000Z`,
+          livenessState: state,
+          livenessReason: state === "needs_followup"
+            ? "Run produced useful output but no concrete action evidence; continuation attempts exhausted"
+            : `state ${state}`,
+          continuationAttempt: state === "needs_followup" ? 3 : 0,
+        }),
+      ),
+    });
+
+    expect(container.textContent).toContain("Advanced");
+    expect(container.textContent).toContain("Plan only");
+    expect(container.textContent).toContain("Empty response");
+    expect(container.textContent).toContain("Blocked");
+    expect(container.textContent).toContain("Failed");
+    expect(container.textContent).toContain("Completed");
+    expect(container.textContent).toContain("Needs follow-up");
+    expect(container.textContent).toContain("Exhausted");
+    expect(container.textContent).toContain("Continuation attempt 3");
+  });
+
+  it("renders historical runs without liveness metadata as unavailable", () => {
+    renderLedger({
+      runs: [
+        createRun({
+          livenessState: null,
+          livenessReason: null,
+          continuationAttempt: undefined,
+          lastUsefulActionAt: null,
+          nextAction: null,
+          resultJson: null,
+        }),
+      ],
+    });
+
+    expect(container.textContent).toContain("No liveness data");
+    expect(container.textContent).toContain("Stop Unavailable");
+    expect(container.textContent).toContain("Last useful action Unavailable");
+  });
+
+  it("shows live runs as pending final checks without missing-data language", () => {
+    renderLedger({
+      runs: [
+        createRun({
+          status: "running",
+          finishedAt: null,
+          livenessState: null,
+          livenessReason: null,
+          continuationAttempt: 0,
+          lastUsefulActionAt: null,
+          nextAction: null,
+          resultJson: null,
+        }),
+      ],
+    });
+
+    expect(container.textContent).toContain("Running now by CodexCoder");
+    expect(container.textContent).toContain("Checks after finish");
+    expect(container.textContent).toContain("Last useful action No action recorded yet");
+    expect(container.textContent).toContain("Stop Still running");
+    expect(container.textContent).not.toContain("Liveness pending");
+    expect(container.textContent).not.toContain("initial attempt");
+  });
+
+  it("shows timeout, cancel, and budget stop reasons without raw logs", () => {
+    renderLedger({
+      runs: [
+        createRun({
+          runId: "run-timeout",
+          resultJson: { stopReason: "timeout", timeoutFired: true, effectiveTimeoutSec: 30 },
+        }),
+        createRun({
+          runId: "run-cancel",
+          resultJson: { stopReason: "cancelled" },
+          createdAt: "2026-04-18T19:57:00.000Z",
+        }),
+        createRun({
+          runId: "run-budget",
+          resultJson: { stopReason: "budget_paused" },
+          createdAt: "2026-04-18T19:56:00.000Z",
+        }),
+      ],
+    });
+
+    expect(container.textContent).toContain("timeout (30s timeout)");
+    expect(container.textContent).toContain("cancelled");
+    expect(container.textContent).toContain("budget paused");
+  });
+
+  it("surfaces active and completed child issue summaries", () => {
+    renderLedger({
+      childIssues: [
+        createIssue({ id: "child-1", identifier: "PAP-2", title: "Implement worker handoff", status: "in_progress" }),
+        createIssue({ id: "child-2", identifier: "PAP-3", title: "Verify final report", status: "done" }),
+        createIssue({ id: "child-3", identifier: "PAP-4", title: "Cancelled experiment", status: "cancelled" }),
+      ],
+    });
+
+    expect(container.textContent).toContain("Child work");
+    expect(container.textContent).toContain("1 active, 1 done, 1 cancelled");
+    expect(container.textContent).toContain("PAP-2");
+    expect(container.textContent).toContain("Implement worker handoff");
+
+    renderLedger({
+      childIssues: [
+        createIssue({ id: "child-2", identifier: "PAP-3", title: "Verify final report", status: "done" }),
+        createIssue({ id: "child-3", identifier: "PAP-4", title: "Cancelled experiment", status: "cancelled" }),
+      ],
+    });
+
+    expect(container.textContent).toContain("all 2 terminal (1 done, 1 cancelled)");
+  });
+
+  it("uses wrapping-friendly markup for long next action text", () => {
+    renderLedger({
+      runs: [
+        createRun({
+          nextAction: "Continue investigating this intentionally-long-next-action-token-that-needs-to-wrap-cleanly-on-mobile-and-desktop-without-overlapping-controls.",
+        }),
+      ],
+    });
+
+    const nextAction = [...container.querySelectorAll("span")]
+      .find((node) => node.textContent?.includes("intentionally-long-next-action-token"));
+    expect(nextAction?.className).toContain("break-words");
+    expect(container.textContent).toContain("Next action:");
+  });
+
+  it("shows when older runs are clipped from the ledger", () => {
+    renderLedger({
+      runs: Array.from({ length: 10 }, (_, index) =>
+        createRun({
+          runId: `run-${index.toString().padStart(8, "0")}`,
+          createdAt: `2026-04-18T19:${String(index).padStart(2, "0")}:00.000Z`,
+        }),
+      ),
+    });
+
+    expect(container.textContent).toContain("2 older runs not shown");
+  });
+});