paperclip/doc
Dotta 09d0678840
[codex] Harden heartbeat scheduling and runtime controls (#4223)
## Thinking Path

> - Paperclip orchestrates AI agents through issue checkout, heartbeat
runs, routines, and auditable control-plane state
> - The runtime path has to recover from lost local processes, transient
adapter failures, blocked dependencies, and routine coalescing without
stranding work
> - The existing branch carried several reliability fixes across
heartbeat scheduling, issue runtime controls, routine dispatch, and
operator-facing run state
> - These changes belong together because they share backend contracts,
migrations, and runtime status semantics
> - This pull request groups the control-plane/runtime slice so it can
merge independently from board UI polish and adapter sandbox work
> - The benefit is safer heartbeat recovery, clearer runtime controls,
and more predictable recurring execution behavior

## What Changed

- Adds bounded heartbeat retry scheduling, scheduled retry state, and
Codex transient failure recovery handling.
- Tightens heartbeat process recovery, blocker wake behavior, issue
comment wake handling, routine dispatch coalescing, and
activity/dashboard bounds.
- Adds runtime-control MCP tools and Paperclip skill docs for issue
workspace runtime management.
- Adds migrations `0061_lively_thor_girl.sql` and
`0062_routine_run_dispatch_fingerprint.sql`.
- Surfaces retry state in run ledger/agent UI and keeps related shared
types synchronized.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-retry-scheduling.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/routines-service.test.ts`
- `pnpm exec vitest run src/tools.test.ts` from `packages/mcp-server`

## Risks

- Medium risk: this touches heartbeat recovery and routine dispatch,
which are central execution paths.
- Migration order matters if split branches land out of order: merge
this PR before branches that assume the new runtime/routine fields.
- Runtime retry behavior should be watched in CI and in local operator
smoke tests because it changes how transient failures are resumed.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use
enabled. Exact hosted model build and context window are not exposed in
this Paperclip heartbeat environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 12:24:11 -05:00
..
assets add 512x512 PNG versions of paperclip avatars 2026-03-02 12:21:58 -06:00
experimental updating paths 2026-03-10 14:43:34 -05:00
plans [codex] Harden execution reliability and heartbeat tooling (#3679) 2026-04-14 13:34:52 -05:00
plugins [codex] Add plugin orchestration host APIs (#4114) 2026-04-20 08:52:51 -05:00
spec [codex] Add run liveness continuations (#4083) 2026-04-20 06:01:49 -05:00
AGENTCOMPANIES_SPEC_INVENTORY.md Add routine support to recurring task portability 2026-03-23 16:57:38 -05:00
CLI.md Introduce bind presets for deployment setup 2026-04-11 07:09:07 -05:00
CLIPHUB.md refactor: rename packages to @paperclipai and CLI binary to paperclipai 2026-03-03 08:45:26 -06:00
DATABASE.md Add first-class issue references (#4214) 2026-04-21 10:02:52 -05:00
DEPLOYMENT-MODES.md feat: implement multi-user access and invite flows (#3784) 2026-04-17 09:44:19 -05:00
DEVELOPING.md [codex] add comprehensive UI Storybook coverage (#4132) 2026-04-20 12:13:23 -05:00
DOCKER.md chore(docker): improve base image and organize docker files 2026-04-01 11:36:27 +00:00
execution-semantics.md [codex] Harden heartbeat scheduling and runtime controls (#4223) 2026-04-21 12:24:11 -05:00
GOAL.md Expand GOAL.md with Paperclip vision and mission statement 2026-02-18 16:46:33 -06:00
memory-landscape.md chore: improve worktree tooling and security docs 2026-04-10 22:26:30 -05:00
OPENCLAW_ONBOARDING.md Introduce bind presets for deployment setup 2026-04-11 07:09:07 -05:00
PRODUCT.md docs: update PRODUCT.md and add 2026-03-13 features plan 2026-03-13 07:25:23 -05:00
PUBLISHING.md feat: implement multi-user access and invite flows (#3784) 2026-04-17 09:44:19 -05:00
README-draft.md docs: add README, draft README, and adapter logo assets 2026-03-02 10:31:59 -06:00
RELEASE-AUTOMATION-SETUP.md Publish @paperclipai/ui from release automation 2026-03-26 11:13:11 -05:00
RELEASING.md fix: use origin for github release creation in actions 2026-03-18 09:10:00 -05:00
SPEC-implementation.md [codex] Improve agent runtime recovery and governance (#4086) 2026-04-20 06:19:48 -05:00
SPEC.md docs: update SPEC work artifacts and deprecate bootstrapPromptTemplate 2026-03-26 07:23:09 -05:00
TASKS-mcp.md Add product spec and MCP task interface docs 2026-02-16 19:07:30 -06:00
TASKS.md Add task management data model spec 2026-02-16 14:25:00 -06:00
UNTRUSTED-PR-REVIEW.md chore(docker): improve base image and organize docker files 2026-04-01 11:36:27 +00:00