[codex] Refresh docs and agent skills (#4693)

## Thinking Path

> - Paperclip orchestrates AI agents through a company-scoped control plane
> - Contributors and agents need docs and skills that match the current V1 behavior
> - The source branch included documentation updates alongside implementation work
> - Keeping docs and skill guidance separate makes the implementation PR easier to review
> - This pull request refreshes the V1 docs and agent-operating guidance without changing runtime behavior
> - The benefit is current contributor guidance that can merge independently from code changes

## What Changed

- Refreshed V1 product, goal, implementation, database, and development documentation.
- Updated the Paperclip heartbeat skill guidance and create-agent skill references.
- Added the Paperclip plan-to-task conversion skill.
- Updated release changelog skill guidance.

## Verification

- `git diff --check public-gh/master..HEAD` passed in the PR worktree after the Greptile fix.
- Greptile Review passed on head `673317ed` with zero unresolved review threads.
- GitHub PR checks passed on head `673317ed`: `policy`, `verify`, `e2e`, and `security/snyk (cryppadotta)`.

## Risks

- Low runtime risk because this branch only changes docs and skill guidance.
- Documentation may need follow-up wording adjustments if reviewers want a different framing for V1 behavior.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub workflow. Exact runtime context window was not exposed by the harness.

## Checklist

- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-06-16 02:40:39 +09:00 · 2026-04-28 16:12:03 -05:00 · 2026-04-28 16:12:03 -05:00 · d9f540c331
commit d9f540c331
parent d0bdbe11a9
13 changed files with 192 additions and 76 deletions
--- a/doc/DATABASE.md
+++ b/doc/DATABASE.md
@@ -59,11 +59,11 @@ cp .env.example .env
 # DATABASE_URL=postgres://paperclip:paperclip@localhost:5432/paperclip
 ```

-Run migrations (once the migration generation issue is fixed) or use `drizzle-kit push`:
+Run migrations:

 ```sh
 DATABASE_URL=postgres://paperclip:paperclip@localhost:5432/paperclip \
-  npx drizzle-kit push
+  pnpm db:migrate
 ```

 Start the server:
@@ -100,37 +100,27 @@ postgres://postgres.[PROJECT-REF]:[PASSWORD]@aws-0-[REGION].pooler.supabase.com:

 ### Configure

-Set `DATABASE_URL` in your `.env`:
+For the application runtime, use a direct PostgreSQL connection unless the database client has explicit prepared-statement configuration for your pooling mode:

 ```sh
-DATABASE_URL=postgres://postgres.[PROJECT-REF]:[PASSWORD]@aws-0-[REGION].pooler.supabase.com:6543/postgres
+DATABASE_URL=postgres://postgres.[PROJECT-REF]:[PASSWORD]@aws-0-[REGION].pooler.supabase.com:5432/postgres
 ```

-For hosted deployments that use a pooled runtime URL, set
-`DATABASE_MIGRATION_URL` to the direct connection URL. Paperclip uses it for
-startup schema checks/migrations and plugin namespace migrations, while the app
-continues to use `DATABASE_URL` for runtime queries:
+If you later run the app with a pooled runtime URL, set `DATABASE_MIGRATION_URL` to the direct connection URL. Paperclip uses it for startup schema checks/migrations and plugin namespace migrations, while the app continues to use `DATABASE_URL` for runtime queries:

 ```sh
 DATABASE_URL=postgres://postgres.[PROJECT-REF]:[PASSWORD]@aws-0-[REGION].pooler.supabase.com:6543/postgres
 DATABASE_MIGRATION_URL=postgres://postgres.[PROJECT-REF]:[PASSWORD]@aws-0-[REGION].pooler.supabase.com:5432/postgres
 ```

-If using connection pooling (port 6543), the `postgres` client must disable prepared statements. Update `packages/db/src/client.ts`:
-
-```ts
-export function createDb(url: string) {
-  const sql = postgres(url, { prepare: false });
-  return drizzlePg(sql, { schema });
-}
-```
+If your hosted database requires transaction-pooling-only connections, use a direct or session-pooled connection for Paperclip until runtime pooling support is documented in this guide. Do not edit database client source files as part of deployment setup.

 ### Push the schema

 ```sh
 # Use the direct connection (port 5432) for schema changes
 DATABASE_URL=postgres://postgres.[PROJECT-REF]:[PASSWORD]@...5432/postgres \
-  npx drizzle-kit push
+  pnpm db:migrate
 ```

 ### Free tier limits
@@ -153,6 +143,14 @@ The database mode is controlled by `DATABASE_URL`:

 Your Drizzle schema (`packages/db/src/schema/`) stays the same regardless of mode.

+## Plugin database namespaces
+
+The plugin runtime tracks plugin-owned database namespaces and migrations in `plugin_database_namespaces` and `plugin_migrations`. Hosted deployments that separate runtime and migration connections should set `DATABASE_MIGRATION_URL`; plugin namespace migration work uses the migration connection when present.
+
+## Backups
+
+Paperclip supports automatic and manual database backups. See `doc/DEVELOPING.md` for the current `paperclipai db:backup` / `pnpm db:backup` commands and backup retention configuration.
+
 ## Secret storage

 Paperclip stores secret metadata and versions in:
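Reviewer note (illustrative, not part of the patch): the runtime/migration connection split described in this file can be sketched as follows. `resolveDbUrls` and its field names are hypothetical and are not taken from `packages/db`. The fallback to `DATABASE_URL` when `DATABASE_MIGRATION_URL` is unset is an assumption inferred from "uses the migration connection when present".

```typescript
// Hypothetical helper: names are illustrative, not Paperclip's actual API.
interface DbEnv {
  DATABASE_URL?: string;
  DATABASE_MIGRATION_URL?: string;
}

interface ResolvedDbUrls {
  // URL the app uses for runtime queries (may be a pooled connection).
  runtimeUrl: string | undefined;
  // URL used for schema checks and migrations (should be a direct connection).
  migrationUrl: string | undefined;
}

function resolveDbUrls(env: DbEnv): ResolvedDbUrls {
  // Treat empty strings the same as unset variables.
  const runtimeUrl = env.DATABASE_URL || undefined;
  // Assumption: fall back to the runtime URL when no migration URL is set.
  const migrationUrl = env.DATABASE_MIGRATION_URL || runtimeUrl;
  return { runtimeUrl, migrationUrl };
}
```

With both variables set, the two URLs stay separate. With only `DATABASE_URL` set, both roles share it, which is only safe when that URL is a direct or session-pooled connection.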
--- a/doc/DEVELOPING.md
+++ b/doc/DEVELOPING.md
@@ -43,6 +43,8 @@ This starts:

 `pnpm dev` and `pnpm dev:once` are now idempotent for the current repo and instance: if the matching Paperclip dev runner is already alive, Paperclip reports the existing process instead of starting a duplicate.

+Issue execution may also use project execution workspace policies and workspace runtime services for per-project worktrees, preview servers, and managed dev commands. Configure those through the project workspace/runtime surfaces rather than starting long-running unmanaged processes when a task needs a reusable service.
+
 ## Storybook

 The board UI Storybook keeps stories and Storybook config under `ui/storybook/` so component review files stay out of the app source routes.
@@ -113,6 +115,8 @@ pnpm test:release-smoke

 These browser suites are intended for targeted local verification and CI, not the default agent/human test command.

+For normal issue work, start with the smallest targeted check that proves the change. Reserve repo-wide typecheck/build/test runs for PR-ready handoff or changes broad enough that narrow checks do not cover the risk.
+
 ## One-Command Local Run

 For a first-time local install, you can bootstrap and run in one command:
@@ -194,6 +198,8 @@ For `codex_local`, Paperclip also manages a per-company Codex home under the ins

 If the `codex` CLI is not installed or not on `PATH`, `codex_local` agent runs fail at execution time with a clear adapter error. Quota polling uses a short-lived `codex app-server` subprocess: when `codex` cannot be spawned, that provider reports `ok: false` in aggregated quota results and the API server keeps running (it must not exit on a missing binary).

+Local adapters require their corresponding CLI/session setup on the machine running Paperclip. External adapters are installed through the adapter/plugin flow and should not require hardcoded imports in `server/` or `ui/`.
+
 ## Worktree-local Instances

 When developing from multiple git worktrees, do not point two Paperclip servers at the same embedded PostgreSQL data directory.
--- a/doc/GOAL.md
+++ b/doc/GOAL.md
@@ -23,7 +23,7 @@ Paperclip is the command, communication, and control plane for a company of AI a
 - **Track work in real time** — see at any moment what every agent is working on
 - **Control costs** — token salary budgets per agent, spend tracking, burn rate
 - **Align to goals** — agents see how their work serves the bigger mission
-- **Store company knowledge** — a shared brain for the organization
+- **Preserve work context** — comments, documents, work products, attachments, and company state stay attached to the work

 ## Architecture

@@ -36,17 +36,20 @@ The central nervous system. Manages:
 - Agent registry and org chart
 - Task assignment and status
 - Budget and token spend tracking
-- Company knowledge base
+- Issue comments, documents, work products, attachments, and company state
 - Goal hierarchy (company → team → agent → task)
 - Heartbeat monitoring — know when agents are alive, idle, or stuck

+It also enforces execution-control semantics such as single-assignee issues, atomic checkout and execution locks, blockers, recovery issues, and workspace/runtime controls.
+
 ### 2. Execution Services (adapters)

-Agents run externally and report into the control plane. An agent is just Python code that gets kicked off and does work. Adapters connect different execution environments:
+Agents run externally and report into the control plane. Adapters connect different execution environments and define how a heartbeat is invoked, observed, and cancelled:

-- **OpenClaw** — initial adapter target
-- **Heartbeat loop** — simple custom Python that loops, checks in, does work
-- **Others** — any runtime that can call an API
+- **Local CLI/session adapters** — built-in adapters for tools such as Claude Code, Codex, Gemini, OpenCode, Pi, and Cursor
+- **HTTP/process-style adapters** — command or webhook/API integrations for custom runtimes
+- **OpenClaw gateway** — integration for OpenClaw-style remote agents
+- **External adapter plugins** — dynamically loaded adapters installed outside the core app

 The control plane doesn't run agents. It orchestrates them. Agents run wherever they run and phone home.

--- a/doc/PRODUCT.md
+++ b/doc/PRODUCT.md
@@ -32,12 +32,14 @@ Then you define who reports to the CEO: a CTO managing programmers, a CMO managi

 ### Agent Execution

-There are two fundamental modes for running an agent's heartbeat:
+Paperclip supports several ways to run an agent's heartbeat:

-1. **Run a command** — Paperclip kicks off a process (shell command, Python script, etc.) and tracks it. The heartbeat is "execute this and monitor it."
-2. **Fire and forget a request** — Paperclip sends a webhook/API call to an externally running agent. The heartbeat is "notify this agent to wake up." (OpenClaw hooks work this way.)
+1. **Local CLI/session adapters** — Paperclip starts or resumes local coding-tool sessions such as Claude Code, Codex, Gemini, OpenCode, Pi, and Cursor, then tracks the run.
+2. **Run a command** — Paperclip kicks off a process (shell command, Python script, etc.) and tracks it. The heartbeat is "execute this and monitor it."
+3. **Fire and forget a request** — Paperclip sends a webhook/API call to an externally running agent. The heartbeat is "notify this agent to wake up." OpenClaw-style hooks work this way.
+4. **External adapter plugins** — Paperclip loads adapter packages through the plugin/adapter flow so self-hosted installs can add runtimes without hardcoding them in core.

-We provide sensible defaults — a default agent that shells out to Claude Code or Codex with your configuration, remembers session IDs, runs basic scripts. But you can plug in anything.
+Agent runs can use project and execution workspaces, managed runtime services such as preview/dev servers, adapter-specific session state, and HTTP/webhook-style execution. We provide sensible defaults, but the adapter is still the boundary: if a runtime can be invoked, observed, and authorized, Paperclip can coordinate it.

 ### Task Management

@@ -54,7 +56,7 @@ I am researching the Facebook ads Granola uses (current task)

 Tasks have parentage. Every task exists in service of a parent task, all the way up to the company goal. This is what keeps autonomous agents aligned — they can always answer "why am I doing this?"

-More detailed task structure TBD.
+The current issue model includes stable issue identifiers, parent/sub-issues, blockers, a single assignee, comments, issue documents, attachments and work products, and review/approval handoffs. That structure keeps work inspectable by both the board and agents while still allowing agents to decompose work into smaller tasks.

 ## Principles

@@ -115,7 +117,7 @@ Paperclip’s core identity is a **control plane for autonomous AI companies**,

 - Do not make the core product a general chat app. The current product definition is explicitly task/comment-centric and “not a chatbot,” and that boundary is valuable.
 - Do not build a complete Jira/GitHub replacement. The repo/docs already position Paperclip as organization orchestration, not focused on pull-request review.
-- Do not build enterprise-grade RBAC first. The current V1 spec still treats multi-board governance and fine-grained human permissions as out of scope, so the first multi-user version should be coarse and company-scoped.
+- Do not build enterprise-grade RBAC first. Paperclip now has authenticated mode, company memberships, instance roles, and permission grants, but fine-grained enterprise governance should remain secondary to the core company control plane.
 - Do not lead with raw bash logs and transcripts. Default view should be human-readable intent/progress, with raw detail beneath.
 - Do not force users to understand provider/API-key plumbing unless absolutely necessary. There are active onboarding/auth issues already; friction here is clearly real.

@@ -136,11 +138,14 @@ Paperclip’s core identity is a **control plane for autonomous AI companies**,
 5. **Output-first**
   Work is not done until the user can see the result: file, document, preview link, screenshot, plan, or PR.

-6. **Local-first, cloud-ready**
+6. **Execution visibility without log worship**
+   Active runs, recovery issues, productivity review states, blockers, and work products should be first-class surfaces. Raw transcripts are available when needed, but they are not the primary product surface.
+
+7. **Local-first, cloud-ready**
   The mental model should not change between local solo use and shared/private or public/cloud deployment.

-7. **Safe autonomy**
+8. **Safe autonomy**
   Auto mode is allowed; hidden token burn is not.

-8. **Thin core, rich edges**
+9. **Thin core, rich edges**
   Put optional chat, knowledge, and special surfaces into plugins/extensions rather than bloating the control plane.
--- a/doc/SPEC-implementation.md
+++ b/doc/SPEC-implementation.md
@@ -1,7 +1,7 @@
 # Paperclip V1 Implementation Spec

 Status: Implementation contract for first release (V1)
-Date: 2026-02-17
+Date: 2026-04-28
 Audience: Product, engineering, and agent-integration authors
 Source inputs: `GOAL.md`, `PRODUCT.md`, `SPEC.md`, `DATABASE.md`, current monorepo code

@@ -37,8 +37,9 @@ These decisions close open questions from `SPEC.md` for V1.
 | Visibility | Full visibility to board and all agents in same company |
 | Communication | Tasks + comments only (no separate chat system) |
 | Task ownership | Single assignee; atomic checkout required for `in_progress` transition |
-| Recovery | No automatic reassignment; control-plane recovery may retry lost execution continuity once, then uses explicit recovery issues or human escalation |
-| Agent adapters | Built-in `process` and `http` adapters |
+| Recovery | Liveness/watchdog recovery preserves explicit ownership: retry lost execution continuity where safe, otherwise create visible recovery issues or require human escalation (see `doc/execution-semantics.md`) |
+| Agent adapters | Built-in `process`, `http`, local CLI/session adapters, and OpenClaw gateway support; external adapters can also be loaded through the adapter plugin flow |
+| Plugin framework | Local/self-hosted early plugin runtime is in scope; cloud marketplace and packaged public distribution remain out of scope |
 | Auth | Mode-dependent human auth (`local_trusted` implicit board in current code; authenticated mode uses sessions), API keys for agents |
 | Budget period | Monthly UTC calendar window |
 | Budget enforcement | Soft alerts + hard limit auto-pause |
@@ -73,7 +74,7 @@ V1 implementation extends this baseline into a company-centric, governance-aware

 ## 5.2 Out of Scope (V1)

-- Plugin framework and third-party extension SDK
+- Cloud-grade plugin marketplace/distribution beyond the local/self-hosted plugin runtime
 - Revenue/expense accounting beyond model/token costs
 - Knowledge base subsystem
 - Public marketplace (ClipHub)
@@ -123,6 +124,16 @@ Human auth tables (`users`, `sessions`, and provider-specific auth artifacts) ar
 - `name` text not null
 - `description` text null
 - `status` enum: `active | paused | archived`
+- `pause_reason` text null
+- `paused_at` timestamptz null
+- `issue_prefix` text not null
+- `issue_counter` int not null
+- `budget_monthly_cents` int not null default 0
+- `spent_monthly_cents` int not null default 0
+- `attachment_max_bytes` int not null
+- `require_board_approval_for_new_agents` boolean not null default false
+- feedback sharing consent fields
+- branding fields such as `brand_color`

 Invariant: every business record belongs to exactly one company.

@@ -133,15 +144,21 @@ Invariant: every business record belongs to exactly one company.
 - `name` text not null
 - `role` text not null
 - `title` text null
-- `status` enum: `active | paused | idle | running | error | terminated`
+- `icon` text null
+- `status` enum: `active | paused | idle | running | error | pending_approval | terminated`
 - `reports_to` uuid fk `agents.id` null
 - `capabilities` text null
-- `adapter_type` enum: `process | http`
+- `adapter_type` text; built-ins include `process`, `http`, `claude_local`, `codex_local`, `gemini_local`, `opencode_local`, `pi_local`, `cursor`, and `openclaw_gateway`
 - `adapter_config` jsonb not null
+- `runtime_config` jsonb not null default `{}`
+- `default_environment_id` uuid fk `environments.id` null
 - `context_mode` enum: `thin | fat` default `thin`
 - `budget_monthly_cents` int not null default 0
 - `spent_monthly_cents` int not null default 0
+- pause fields: `pause_reason`, `paused_at`
+- `permissions` jsonb not null default `{}`
 - `last_heartbeat_at` timestamptz null
+- `metadata` jsonb null

 Invariants:

@@ -195,6 +212,7 @@ Invariant:
 - `id` uuid pk
 - `company_id` uuid fk not null
 - `project_id` uuid fk `projects.id` null
+- `project_workspace_id` uuid fk `project_workspaces.id` null
 - `goal_id` uuid fk `goals.id` null
 - `parent_id` uuid fk `issues.id` null
 - `title` text not null
@@ -202,13 +220,22 @@ Invariant:
 - `status` enum: `backlog | todo | in_progress | in_review | done | blocked | cancelled`
 - `priority` enum: `critical | high | medium | low`
 - `assignee_agent_id` uuid fk `agents.id` null
+- `assignee_user_id` text null
+- checkout/execution locks: `checkout_run_id`, `execution_run_id`, `execution_agent_name_key`, `execution_locked_at`
 - `created_by_agent_id` uuid fk `agents.id` null
 - `created_by_user_id` uuid fk `users.id` null
+- identifier fields: `issue_number`, `identifier`
+- origin fields: `origin_kind`, `origin_id`, `origin_run_id`, `origin_fingerprint`
 - `request_depth` int not null default 0
 - `billing_code` text null
+- `assignee_adapter_overrides` jsonb null
+- `execution_policy` jsonb null
+- `execution_state` jsonb null
+- execution workspace fields: `execution_workspace_id`, `execution_workspace_preference`, `execution_workspace_settings`
 - `started_at` timestamptz null
 - `completed_at` timestamptz null
 - `cancelled_at` timestamptz null
+- `hidden_at` timestamptz null

 Invariants:

@@ -261,10 +288,10 @@ Invariant: each event must attach to agent and company; rollups are aggregation,

 - `id` uuid pk
 - `company_id` uuid fk not null
-- `type` enum: `hire_agent | approve_ceo_strategy`
+- `type` enum: `hire_agent | approve_ceo_strategy | budget_override_required | request_board_approval`
 - `requested_by_agent_id` uuid fk `agents.id` null
 - `requested_by_user_id` uuid fk `users.id` null
-- `status` enum: `pending | approved | rejected | cancelled`
+- `status` enum: `pending | revision_requested | approved | rejected | cancelled`
 - `payload` jsonb not null
 - `decision_note` text null
 - `decided_by_user_id` uuid fk `users.id` null
@@ -363,6 +390,15 @@ Operational policy:
  - `document_id` uuid fk not null
  - `key` text not null (`plan`, `design`, `notes`, etc.)

+## 7.16 Current Implementation Addenda
+
+The current implementation includes additional V1-control-plane tables beyond the original February snapshot:
+
+- Issue structure and review: `issue_relations` for blockers, `labels`/`issue_labels`, `issue_thread_interactions`, `issue_approvals`, `issue_execution_decisions`, `issue_work_products`, `issue_inbox_archives`, `issue_read_states`, and issue reference mention indexes.
+- Execution and workspace control: `execution_workspaces`, `project_workspaces`, `workspace_runtime_services`, `workspace_operations`, `environments`, `environment_leases`, `agent_task_sessions`, `agent_runtime_state`, `agent_wakeup_requests`, heartbeat events, and watchdog decision tables.
+- Plugins and routines: `plugins`, plugin config/state/entities/jobs/logs/webhooks, plugin database namespaces/migrations, plugin company settings, and `routines`.
+- Access and operations: company memberships, instance roles, principal permission grants, invites, join requests, board API keys, CLI auth challenges, budget policies/incidents, feedback exports/votes, company skills, sidebar preferences, and company logos.
+
 ## 8. State Machines

 ## 8.1 Agent Status
@@ -563,6 +599,17 @@ Dashboard payload must include:
 - `422` semantic rule violation
 - `500` server error

+## 10.10 Current Implementation API Addenda
+
+The current app also exposes V1-supporting surfaces for:
+
+- issue thread interactions (`suggest_tasks`, `ask_user_questions`, `request_confirmation`)
+- issue approvals, issue references/search, labels, read state, inbox/archive state, and work products
+- execution workspaces, project workspaces, workspace runtime services, and workspace operations
+- routines and scheduled/API/webhook triggers
+- plugin installation, configuration, state, jobs, logs, webhooks, and plugin database namespace migration
+- company import/export preview/apply, feedback export/vote routes, instance backup/config routes, invites, join requests, memberships, and permission grants
+
 ## 11. Heartbeat and Adapter Contract

 ## 11.1 Adapter Interface
@@ -738,13 +785,14 @@ Required UX behaviors:

 - Node 20+
 - `DATABASE_URL` optional
-- if unset, auto-use PGlite and push schema
+- if unset, auto-use embedded PostgreSQL under `~/.paperclip/instances/default/db`

 ## 15.2 Migrations

 - Drizzle migrations are source of truth
+- local/dev startup applies pending migrations automatically where supported
+- `pnpm db:migrate` applies pending migrations manually
 - no destructive migration in-place for V1 upgrade path
-- provide migration script from existing minimal tables to company-scoped schema

 ## 15.3 Logging and Audit

@@ -799,6 +847,8 @@ A release candidate is blocked unless these pass:

 ## 18. Delivery Plan

+Current implementation note: the milestones below describe the original V1 sequencing. Several systems originally framed as future work have since shipped or advanced materially, including issue documents/interactions, blockers, routines, execution workspaces, import/export portability, authenticated deployment modes, multi-user basics, and the local/self-hosted plugin runtime.
+
 ## Milestone 1: Company Core and Auth

 - add `companies` and company scoping to existing entities
@@ -851,7 +901,7 @@ V1 is complete only when all criteria are true:

 ## 20. Post-V1 Backlog (Explicitly Deferred)

-- plugin architecture
+- cloud-grade plugin marketplace/distribution
 - richer workflow-state customization per team
 - milestones/labels/dependency graph depth beyond V1 minimum
 - realtime transport optimization (SSE/WebSockets)
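Reviewer note (illustrative, not part of the patch): the spec pairs the `budget_monthly_cents` / `spent_monthly_cents` columns with a "Soft alerts + hard limit auto-pause" policy. One way to read that policy is sketched below. `budgetAction`, the 80% soft-alert threshold, and the treatment of a zero budget as "no limit" are all assumptions made for the example. None of them is stated in the spec.

```typescript
// Hypothetical policy check: names and thresholds are illustrative only.
type BudgetAction = "ok" | "soft_alert" | "hard_pause";

function budgetAction(
  budgetMonthlyCents: number,
  spentMonthlyCents: number,
  softAlertPercent = 80, // assumption: alert at 80% of the monthly budget
): BudgetAction {
  // Assumption: a budget of 0 (the column default) means no limit is configured.
  if (budgetMonthlyCents <= 0) return "ok";
  // Hard limit: spend has reached or exceeded the budget, so auto-pause.
  if (spentMonthlyCents >= budgetMonthlyCents) return "hard_pause";
  // Soft alert: integer math avoids floating-point edge cases at the threshold.
  if (spentMonthlyCents * 100 >= budgetMonthlyCents * softAlertPercent) {
    return "soft_alert";
  }
  return "ok";
}
```

Keeping amounts in integer cents, as the schema does, lets the threshold comparison stay exact.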