mirror of
https://github.com/alkimake/paperclip.git
synced 2026-06-20 04:20:38 +09:00
Merge public-gh/master into paperclip-company-import-export
commit
9e19f1d005
49 changed files with 3997 additions and 2501 deletions
424
doc/plans/2026-03-17-docker-release-browser-e2e.md
Normal file
|
|
@ -0,0 +1,424 @@
|
|||
# Docker Release Browser E2E Plan
|
||||
|
||||
## Context
|
||||
|
||||
Today, release smoke testing for published Paperclip packages is manual and shell-driven:
|
||||
|
||||
```sh
HOST_PORT=3232 DATA_DIR=./data/release-smoke-canary PAPERCLIPAI_VERSION=canary ./scripts/docker-onboard-smoke.sh
HOST_PORT=3233 DATA_DIR=./data/release-smoke-stable PAPERCLIPAI_VERSION=latest ./scripts/docker-onboard-smoke.sh
```
|
||||
|
||||
That is useful because it exercises the same public install surface users hit:
|
||||
|
||||
- Docker
|
||||
- `npx paperclipai@canary`
|
||||
- `npx paperclipai@latest`
|
||||
- authenticated bootstrap flow
|
||||
|
||||
But it still leaves the most important release questions to a human with a browser:
|
||||
|
||||
- can I sign in with the smoke credentials?
|
||||
- do I land in onboarding?
|
||||
- can I complete onboarding?
|
||||
- does the initial CEO agent actually get created and run?
|
||||
|
||||
The repo already has two adjacent pieces:
|
||||
|
||||
- `tests/e2e/onboarding.spec.ts` covers the onboarding wizard against the local source tree
|
||||
- `scripts/docker-onboard-smoke.sh` boots a published Docker install and auto-bootstraps authenticated mode, but only verifies the API/session layer
|
||||
|
||||
What is missing is one deterministic browser test that joins those two paths.
|
||||
|
||||
## Goal
|
||||
|
||||
Add a release-grade Docker-backed browser E2E that validates the published `canary` and `latest` installs end to end:
|
||||
|
||||
1. boot the published package in Docker
|
||||
2. sign in with known smoke credentials
|
||||
3. verify the user is routed into onboarding
|
||||
4. complete onboarding in the browser
|
||||
5. verify the first CEO agent exists
|
||||
6. verify the initial CEO run was triggered and reached a queued, running, or terminal state
|
||||
|
||||
Then wire that test into GitHub Actions so release validation is no longer manual-only.
|
||||
|
||||
## Recommendation In One Sentence
|
||||
|
||||
Turn the current Docker smoke script into a machine-friendly test harness, add a dedicated Playwright release-smoke spec that drives the authenticated browser flow against published Docker installs, and run it in GitHub Actions for both `canary` and `latest`.
|
||||
|
||||
## What We Have Today
|
||||
|
||||
### Existing local browser coverage
|
||||
|
||||
`tests/e2e/onboarding.spec.ts` already proves the onboarding wizard can:
|
||||
|
||||
- create a company
|
||||
- create a CEO agent
|
||||
- create an initial issue
|
||||
- optionally observe task progress
|
||||
|
||||
That is a good base, but it does not validate the public npm package, Docker path, authenticated login flow, or release dist-tags.
|
||||
|
||||
### Existing Docker smoke coverage
|
||||
|
||||
`scripts/docker-onboard-smoke.sh` already does useful setup work:
|
||||
|
||||
- builds `Dockerfile.onboard-smoke`
|
||||
- runs `paperclipai@${PAPERCLIPAI_VERSION}` inside Docker
|
||||
- waits for health
|
||||
- signs up or signs in a smoke admin user
|
||||
- generates and accepts the bootstrap CEO invite in authenticated mode
|
||||
- verifies a board session and `/api/companies`
|
||||
|
||||
That means the hard bootstrap problem is mostly solved already. The main gap is that the script is human-oriented and never hands control to a browser test.
|
||||
|
||||
### Existing CI shape
|
||||
|
||||
The repo already has:
|
||||
|
||||
- `.github/workflows/e2e.yml` for manual Playwright runs against local source
|
||||
- `.github/workflows/release.yml` for canary publish on `master` and manual stable promotion
|
||||
|
||||
So the right move is to extend the current test/release system, not create a parallel one.
|
||||
|
||||
## Product Decisions
|
||||
|
||||
### 1. The release smoke should stay deterministic and token-free
|
||||
|
||||
The first version should not require OpenAI, Anthropic, or external agent credentials.
|
||||
|
||||
Use the onboarding flow with a deterministic adapter that can run on a stock GitHub runner and inside the published Docker install. The existing `process` adapter with a trivial command is the right base path for this release gate.
|
||||
|
||||
That keeps this test focused on:
|
||||
|
||||
- release packaging
|
||||
- auth/bootstrap
|
||||
- UI routing
|
||||
- onboarding contract
|
||||
- agent creation
|
||||
- heartbeat invocation plumbing
|
||||
|
||||
Later we can add a second credentialed smoke lane for real model-backed agents.
|
||||
|
||||
### 2. Smoke credentials become an explicit test contract
|
||||
|
||||
The current defaults in `scripts/docker-onboard-smoke.sh` should be treated as stable test fixtures:
|
||||
|
||||
- email: `smoke-admin@paperclip.local`
|
||||
- password: `paperclip-smoke-password`
|
||||
|
||||
The browser test should log in with those exact values unless overridden by env vars.
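A minimal sketch of that override rule, assuming the suite reads the `SMOKE_ADMIN_EMAIL` and `SMOKE_ADMIN_PASSWORD` variables named elsewhere in this plan. The helper name `resolveSmokeCredentials` and the "blank means unset" behavior are illustrative assumptions, not existing code:

```ts
// Hypothetical helper: resolve smoke credentials from env vars, falling back
// to the fixture values this plan treats as the stable test contract.
interface SmokeCredentials {
  email: string;
  password: string;
}

const DEFAULT_SMOKE_CREDENTIALS: SmokeCredentials = {
  email: "smoke-admin@paperclip.local",
  password: "paperclip-smoke-password",
};

function resolveSmokeCredentials(
  env: Record<string, string | undefined>,
): SmokeCredentials {
  // Assumption: an empty or whitespace-only variable counts as "not set",
  // so a blank CI input cannot silently wipe a default.
  const pick = (value: string | undefined, fallback: string): string =>
    value !== undefined && value.trim() !== "" ? value : fallback;

  return {
    email: pick(env["SMOKE_ADMIN_EMAIL"], DEFAULT_SMOKE_CREDENTIALS.email),
    password: pick(env["SMOKE_ADMIN_PASSWORD"], DEFAULT_SMOKE_CREDENTIALS.password),
  };
}
```

In a spec this would typically be called once as `resolveSmokeCredentials(process.env)` and the result passed to the sign-in step.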
|
||||
|
||||
### 3. Published-package smoke and source-tree E2E stay separate
|
||||
|
||||
Keep two lanes:
|
||||
|
||||
- source-tree E2E for feature development
|
||||
- published Docker release smoke for release confidence
|
||||
|
||||
They overlap on onboarding assertions, but they guard different failure classes.
|
||||
|
||||
## Proposed Design
|
||||
|
||||
## 1. Add a CI-friendly Docker smoke harness
|
||||
|
||||
Refactor `scripts/docker-onboard-smoke.sh` so it can run in two modes:
|
||||
|
||||
- interactive mode
  - current behavior
  - streams logs and waits in foreground for manual inspection
- CI mode
  - starts the container
  - waits for health and authenticated bootstrap
  - prints machine-readable metadata
  - exits while leaving the container running for Playwright
|
||||
|
||||
Recommended shape:
|
||||
|
||||
- keep `scripts/docker-onboard-smoke.sh` as the public entry point
- add a `SMOKE_DETACH=true` or `--detach` mode
- emit a JSON blob or `.env` file containing:
  - `SMOKE_BASE_URL`
  - `SMOKE_ADMIN_EMAIL`
  - `SMOKE_ADMIN_PASSWORD`
  - `SMOKE_CONTAINER_NAME`
  - `SMOKE_DATA_DIR`
|
||||
|
||||
The workflow and Playwright tests can then consume the emitted metadata instead of scraping logs.
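As a sketch of the consuming side, the following parser turns the `.env` variant of that metadata into a typed object. It assumes plain unquoted `KEY=VALUE` lines; the `SmokeMetadata` shape and the `parseSmokeEnv` name are hypothetical, and only the five key names come from the list above:

```ts
// Hypothetical typed view of the metadata the detached harness would emit.
interface SmokeMetadata {
  baseUrl: string;
  adminEmail: string;
  adminPassword: string;
  containerName: string;
  dataDir: string;
}

function parseSmokeEnv(text: string): SmokeMetadata {
  const values = new Map<string, string>();
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blanks and comments
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // ignore anything that is not KEY=VALUE
    values.set(line.slice(0, eq).trim(), line.slice(eq + 1).trim());
  }

  // Fail loudly on a missing key instead of letting the browser test
  // run against an undefined URL or credential.
  const required = (key: string): string => {
    const value = values.get(key);
    if (value === undefined || value === "") {
      throw new Error(`smoke metadata is missing ${key}`);
    }
    return value;
  };

  return {
    baseUrl: required("SMOKE_BASE_URL"),
    adminEmail: required("SMOKE_ADMIN_EMAIL"),
    adminPassword: required("SMOKE_ADMIN_PASSWORD"),
    containerName: required("SMOKE_CONTAINER_NAME"),
    dataDir: required("SMOKE_DATA_DIR"),
  };
}
```

Throwing on a missing key keeps harness failures visible at the start of the job rather than as a confusing navigation error later.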
|
||||
|
||||
### Why this matters
|
||||
|
||||
The current script always tails logs and then blocks on `wait "$LOG_PID"`. That is convenient for manual smoke testing, but it is the wrong shape for CI orchestration.
|
||||
|
||||
## 2. Add a dedicated Playwright release-smoke spec
|
||||
|
||||
Create a second Playwright entry point specifically for published Docker installs, for example:
|
||||
|
||||
- `tests/release-smoke/playwright.config.ts`
|
||||
- `tests/release-smoke/docker-auth-onboarding.spec.ts`
|
||||
|
||||
This suite should not use Playwright `webServer`, because the app server will already be running inside Docker.
|
||||
|
||||
### Browser scenario
|
||||
|
||||
The first release-smoke scenario should validate:
|
||||
|
||||
1. open `/`
2. unauthenticated user is redirected to `/auth`
3. sign in using the smoke credentials
4. authenticated user lands on onboarding when no companies exist
5. onboarding wizard appears with the expected step labels
6. create a company
7. create the first agent using `process`
8. create the initial issue
9. finish onboarding and open the created issue
10. verify via API:
    - company exists
    - CEO agent exists
    - issue exists and is assigned to the CEO
11. verify the first heartbeat run was triggered:
    - either by checking issue status changed from initial state, or
    - by checking agent/runs API shows a run for the CEO, or
    - both
|
||||
|
||||
The run may still be queued, or may already have finished, by the time the assertion executes. The assertion should therefore accept any of:
|
||||
|
||||
- `queued`
|
||||
- `running`
|
||||
- `succeeded`
|
||||
|
||||
The same tolerance applies to issue progression: the issue status may already have changed before the assertion runs.
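The status tolerance can be expressed as a small predicate. This is a sketch only: the `RunSummary` shape is a hypothetical stand-in for whatever the runs API returns, and only the three status strings come from this plan:

```ts
// Statuses this plan accepts as evidence that the first CEO run was triggered.
const ACCEPTED_RUN_STATUSES = ["queued", "running", "succeeded"] as const;
type AcceptedRunStatus = (typeof ACCEPTED_RUN_STATUSES)[number];

// Hypothetical shape of one run entry; only the field used here is modeled.
interface RunSummary {
  status: string;
}

function isAcceptedRunStatus(status: string): status is AcceptedRunStatus {
  return (ACCEPTED_RUN_STATUSES as readonly string[]).indexOf(status) !== -1;
}

// True when at least one run exists and is in an accepted state,
// regardless of how far it has progressed.
function hasTriggeredRun(runs: RunSummary[]): boolean {
  return runs.some((run) => isAcceptedRunStatus(run.status));
}
```

Any status outside the list (for example a hypothetical `failed`) is treated as not satisfying the check, so an immediately failing run cannot pass as "triggered".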
|
||||
|
||||
### Why a separate spec instead of reusing `tests/e2e/onboarding.spec.ts`
|
||||
|
||||
The local-source test and release-smoke test have different assumptions:
|
||||
|
||||
- different server lifecycle
|
||||
- different auth path
|
||||
- different deployment mode
|
||||
- published npm package instead of local workspace code
|
||||
|
||||
Trying to force both through one spec will make both worse.
|
||||
|
||||
## 3. Add a release-smoke workflow in GitHub Actions
|
||||
|
||||
Add a workflow dedicated to this surface, ideally reusable:
|
||||
|
||||
- `.github/workflows/release-smoke.yml`
|
||||
|
||||
Recommended triggers:
|
||||
|
||||
- `workflow_dispatch`
|
||||
- `workflow_call`
|
||||
|
||||
Recommended inputs:
|
||||
|
||||
- `paperclip_version`
  - `canary` or `latest`
- `host_port`
  - optional, default runner-safe port
- `artifact_name`
  - optional for clearer uploads
|
||||
|
||||
### Job outline
|
||||
|
||||
1. checkout repo
2. install Node/pnpm
3. install Playwright browser dependencies
4. launch Docker smoke harness in detached mode with the chosen dist-tag
5. run the release-smoke Playwright suite against the returned base URL
6. always collect diagnostics:
   - Playwright report
   - screenshots
   - trace
   - `docker logs`
   - harness metadata file
7. stop and remove container
|
||||
|
||||
### Why a reusable workflow
|
||||
|
||||
This lets us:
|
||||
|
||||
- run the smoke manually on demand
|
||||
- call it from `release.yml`
|
||||
- reuse the same job for both `canary` and `latest`
|
||||
|
||||
## 4. Integrate it into release automation incrementally
|
||||
|
||||
### Phase A: Manual workflow only
|
||||
|
||||
First ship the workflow as manual-only so the harness and test can be stabilized without blocking releases.
|
||||
|
||||
### Phase B: Run automatically after canary publish
|
||||
|
||||
After `publish_canary` succeeds in `.github/workflows/release.yml`, call the reusable release-smoke workflow with:
|
||||
|
||||
- `paperclip_version=canary`
|
||||
|
||||
This proves the just-published public canary really boots and onboards.
|
||||
|
||||
### Phase C: Run automatically after stable publish
|
||||
|
||||
After `publish_stable` succeeds, call the same workflow with:
|
||||
|
||||
- `paperclip_version=latest`
|
||||
|
||||
This gives us post-publish confirmation that the stable dist-tag is healthy.
|
||||
|
||||
### Important nuance
|
||||
|
||||
Testing `latest` from npm cannot happen before stable publish, because the package under test does not exist under `latest` yet. So the `latest` smoke is a post-publish verification, not a pre-publish gate.
|
||||
|
||||
If we later want a true pre-publish stable gate, that should be a separate source-ref or locally built package smoke job.
|
||||
|
||||
## 5. Make diagnostics first-class
|
||||
|
||||
This workflow is only valuable if failures are fast to debug.
|
||||
|
||||
Always capture:
|
||||
|
||||
- Playwright HTML report
|
||||
- Playwright trace on failure
|
||||
- final screenshot on failure
|
||||
- full `docker logs` output
|
||||
- emitted smoke metadata
|
||||
- optional `curl /api/health` snapshot
|
||||
|
||||
Without that, the test will become a flaky black box and people will stop trusting it.
|
||||
|
||||
## Implementation Plan
|
||||
|
||||
## Phase 1: Harness refactor
|
||||
|
||||
Files:
|
||||
|
||||
- `scripts/docker-onboard-smoke.sh`
|
||||
- optionally `scripts/lib/docker-onboard-smoke.sh` or similar helper
|
||||
- `doc/DOCKER.md`
|
||||
- `doc/RELEASING.md`
|
||||
|
||||
Tasks:
|
||||
|
||||
1. Add detached/CI mode to the Docker smoke script.
|
||||
2. Make the script emit machine-readable connection metadata.
|
||||
3. Keep the current interactive manual mode intact.
|
||||
4. Add reliable cleanup commands for CI.
|
||||
|
||||
Acceptance:
|
||||
|
||||
- a script invocation can start the published Docker app, auto-bootstrap it, and return control to the caller with enough metadata for browser automation
|
||||
|
||||
## Phase 2: Browser release-smoke suite
|
||||
|
||||
Files:
|
||||
|
||||
- `tests/release-smoke/playwright.config.ts`
|
||||
- `tests/release-smoke/docker-auth-onboarding.spec.ts`
|
||||
- root `package.json`
|
||||
|
||||
Tasks:
|
||||
|
||||
1. Add a dedicated Playwright config for external server testing.
2. Implement login + onboarding + CEO creation flow.
3. Assert a CEO run was created or completed.
4. Add a root script such as:
   - `test:release-smoke`
|
||||
|
||||
Acceptance:
|
||||
|
||||
- the suite passes locally against both:
  - `PAPERCLIPAI_VERSION=canary`
  - `PAPERCLIPAI_VERSION=latest`
|
||||
|
||||
## Phase 3: GitHub Actions workflow
|
||||
|
||||
Files:
|
||||
|
||||
- `.github/workflows/release-smoke.yml`
|
||||
|
||||
Tasks:
|
||||
|
||||
1. Add manual and reusable workflow entry points.
|
||||
2. Install Chromium and runner dependencies.
|
||||
3. Start Docker smoke in detached mode.
|
||||
4. Run the release-smoke Playwright suite.
|
||||
5. Upload diagnostics artifacts.
|
||||
|
||||
Acceptance:
|
||||
|
||||
- a maintainer can run the workflow manually for either `canary` or `latest`
|
||||
|
||||
## Phase 4: Release workflow integration
|
||||
|
||||
Files:
|
||||
|
||||
- `.github/workflows/release.yml`
|
||||
- `doc/RELEASING.md`
|
||||
|
||||
Tasks:
|
||||
|
||||
1. Trigger release smoke automatically after canary publish.
|
||||
2. Trigger release smoke automatically after stable publish.
|
||||
3. Document expected behavior and failure handling.
|
||||
|
||||
Acceptance:
|
||||
|
||||
- canary releases automatically produce a published-package browser smoke result
|
||||
- stable releases automatically produce a `latest` browser smoke result
|
||||
|
||||
## Phase 5: Future extension for real model-backed agent validation
|
||||
|
||||
This is not part of the first implementation, but it should be the next layer once the deterministic lane is stable.
|
||||
|
||||
Possible additions:
|
||||
|
||||
- a second Playwright project gated on repo secrets
|
||||
- real `claude_local` or `codex_local` adapter validation in Docker-capable environments
|
||||
- assertion that the CEO posts a real task/comment artifact
|
||||
- stable release holdback until the credentialed lane passes
|
||||
|
||||
This should stay optional until the token-free lane is trustworthy.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
The plan is complete when the implemented system can demonstrate all of the following:
|
||||
|
||||
1. A published `paperclipai@canary` Docker install can be smoke-tested by Playwright in CI.
|
||||
2. A published `paperclipai@latest` Docker install can be smoke-tested by Playwright in CI.
|
||||
3. The test logs into authenticated mode with the smoke credentials.
|
||||
4. The test sees onboarding for a fresh instance.
|
||||
5. The test completes onboarding in the browser.
|
||||
6. The test verifies the initial CEO agent was created.
|
||||
7. The test verifies at least one CEO heartbeat run was triggered.
|
||||
8. Failures produce actionable artifacts rather than just a red job.
|
||||
|
||||
## Risks And Decisions To Make
|
||||
|
||||
### 1. Fast process runs may finish before the UI visibly updates
|
||||
|
||||
That is expected. The assertions should prefer API polling for run existence/status rather than only visual indicators.
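One way to implement that API polling is a generic helper that retries a probe until it yields a value or a deadline passes. This is a sketch under assumptions: `pollUntil`, its option names, and the injectable `sleep`/`now` hooks are illustrative and not part of any existing Paperclip or Playwright API:

```ts
// Illustrative polling helper for "does a run exist yet?" style checks.
interface PollOptions {
  timeoutMs: number;
  intervalMs: number;
  // Injectable so tests can avoid real waiting; defaults use real timers.
  sleep?: (ms: number) => Promise<void>;
  now?: () => number;
}

async function pollUntil<T>(
  probe: () => Promise<T | undefined>,
  options: PollOptions,
): Promise<T> {
  const sleep =
    options.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  const now = options.now ?? (() => Date.now());
  const deadline = now() + options.timeoutMs;

  for (;;) {
    // The probe returns undefined while the condition is not yet met.
    const result = await probe();
    if (result !== undefined) return result;
    if (now() >= deadline) {
      throw new Error(`condition not met within ${options.timeoutMs}ms`);
    }
    await sleep(options.intervalMs);
  }
}
```

A spec could pass a probe that fetches the CEO's runs and returns the first run in an accepted status, which works whether the run is still queued or already finished.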
|
||||
|
||||
### 2. `latest` smoke is post-publish, not preventive
|
||||
|
||||
This is a real limitation of testing the published dist-tag itself. It is still valuable, but it should not be confused with a pre-publish gate.
|
||||
|
||||
### 3. We should not overcouple the test to cosmetic onboarding text
|
||||
|
||||
The important contract is flow success, created entities, and run creation. Use visible labels sparingly and prefer stable semantic selectors where possible.
|
||||
|
||||
### 4. Keep the smoke adapter path boring
|
||||
|
||||
For release safety, the first test should use the most boring runnable adapter possible. This is not the place to validate every adapter.
|
||||
|
||||
## Recommended First Slice
|
||||
|
||||
If we want the fastest path to value, ship this in order:
|
||||
|
||||
1. add detached mode to `scripts/docker-onboard-smoke.sh`
|
||||
2. add one Playwright spec for authenticated login + onboarding + CEO run verification
|
||||
3. add manual `release-smoke.yml`
|
||||
4. once stable, wire canary into `release.yml`
|
||||
5. after that, wire stable `latest` smoke into `release.yml`
|
||||
|
||||
That gives release confidence quickly without turning the first version into a large CI redesign.
|
||||
426
doc/plans/2026-03-17-memory-service-surface-api.md
Normal file
|
|
@ -0,0 +1,426 @@
|
|||
# Paperclip Memory Service Plan
|
||||
|
||||
## Goal
|
||||
|
||||
Define a Paperclip memory service and surface API that can sit above multiple memory backends, while preserving Paperclip's control-plane requirements:
|
||||
|
||||
- company scoping
|
||||
- auditability
|
||||
- provenance back to Paperclip work objects
|
||||
- budget / cost visibility
|
||||
- plugin-first extensibility
|
||||
|
||||
This plan is based on the external landscape summarized in `doc/memory-landscape.md` and on the current Paperclip architecture in:
|
||||
|
||||
- `doc/SPEC-implementation.md`
|
||||
- `doc/plugins/PLUGIN_SPEC.md`
|
||||
- `doc/plugins/PLUGIN_AUTHORING_GUIDE.md`
|
||||
- `packages/plugins/sdk/src/types.ts`
|
||||
|
||||
## Recommendation In One Sentence
|
||||
|
||||
Paperclip should not embed one opinionated memory engine into core. It should add a company-scoped memory control plane with a small normalized adapter contract, then let built-ins and plugins implement the provider-specific behavior.
|
||||
|
||||
## Product Decisions
|
||||
|
||||
### 1. Memory is company-scoped by default
|
||||
|
||||
Every memory binding belongs to exactly one company.
|
||||
|
||||
That binding can then be:
|
||||
|
||||
- the company default
|
||||
- an agent override
|
||||
- a project override later if we need it
|
||||
|
||||
The initial design does not allow cross-company memory sharing.
|
||||
|
||||
### 2. Providers are selected by key
|
||||
|
||||
Each configured memory provider gets a stable key inside a company, for example:
|
||||
|
||||
- `default`
|
||||
- `mem0-prod`
|
||||
- `local-markdown`
|
||||
- `research-kb`
|
||||
|
||||
Agents and services resolve the active provider by key, not by hard-coded vendor logic.
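The resolution order implied by decisions 1 and 2 (agent override first, then company default) can be sketched as a small lookup. The `BindingTarget` shape and `resolveBindingKey` name are hypothetical; this is not a confirmed schema:

```ts
// Hypothetical in-memory shape for binding targets.
type BindingTargetType = "company" | "agent";

interface BindingTarget {
  companyId: string;
  targetType: BindingTargetType;
  targetId: string; // company id or agent id, depending on targetType
  bindingKey: string; // e.g. "default", "mem0-prod"
}

function resolveBindingKey(
  targets: BindingTarget[],
  companyId: string,
  agentId?: string,
): string | undefined {
  // Only targets from the caller's company are considered, which is what
  // rules out cross-company sharing in this sketch.
  const scoped = targets.filter((t) => t.companyId === companyId);

  if (agentId !== undefined) {
    const override = scoped.find(
      (t) => t.targetType === "agent" && t.targetId === agentId,
    );
    if (override) return override.bindingKey;
  }

  const fallback = scoped.find(
    (t) => t.targetType === "company" && t.targetId === companyId,
  );
  return fallback?.bindingKey;
}
```

Returning `undefined` when nothing matches leaves the "no memory configured" case to the caller instead of inventing a provider.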
|
||||
|
||||
### 3. Plugins are the primary provider path
|
||||
|
||||
Built-ins are useful for a zero-config local path, but most providers should arrive through the existing Paperclip plugin runtime.
|
||||
|
||||
That keeps the core small and matches the current direction that optional knowledge-like systems live at the edges.
|
||||
|
||||
### 4. Paperclip owns routing, provenance, and accounting
|
||||
|
||||
Providers should not decide how Paperclip entities map to governance.
|
||||
|
||||
Paperclip core should own:
|
||||
|
||||
- who is allowed to call a memory operation
|
||||
- which company / agent / project scope is active
|
||||
- what issue / run / comment / document the operation belongs to
|
||||
- how usage gets recorded
|
||||
|
||||
### 5. Automatic memory should be narrow at first
|
||||
|
||||
Automatic capture is useful, but broad silent capture is dangerous.
|
||||
|
||||
Initial automatic hooks should be:
|
||||
|
||||
- post-run capture from agent runs
|
||||
- issue comment / document capture when the binding enables it
|
||||
- pre-run recall for agent context hydration
|
||||
|
||||
Everything else should start explicit.
|
||||
|
||||
## Proposed Concepts
|
||||
|
||||
### Memory provider
|
||||
|
||||
A built-in or plugin-supplied implementation that stores and retrieves memory.
|
||||
|
||||
Examples:
|
||||
|
||||
- local markdown + vector index
|
||||
- mem0 adapter
|
||||
- supermemory adapter
|
||||
- MemOS adapter
|
||||
|
||||
### Memory binding
|
||||
|
||||
A company-scoped configuration record that points to a provider and carries provider-specific config.
|
||||
|
||||
This is the object selected by key.
|
||||
|
||||
### Memory scope
|
||||
|
||||
The normalized Paperclip scope passed into a provider request.
|
||||
|
||||
At minimum:
|
||||
|
||||
- `companyId`
|
||||
- optional `agentId`
|
||||
- optional `projectId`
|
||||
- optional `issueId`
|
||||
- optional `runId`
|
||||
- optional `subjectId` for external/user identity
|
||||
|
||||
### Memory source reference
|
||||
|
||||
The provenance handle that explains where a memory came from.
|
||||
|
||||
Supported source kinds should include:
|
||||
|
||||
- `issue_comment`
|
||||
- `issue_document`
|
||||
- `issue`
|
||||
- `run`
|
||||
- `activity`
|
||||
- `manual_note`
|
||||
- `external_document`
|
||||
|
||||
### Memory operation
|
||||
|
||||
A normalized write, query, browse, or delete action performed through Paperclip.
|
||||
|
||||
Paperclip should log every operation, whether the provider is local or external.
|
||||
|
||||
## Required Adapter Contract
|
||||
|
||||
The required core should be small enough that providers as different as `memsearch`, `mem0`, `Memori`, `MemOS`, and `OpenViking` can all implement it.
|
||||
|
||||
```ts
export interface MemoryAdapterCapabilities {
  profile?: boolean;
  browse?: boolean;
  correction?: boolean;
  asyncIngestion?: boolean;
  multimodal?: boolean;
  providerManagedExtraction?: boolean;
}

export interface MemoryScope {
  companyId: string;
  agentId?: string;
  projectId?: string;
  issueId?: string;
  runId?: string;
  subjectId?: string;
}

export interface MemorySourceRef {
  kind:
    | "issue_comment"
    | "issue_document"
    | "issue"
    | "run"
    | "activity"
    | "manual_note"
    | "external_document";
  companyId: string;
  issueId?: string;
  commentId?: string;
  documentKey?: string;
  runId?: string;
  activityId?: string;
  externalRef?: string;
}

export interface MemoryUsage {
  provider: string;
  model?: string;
  inputTokens?: number;
  outputTokens?: number;
  embeddingTokens?: number;
  costCents?: number;
  latencyMs?: number;
  details?: Record<string, unknown>;
}

export interface MemoryWriteRequest {
  bindingKey: string;
  scope: MemoryScope;
  source: MemorySourceRef;
  content: string;
  metadata?: Record<string, unknown>;
  mode?: "append" | "upsert" | "summarize";
}

export interface MemoryRecordHandle {
  providerKey: string;
  providerRecordId: string;
}

export interface MemoryQueryRequest {
  bindingKey: string;
  scope: MemoryScope;
  query: string;
  topK?: number;
  intent?: "agent_preamble" | "answer" | "browse";
  metadataFilter?: Record<string, unknown>;
}

export interface MemorySnippet {
  handle: MemoryRecordHandle;
  text: string;
  score?: number;
  summary?: string;
  source?: MemorySourceRef;
  metadata?: Record<string, unknown>;
}

export interface MemoryContextBundle {
  snippets: MemorySnippet[];
  profileSummary?: string;
  usage?: MemoryUsage[];
}

export interface MemoryAdapter {
  key: string;
  capabilities: MemoryAdapterCapabilities;
  write(req: MemoryWriteRequest): Promise<{
    records?: MemoryRecordHandle[];
    usage?: MemoryUsage[];
  }>;
  query(req: MemoryQueryRequest): Promise<MemoryContextBundle>;
  get(handle: MemoryRecordHandle, scope: MemoryScope): Promise<MemorySnippet | null>;
  forget(handles: MemoryRecordHandle[], scope: MemoryScope): Promise<{ usage?: MemoryUsage[] }>;
}
```
|
||||
|
||||
This contract intentionally does not force a provider to expose its internal graph, filesystem, or ontology.
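To illustrate how little a provider has to expose, here is a toy in-memory adapter following the same four required operations. It is a sketch, not a proposed built-in: the types are trimmed local copies (for example, `source` and `usage` are omitted), the provider key is made up, and ranking is a naive substring match:

```ts
// Trimmed local copies of the contract types so the sketch compiles on its own.
interface MemoryScope { companyId: string; agentId?: string }
interface MemoryRecordHandle { providerKey: string; providerRecordId: string }
interface MemorySnippet { handle: MemoryRecordHandle; text: string }
interface MemoryWriteRequest { bindingKey: string; scope: MemoryScope; content: string }
interface MemoryQueryRequest { bindingKey: string; scope: MemoryScope; query: string; topK?: number }

// Toy provider: keeps records in a Map and matches by case-insensitive substring.
class InMemoryAdapter {
  readonly key = "in-memory-example"; // hypothetical provider key
  readonly capabilities = {}; // no optional surfaces advertised
  private nextId = 1;
  private readonly records = new Map<string, { scope: MemoryScope; text: string }>();

  async write(req: MemoryWriteRequest): Promise<{ records: MemoryRecordHandle[] }> {
    const id = String(this.nextId++);
    this.records.set(id, { scope: req.scope, text: req.content });
    return { records: [{ providerKey: this.key, providerRecordId: id }] };
  }

  async query(req: MemoryQueryRequest): Promise<{ snippets: MemorySnippet[] }> {
    const needle = req.query.toLowerCase();
    const snippets: MemorySnippet[] = [];
    this.records.forEach((record, id) => {
      // Company scoping is checked on every read.
      if (record.scope.companyId !== req.scope.companyId) return;
      if (record.text.toLowerCase().indexOf(needle) === -1) return;
      snippets.push({
        handle: { providerKey: this.key, providerRecordId: id },
        text: record.text,
      });
    });
    return { snippets: snippets.slice(0, req.topK ?? 5) };
  }

  async get(handle: MemoryRecordHandle, scope: MemoryScope): Promise<MemorySnippet | null> {
    const record = this.records.get(handle.providerRecordId);
    if (!record || record.scope.companyId !== scope.companyId) return null;
    return { handle, text: record.text };
  }

  async forget(handles: MemoryRecordHandle[], scope: MemoryScope): Promise<{ usage?: unknown[] }> {
    for (const handle of handles) {
      const record = this.records.get(handle.providerRecordId);
      // Records from another company are left untouched.
      if (record && record.scope.companyId === scope.companyId) {
        this.records.delete(handle.providerRecordId);
      }
    }
    return {};
  }
}
```

Nothing about the internal `Map` leaks through the interface, which is the property the contract is meant to preserve for real providers.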
|
||||
|
||||
## Optional Adapter Surfaces
|
||||
|
||||
These should be capability-gated, not required:
|
||||
|
||||
- `browse(scope, filters)` for file-system / graph / timeline inspection
|
||||
- `correct(handle, patch)` for natural-language correction flows
|
||||
- `profile(scope)` when the provider can synthesize stable preferences or summaries
|
||||
- `sync(source)` for connectors or background ingestion
|
||||
- `explain(queryResult)` for providers that can expose retrieval traces
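Capability gating could look like the following sketch, which checks both the advertised flag and the presence of the method before calling an optional surface. The simplified synchronous `browse(companyId)` signature and the `browseIfSupported` helper are assumptions for illustration; the plan's own `browse(scope, filters)` would likely be asynchronous:

```ts
// Illustrative subset of the capability flags.
interface Capabilities { browse?: boolean; correction?: boolean }

interface BrowsableAdapter {
  key: string;
  capabilities: Capabilities;
  // Optional surface: only present on providers that support browsing.
  browse?: (companyId: string) => string[];
}

type BrowseResult =
  | { supported: true; entries: string[] }
  | { supported: false; reason: string };

function browseIfSupported(adapter: BrowsableAdapter, companyId: string): BrowseResult {
  const browse = adapter.browse;
  // Require both the advertised flag and the method, so a mis-declared
  // provider degrades to "unsupported" instead of throwing.
  if (!adapter.capabilities.browse || browse === undefined) {
    return { supported: false, reason: `${adapter.key} does not advertise browse` };
  }
  return { supported: true, entries: browse(companyId) };
}
```

Returning a tagged result rather than throwing lets the UI hide the browse view for providers that lack it.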
|
||||
|
||||
## What Paperclip Should Persist
|
||||
|
||||
Paperclip should not mirror the full provider memory corpus into Postgres unless the provider is a Paperclip-managed local provider.
|
||||
|
||||
Paperclip core should persist:
|
||||
|
||||
- memory bindings and overrides
|
||||
- provider keys and capability metadata
|
||||
- normalized memory operation logs
|
||||
- provider record handles returned by operations when available
|
||||
- source references back to issue comments, documents, runs, and activity
|
||||
- usage and cost data
|
||||
|
||||
For external providers, the memory payload itself can remain in the provider.
|
||||
|
||||
## Hook Model
|
||||
|
||||
### Automatic hooks
|
||||
|
||||
These should be low-risk and easy to reason about:
|
||||
|
||||
1. `pre-run hydrate`
   Before an agent run starts, Paperclip may call `query(... intent = "agent_preamble")` using the active binding.

2. `post-run capture`
   After a run finishes, Paperclip may write a summary or transcript-derived note tied to the run.

3. `issue comment / document capture`
   When enabled on the binding, Paperclip may capture selected issue comments or issue documents as memory sources.
|
||||
|
||||
### Explicit hooks
|
||||
|
||||
These should be tool- or UI-driven first:
|
||||
|
||||
- `memory.search`
|
||||
- `memory.note`
|
||||
- `memory.forget`
|
||||
- `memory.correct`
|
||||
- `memory.browse`
|
||||
|
||||
### Not automatic in the first version
|
||||
|
||||
- broad web crawling
|
||||
- silent import of arbitrary repo files
|
||||
- cross-company memory sharing
|
||||
- automatic destructive deletion
|
||||
- provider migration between bindings
|
||||
|
||||
## Agent UX Rules
|
||||
|
||||
Paperclip should give agents both automatic recall and explicit tools, with simple guidance:
|
||||
|
||||
- use `memory.search` when the task depends on prior decisions, people, projects, or long-running context that is not in the current issue thread
|
||||
- use `memory.note` when a durable fact, preference, or decision should survive this run
|
||||
- use `memory.correct` when the user explicitly says prior context is wrong
|
||||
- rely on post-run auto-capture for ordinary session residue so agents do not have to write memory notes for every trivial exchange
|
||||
|
||||
This keeps memory available without forcing every agent prompt to become a memory-management protocol.
|
||||
|
||||
## Browse And Inspect Surface
|
||||
|
||||
Paperclip needs a first-class UI for memory, otherwise providers become black boxes.
|
||||
|
||||
The initial browse surface should support:
|
||||
|
||||
- active binding by company and agent
|
||||
- recent memory operations
|
||||
- recent write sources
|
||||
- query results with source backlinks
|
||||
- filters by agent, issue, run, source kind, and date
|
||||
- provider usage / cost / latency summaries
|
||||
|
||||
When a provider supports richer browsing, the plugin can add deeper views through the existing plugin UI surfaces.
|
||||
|
||||
## Cost And Evaluation
|
||||
|
||||
Every adapter response should be able to return usage records.
|
||||
|
||||
Paperclip should roll up:
|
||||
|
||||
- memory inference tokens
|
||||
- embedding tokens
|
||||
- external provider cost
|
||||
- latency
|
||||
- query count
|
||||
- write count
|
||||
|
||||
It should also record evaluation-oriented metrics where possible:
|
||||
|
||||
- recall hit rate
|
||||
- empty query rate
|
||||
- manual correction count
|
||||
- per-binding success / failure counts
|
||||
|
||||
This is important because a memory system that "works" but silently burns budget is not acceptable in Paperclip.
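A roll-up over usage records might look like the sketch below. It uses a subset of the `MemoryUsage` fields from the adapter contract; the `UsageRollup` shape, the `rollUpUsage` name, and the choice to count "memory inference tokens" as input plus output tokens are assumptions:

```ts
// Subset of the plan's MemoryUsage fields that the roll-up needs.
interface MemoryUsage {
  provider: string;
  inputTokens?: number;
  outputTokens?: number;
  embeddingTokens?: number;
  costCents?: number;
  latencyMs?: number;
}

// Hypothetical aggregate shape.
interface UsageRollup {
  inferenceTokens: number; // assumed to mean input + output tokens
  embeddingTokens: number;
  costCents: number;
  totalLatencyMs: number;
  recordCount: number;
}

function rollUpUsage(records: MemoryUsage[]): UsageRollup {
  const total: UsageRollup = {
    inferenceTokens: 0,
    embeddingTokens: 0,
    costCents: 0,
    totalLatencyMs: 0,
    recordCount: 0,
  };
  for (const record of records) {
    // Every field is optional, so a missing value contributes zero.
    total.inferenceTokens += (record.inputTokens ?? 0) + (record.outputTokens ?? 0);
    total.embeddingTokens += record.embeddingTokens ?? 0;
    total.costCents += record.costCents ?? 0;
    total.totalLatencyMs += record.latencyMs ?? 0;
    total.recordCount += 1;
  }
  return total;
}
```

Query and write counts are not derived here; they would come from counting logged operations by type rather than from usage records.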
|
||||
|
||||
## Suggested Data Model Additions
|
||||
|
||||
At the control-plane level, the likely new core tables are:
|
||||
|
||||
- `memory_bindings`
  - company-scoped key
  - provider id / plugin id
  - config blob
  - enabled status

- `memory_binding_targets`
  - target type (`company`, `agent`, later `project`)
  - target id
  - binding id

- `memory_operations`
  - company id
  - binding id
  - operation type (`write`, `query`, `forget`, `browse`, `correct`)
  - scope fields
  - source refs
  - usage / latency / cost
  - success / error
|
||||
|
||||
Provider-specific long-form state should stay in plugin state or the provider itself unless a built-in local provider needs its own schema.
|
||||
|
||||
## Recommended First Built-In
|
||||
|
||||
The best zero-config built-in is a local markdown-first provider with optional semantic indexing.
|
||||
|
||||
Why:
|
||||
|
||||
- it matches Paperclip's local-first posture
|
||||
- it is inspectable
|
||||
- it is easy to back up and debug
|
||||
- it gives the system a baseline even without external API keys
|
||||
|
||||
The design should still treat that built-in as just another provider behind the same control-plane contract.
|
||||
|
||||
## Rollout Phases
|
||||
|
||||
### Phase 1: Control-plane contract
|
||||
|
||||
- add memory binding models and API types
|
||||
- add plugin capability / registration surface for memory providers
|
||||
- add operation logging and usage reporting
|
||||
|
||||
### Phase 2: One built-in + one plugin example
|
||||
|
||||
- ship a local markdown-first provider
|
||||
- ship one hosted adapter example to validate the external-provider path
|
||||
|
||||
### Phase 3: UI inspection
|
||||
|
||||
- add company / agent memory settings
|
||||
- add a memory operation explorer
|
||||
- add source backlinks to issues and runs
|
||||
|
||||
### Phase 4: Automatic hooks
|
||||
|
||||
- pre-run hydrate
|
||||
- post-run capture
|
||||
- selected issue comment / document capture
|
||||
|
||||
### Phase 5: Rich capabilities
|
||||
|
||||
- correction flows
|
||||
- provider-native browse / graph views
|
||||
- project-level overrides if needed
|
||||
- evaluation dashboards
|
||||
|
||||
## Open Questions
|
||||
|
||||
- Should project overrides exist in V1 of the memory service, or should we force company default + agent override first?
|
||||
- Do we want Paperclip-managed extraction pipelines at all, or should built-ins be the only place where Paperclip owns extraction?
|
||||
- Should memory usage extend the current `cost_events` model directly, or should memory operations keep a parallel usage log and roll up into `cost_events` secondarily?
|
||||
- Do we want provider install / binding changes to require approvals for some companies?
|
||||
|
||||
## Bottom Line

The right abstraction is:

- Paperclip owns memory bindings, scopes, provenance, governance, and usage reporting.
- Providers own extraction, ranking, storage, and provider-native memory semantics.

That gives Paperclip a stable "memory service" without locking the product to one memory philosophy or one vendor.

488
doc/plans/2026-03-17-release-automation-and-versioning.md
Normal file
@ -0,0 +1,488 @@
# Release Automation and Versioning Simplification Plan

## Context

Paperclip's current release flow is documented in `doc/RELEASING.md` and implemented through:

- `.github/workflows/release.yml`
- `scripts/release-lib.sh`
- `scripts/release-start.sh`
- `scripts/release-preflight.sh`
- `scripts/release.sh`
- `scripts/create-github-release.sh`

Today the model is:

1. pick `patch`, `minor`, or `major`
2. create `release/X.Y.Z`
3. draft `releases/vX.Y.Z.md`
4. publish one or more canaries from that release branch
5. publish stable from that same branch
6. push tag + create GitHub Release
7. merge the release branch back to `master`

That is workable, but it creates friction in exactly the places that should be cheap:

- deciding `patch` vs `minor` vs `major`
- cutting and carrying release branches
- manually publishing canaries
- thinking about changelog generation for canaries
- handling npm credentials safely in a public repo

The target state from this discussion is simpler:

- every push to `master` publishes a canary automatically
- stable releases are promoted deliberately from a vetted commit
- versioning is date-driven instead of semantics-driven
- stable publishing is secure even in a public open-source repository
- changelog generation happens only for real stable releases

## Recommendation In One Sentence

Move Paperclip to semver-compatible calendar versioning, auto-publish canaries from `master`, promote stable from a chosen tested commit, and use npm trusted publishing plus GitHub environments so no long-lived npm or LLM token needs to live in Actions.

## Core Decisions

### 1. Use calendar versions, but keep semver syntax

The repo and npm tooling still assume semver-shaped version strings in many places. That does not mean Paperclip must keep semver as a product policy. It does mean the version format should remain semver-valid.

Recommended format:

- stable: `YYYY.MDD.P`
- canary: `YYYY.MDD.P-canary.N`

Examples:

- first stable on March 17, 2026: `2026.317.0`
- third canary on the `2026.317.0` line: `2026.317.0-canary.2`

Why this shape:

- it removes `patch/minor/major` decisions
- it is valid semver syntax
- it stays compatible with npm, dist-tags, and existing semver validators
- it is close to the format you actually want

Important constraints:

- the middle numeric slot should be `MDD`, where `M` is the month without a leading zero and `DD` is the zero-padded day
- `2026.03.17` is not the format to use
  - numeric semver identifiers do not allow leading zeroes
- `2026.3.17.1` is not the format to use
  - semver has three numeric components, not four
- the practical semver-safe equivalent is `2026.317.0-canary.8`

This is effectively CalVer on semver rails.

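
A minimal sketch of the format rules above, assuming a POSIX shell. The helper names `calver_stable_version` and `is_semver_numeric` are hypothetical and are not part of the current release scripts; the sketch only illustrates how a date maps to `YYYY.MDD.P` and why a zero-padded month is rejected.

```sh
#!/bin/sh
# Hypothetical helpers for illustration; names and argument order are assumptions.

# calver_stable_version YYYY-MM-DD P  ->  prints YYYY.MDD.P
calver_stable_version() (
  date_str=$1
  patch=$2
  year=${date_str%%-*}    # 2026-03-17 -> 2026
  rest=${date_str#*-}     # 2026-03-17 -> 03-17
  month=${rest%%-*}       # 03-17 -> 03
  day=${rest#*-}          # 03-17 -> 17 (stays zero-padded)
  month=${month#0}        # 03 -> 3, so the slot never starts with a zero
  printf '%s.%s%s.%s\n' "$year" "$month" "$day" "$patch"
)

# is_semver_numeric ID: succeeds only for digits with no leading zero
# (the single identifier "0" is allowed).
is_semver_numeric() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    0) return 0 ;;
    0*) return 1 ;;            # leading zero, e.g. "03"
    *) return 0 ;;
  esac
}

calver_stable_version 2026-03-17 0   # prints 2026.317.0
calver_stable_version 2026-10-05 1   # prints 2026.1005.1
is_semver_numeric 03 || echo "03 is not a valid numeric identifier"
```

Dropping the zero from the month while keeping it on the day is what keeps the middle slot semver-valid and still ordered by date within a year.
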
### 2. Accept that CalVer changes the compatibility contract

This is not semver in spirit anymore. It is semver in syntax only.

That tradeoff is probably acceptable for Paperclip, but it should be explicit:

- consumers no longer infer compatibility from `major/minor/patch`
- release notes become the compatibility signal
- downstream users should prefer exact pins or deliberate upgrades

This is especially relevant for public library packages like `@paperclipai/shared`, `@paperclipai/db`, and the adapter packages.

### 3. Drop release branches for normal publishing

If every merge to `master` publishes a canary, the current `release/X.Y.Z` train model becomes more ceremony than value.

Recommended replacement:

- `master` is the only canary train
- every push to `master` can publish a canary
- stable is published from a chosen commit or canary tag on `master`

This matches the workflow you actually want:

- merge continuously
- let npm always have a fresh canary
- choose a known-good canary later and promote that commit to stable

### 4. Promote by source ref, not by "renaming" a canary

This is the most important mechanical constraint.

npm can move dist-tags, but it does not let you rename an already-published version. That means:

- you can move `latest` to `paperclipai@1.2.3`
- you cannot turn `paperclipai@2026.317.0-canary.8` into `paperclipai@2026.317.0`

So "promote canary to stable" really means:

1. choose the commit or canary tag you trust
2. rebuild from that exact commit
3. publish it again with the stable version string

Because of that, the stable workflow should take a source ref, not just a bump type.

Recommended stable input:

- `source_ref`
  - commit SHA, or
  - a canary git tag such as `canary/v2026.317.1-canary.8`

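
One way to guard the `source_ref` input before any checkout happens, sketched under assumptions: the function name `classify_source_ref` is hypothetical, the 7 to 40 character range for a SHA is an illustrative choice, and a real workflow would still need to confirm that the ref resolves to a commit on `master`.

```sh
#!/bin/sh
# Hypothetical input check for the stable workflow; not an existing script.

# classify_source_ref REF: prints "canary-tag" or "sha", or fails for anything else.
classify_source_ref() {
  case $1 in
    canary/v[0-9]*-canary.[0-9]*)
      echo canary-tag ;;
    ''|*[!0-9a-f]*)
      return 1 ;;                 # empty, or not purely lowercase hex
    *)
      # Assumption: accept abbreviated (7+) through full (40) hex SHAs.
      if [ "${#1}" -ge 7 ] && [ "${#1}" -le 40 ]; then
        echo sha
      else
        return 1
      fi ;;
  esac
}

classify_source_ref canary/v2026.317.1-canary.8        # prints canary-tag
classify_source_ref 0123abcdef                         # prints sha
classify_source_ref release/1.2.3 || echo "rejected"   # prints rejected
```

A pattern check like this only rejects obviously wrong inputs such as branch names; it does not prove the ref exists or that it was ever published as a canary.
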
### 5. Only stable releases get release notes, tags, and GitHub Releases

Canaries should stay lightweight:

- publish to npm under `canary`
- optionally create a lightweight or annotated git tag
- do not create GitHub Releases
- do not require `releases/v*.md`
- do not spend LLM tokens

Stable releases should remain the public narrative surface:

- git tag `v2026.317.0`
- GitHub Release `v2026.317.0`
- stable changelog file `releases/v2026.317.0.md`

## Security Model

### Recommendation

Use npm trusted publishing with GitHub Actions OIDC, then disable token-based publishing access for the packages.

Why:

- no long-lived `NPM_TOKEN` in repo or org secrets
- no personal npm token in Actions
- short-lived credentials minted only for the authorized workflow
- automatic npm provenance for public packages in public repos

This is the cleanest answer to the open-repo security concern.

### Concrete controls

#### 1. Use one release workflow file

Use one workflow filename for both canary and stable publishing:

- `.github/workflows/release.yml`

Why:

- npm trusted publishing is configured per workflow filename
- npm currently allows one trusted publisher configuration per package
- GitHub environments can still provide separate canary/stable approval rules inside the same workflow

#### 2. Use separate GitHub environments

Recommended environments:

- `npm-canary`
- `npm-stable`

Recommended policy:

- `npm-canary`
  - allowed branch: `master`
  - no human reviewer required
- `npm-stable`
  - allowed branch: `master`
  - required reviewer enabled
  - prevent self-review enabled
  - admin bypass disabled

Stable should require an explicit second human gate even if the workflow is manually dispatched.

#### 3. Lock down workflow edits

Add or tighten `CODEOWNERS` coverage for:

- `.github/workflows/*`
- `scripts/release*`
- `doc/RELEASING.md`

This matters because trusted publishing authorizes a workflow file. The biggest remaining risk is not secret exfiltration from forks. It is a maintainer-approved change to the release workflow itself.

#### 4. Remove traditional npm token access after OIDC works

After trusted publishing is verified:

- set package publishing access to require 2FA and disallow tokens
- revoke any legacy automation tokens

That eliminates the "someone stole the npm token" class of failure.

### What not to do

- do not put your personal Claude or npm token in GitHub Actions
- do not run release logic from `pull_request_target`
- do not make stable publishing depend on a repo secret if OIDC can handle it
- do not create canary GitHub Releases

## Changelog Strategy

### Recommendation

Generate stable changelogs only, and keep LLM-assisted changelog generation out of CI for now.

Reasoning:

- canaries happen too often
- canaries do not need polished public notes
- putting a personal Claude token into Actions is not worth the risk
- stable release cadence is low enough that a human-in-the-loop step is acceptable

Recommended stable path:

1. pick a canary commit or tag
2. run changelog generation locally from a trusted machine
3. commit `releases/vYYYY.MDD.P.md`
4. run stable promotion

If the notes are not ready yet, a fallback is acceptable:

- publish stable
- create a minimal GitHub Release
- update `releases/vYYYY.MDD.P.md` immediately afterward

But the better steady-state is to have the stable notes committed before stable publish.

### Future option

If you later want CI-assisted changelog drafting, do it with:

- a dedicated service account
- a token scoped only for changelog generation
- a manual workflow
- a dedicated environment with required reviewers

That is phase-two hardening work, not a phase-one requirement.

## Proposed Future Workflow

### Canary workflow

Trigger:

- `push` on `master`

Steps:

1. checkout the merged `master` commit
2. run verification on that exact commit
3. compute canary version for current UTC date
4. version public packages to `YYYY.MDD.P-canary.N`
5. publish to npm with dist-tag `canary`
6. create a canary git tag for traceability

Recommended canary tag format:

- `canary/v2026.317.1-canary.4`

Outputs:

- npm canary published
- git tag created
- no GitHub Release
- no changelog file required

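
Step 3 of the canary workflow could be sketched as follows, assuming existing canary tags are the source of truth for the counter `N`. The function name `next_canary_version` and the choice to read tag names from stdin are assumptions for illustration; in a workflow the input would more likely come from something like `git tag --list 'canary/v*'`.

```sh
#!/bin/sh
# Hypothetical helper; not part of the current scripts/release-lib.sh.

# next_canary_version BASE: reads tag names on stdin (one per line) and prints
# BASE-canary.N, where N is one more than the highest N already tagged for BASE.
next_canary_version() (
  base=$1
  prefix="canary/v${base}-canary."
  max=-1
  while IFS= read -r tag; do
    case $tag in
      "$prefix"*)
        n=${tag#"$prefix"}
        case $n in
          ''|*[!0-9]*) continue ;;   # skip suffixes that are not plain numbers
        esac
        if [ "$n" -gt "$max" ]; then
          max=$n
        fi
        ;;
    esac
  done
  printf '%s-canary.%s\n' "$base" "$((max + 1))"
)

# Two canaries already exist on the 2026.317.0 line; the 2026.316.0 tag is ignored.
printf '%s\n' \
  canary/v2026.317.0-canary.0 \
  canary/v2026.317.0-canary.1 \
  canary/v2026.316.0-canary.9 \
  | next_canary_version 2026.317.0   # prints 2026.317.0-canary.2
```

Deriving `N` from tags rather than from a stored counter keeps the computation stateless, at the cost of requiring the tag push to succeed before the next canary run.
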
### Stable workflow

Trigger:

- `workflow_dispatch`

Inputs:

- `source_ref`
- optional `stable_date`
- `dry_run`

Steps:

1. checkout `source_ref`
2. run verification on that exact commit
3. compute the next stable patch slot for the UTC date or provided override
4. fail if `vYYYY.MDD.P` already exists
5. require `releases/vYYYY.MDD.P.md`
6. version public packages to `YYYY.MDD.P`
7. publish to npm under `latest`
8. create git tag `vYYYY.MDD.P`
9. push tag
10. create GitHub Release from `releases/vYYYY.MDD.P.md`

Outputs:

- stable npm release
- stable git tag
- GitHub Release
- clean public changelog surface

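
Steps 3 through 5 of the stable workflow could look roughly like this. Both function names, `next_stable_version` and `require_release_notes`, are hypothetical, and the sketch assumes that existing `vYYYY.MDD.P` git tags define which patch slots are taken.

```sh
#!/bin/sh
# Hypothetical preflight helpers for the stable workflow; names are assumptions.

# next_stable_version SLOT: SLOT is YYYY.MDD. Reads tag names on stdin and prints
# the first SLOT.P for which no tag vSLOT.P exists yet.
next_stable_version() (
  slot=$1
  tags=$(cat)
  patch=0
  # -x matches whole lines, -F treats the dots literally.
  while printf '%s\n' "$tags" | grep -qxF "v${slot}.${patch}"; do
    patch=$((patch + 1))
  done
  printf '%s.%s\n' "$slot" "$patch"
)

# require_release_notes VERSION [DIR]: fails unless DIR/vVERSION.md exists.
# DIR defaults to "releases".
require_release_notes() {
  if [ ! -f "${2:-releases}/v$1.md" ]; then
    echo "missing release notes: ${2:-releases}/v$1.md" >&2
    return 1
  fi
}

# One stable already went out on March 17, so the hotfix gets patch slot 1.
printf '%s\n' v2026.316.0 v2026.317.0 | next_stable_version 2026.317   # prints 2026.317.1
require_release_notes 2026.317.1 || echo "stable publish blocked"
```

Picking the first free slot makes the "fail if the tag already exists" check redundant for the computed version, but that check still matters if a maintainer passes an explicit override.
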
## Implementation Guidance

### 1. Replace bump-type version math with explicit version computation

The current release scripts depend on:

- `patch`
- `minor`
- `major`

That logic should be replaced with:

- `compute_canary_version_for_date`
- `compute_stable_version_for_date`

For example:

- `compute_stable_version_for_date(2026-03-17) -> 2026.317.0`
- `compute_canary_version_for_date(2026-03-17) -> 2026.317.0-canary.0`

### 2. Stop requiring `release/X.Y.Z`

These current invariants should be removed from the happy path:

- "must run from branch `release/X.Y.Z`"
- "stable and canary for `X.Y.Z` come from the same release branch"
- `release-start.sh`

Replace them with:

- canary must run from `master`
- stable may run from a pinned `source_ref`

### 3. Keep Changesets only if it stays helpful

The current system uses Changesets to:

- rewrite package versions
- maintain package-level `CHANGELOG.md` files
- publish packages

With CalVer, Changesets may still be useful for publish orchestration, but it should no longer own version selection.

Recommended implementation order:

1. keep `changeset publish` if it works with explicitly set versions
2. replace version computation with a small explicit versioning script
3. if Changesets keeps fighting the model, remove it from release publishing entirely

Paperclip's release problem is now "publish the whole fixed package set at one explicit version", not "derive the next semantic bump from human intent".

### 4. Add a dedicated versioning script

Recommended new script:

- `scripts/set-release-version.mjs`

Responsibilities:

- set the version in all public publishable packages
- update any internal exact-version references needed for publishing
- update CLI version strings
- avoid broad string replacement across unrelated files

This is safer than keeping a bump-oriented changeset flow and then forcing it into a date-based scheme.

### 5. Keep rollback based on dist-tags

`rollback-latest.sh` should stay, but it should stop assuming a semver meaning beyond syntax.

It should continue to:

- repoint `latest` to a prior stable version
- never unpublish

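
The rollback behavior described above amounts to one `npm dist-tag add` call per package. The sketch below only prints those commands instead of running them; the function name `rollback_latest_commands` is hypothetical, and the package list is an example rather than the actual contents of `rollback-latest.sh`.

```sh
#!/bin/sh
# Hypothetical dry-run helper; it prints commands and never contacts the registry.

# rollback_latest_commands VERSION PKG...: print the npm commands that would
# repoint the `latest` dist-tag of each package to VERSION.
rollback_latest_commands() (
  version=$1
  shift
  for pkg in "$@"; do
    printf 'npm dist-tag add %s@%s latest\n' "$pkg" "$version"
  done
)

rollback_latest_commands 2026.316.0 paperclipai @paperclipai/shared @paperclipai/db
# prints:
#   npm dist-tag add paperclipai@2026.316.0 latest
#   npm dist-tag add @paperclipai/shared@2026.316.0 latest
#   npm dist-tag add @paperclipai/db@2026.316.0 latest
```

Because the target is an exact version string, nothing here depends on what the numbers mean, which is the property the rollback script needs under CalVer.
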
## Tradeoffs and Risks

### 1. The stable patch slot is now part of the version contract

With `YYYY.MDD.P`, same-day hotfixes are supported, but the stable patch slot is now part of the visible version format.

That is the right tradeoff because:

1. npm still gets semver-valid versions
2. same-day hotfixes stay possible
3. chronological ordering still works as long as the day is zero-padded inside `MDD`

### 2. Public package consumers lose semver intent signaling

This is the main downside of CalVer.

If that becomes a problem, one alternative is:

- use CalVer for the CLI package only
- keep semver for library packages

That is more complex operationally, so I would not start there unless package consumers actually need it.

### 3. Auto-canary means more publish traffic

Publishing on every `master` merge means:

- more npm versions
- more git tags
- more registry noise

That is acceptable if canaries stay clearly separate:

- npm dist-tag `canary`
- no GitHub Release
- no external announcement

## Rollout Plan

### Phase 1: Security foundation

1. Create `release.yml`
2. Configure npm trusted publishers for all public packages
3. Create `npm-canary` and `npm-stable` environments
4. Add `CODEOWNERS` protection for release files
5. Verify OIDC publishing works
6. Disable token-based publishing access and revoke old tokens

### Phase 2: Canary automation

1. Add canary workflow on `push` to `master`
2. Add explicit calendar-version computation
3. Add canary git tagging
4. Remove changelog requirement from canaries
5. Update `doc/RELEASING.md`

### Phase 3: Stable promotion

1. Add manual stable workflow with `source_ref`
2. Require stable notes file
3. Publish stable + tag + GitHub Release
4. Update rollback docs and scripts
5. Retire release-branch assumptions

### Phase 4: Cleanup

1. Remove `release-start.sh` from the primary path
2. Remove `patch/minor/major` from maintainer docs
3. Decide whether to keep or remove Changesets from publishing
4. Document the CalVer compatibility contract publicly

## Concrete Recommendation

Paperclip should adopt this model:

- stable versions: `YYYY.MDD.P`
- canary versions: `YYYY.MDD.P-canary.N`
- canaries auto-published on every push to `master`
- stables manually promoted from a chosen tested commit or canary tag
- no release branches in the default path
- no canary changelog files
- no canary GitHub Releases
- no Claude token in GitHub Actions
- no npm automation token in GitHub Actions
- npm trusted publishing plus GitHub environments for release security

That gets rid of the annoying part of semver without fighting npm, makes canaries cheap, keeps stables deliberate, and materially improves the security posture of the public repository.

## External References

- npm trusted publishing: https://docs.npmjs.com/trusted-publishers/
- npm dist-tags: https://docs.npmjs.com/adding-dist-tags-to-packages/
- npm semantic versioning guidance: https://docs.npmjs.com/about-semantic-versioning/
- GitHub environments and deployment protection rules: https://docs.github.com/en/actions/how-tos/deploy/configure-and-manage-deployments/manage-environments
- GitHub secrets behavior for forks: https://docs.github.com/en/actions/how-tos/write-workflows/choose-what-workflows-do/use-secrets