Engineering Standards
The machine-readable spec that drives automated conformance checks across every service. Fetched live from ecosystem-standards at build time.
Principles (9 rules)
Reuse over rebuild
Work in one layer should have a clear path to surfacing in another. Cogs feed APIs. APIs feed sites. Shared libraries reduce duplication across all layers.
Pipeline resilience — continue on item failure
Pipelines continue on per-file errors. One bad input does not abort a run. Failures are logged, prefixed, and left for retry — never silently dropped.
AI evaluates process, not data
AI is infrastructure, not a feature. It runs during the pipeline to evaluate health and conformance — not to enrich output data for end users.
New stack, opportunistic migration
New things are built with the current best stack. Existing things migrate when the opportunity arises — not on a forced schedule.
Observability is not optional
If you cannot see what a system is doing in production, you cannot deploy it confidently. Logging, error tracking, and run history are requirements, not enhancements. Every production service must have all three.
Deploy anytime, release when ready
Code and features are decoupled. Deployments are safe at any time. Feature flags control what users see. A flag should be the first tool considered when a feature spans multiple services or when the frontend needs to ship before the API is ready. See CD-001 for the full feature flag standard.
Portfolio artifacts are working systems
Portfolio pieces are live, queryable, demonstrable systems — not case studies, READMEs, or screenshots.
Every bug fix includes a regression test
If a bug was found in production, a test is written that would have caught it. This is non-negotiable.
Standards are a living document
This document is updated as the ecosystem evolves. Outdated standards are worse than no standards — they create confusion and erode trust in the process. Standards must be reviewed at least every 90 days.
Python (16 rules)
uv for dependency management
uv replaces Poetry. Faster installs, simpler lockfile, better resolution. All new cogs use uv. Existing cogs migrate when touched.
ruff for linting and formatting
ruff replaces black + isort + flake8. Single tool, significantly faster. Configured in pyproject.toml.
Python 3.11+ minimum version
Minimum version for new work is Python 3.11. Python 3.13 preferred. Type hints used throughout.
Pydantic for all external data validation
All external data (CSV rows, API responses, Drive file metadata) validated through Pydantic models before processing. Replaces ad-hoc dict access.
hatchling as build backend
hatchling is the ecosystem standard build backend for all Python packages. Already used by common-python-utils. Requires no [tool.setuptools] config. New repos must use hatchling from day one. Existing repos migrate as a chore commit.
ruff line length is 88
All Python repos set line-length = 88 in [tool.ruff] in pyproject.toml. Matches Python community default and Black. Ensures cross-repo consistency.
src layout required
All packages use src/<package_name>/ + tests/<package_name>/. No flat layouts.
common-python-utils declared as dependency
All cogs declare common-python-utils as a dependency. Shared behaviors live there, not duplicated per-repo. Flag if cog re-implements logging, API clients, or metadata helpers.
pyproject.toml as single source of truth
pyproject.toml is the only configuration file. No setup.py, no requirements.txt.
pre-commit configured
ruff and basic hooks configured via .pre-commit-config.yaml. Runs on every commit.
Naming conventions — Python
- Package name: snake_case matching repo name (hyphens become underscores)
- Module files: snake_case verb phrases (process_new_files.py)
- Functions: snake_case descriptive verbs (generate_dj_set_collection())
- Constants: UPPER_SNAKE_CASE in the config module
- Pydantic models: PascalCase nouns (DjSetRecord, TrackRow)
- Log messages: emoji prefix for lifecycle events (🚀 start ✅ success ❌ failure)
FAILED_ prefix for failed inputs
Files that fail processing are renamed FAILED_<original> in the source folder for manual retry. Failed files must not be silently deleted or left unmarked.
possible_duplicate_ prefix for duplicates
Duplicate files are renamed rather than overwritten or deleted. Human review required.
finally for temp file cleanup
Temp files created during processing are always cleaned up in a finally block regardless of success or failure.
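A small stdlib-only sketch of the rule; the processing step is a stand-in:

```python
import os
import tempfile

def process_with_temp(data: bytes) -> int:
    """Whatever happens mid-processing, the temp file is removed."""
    fd, path = tempfile.mkstemp(suffix=".part")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        return os.path.getsize(path)  # stand-in for the real processing step
    finally:
        if os.path.exists(path):
            os.remove(path)  # runs on success AND on failure
```

If the processing step raises, the `finally` block still runs before the exception propagates, so no `.part` files accumulate across failed runs.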
No getattr() access for undeclared Settings fields
pydantic-settings with extra="ignore" silently drops env vars not declared as fields. Using getattr(settings, "KEY", default) on an undeclared key is always equivalent to using the hardcoded default. This pattern is prohibited. Any key accessed via settings.KEY or getattr(settings, "KEY", ...) must be declared as a typed field on the Settings class. Deferred fields must be documented in a commented-out stub block in config.py.
Every key in .env.example must be declared in Settings
Any key in .env.example that is not a declared field on Settings creates a misleading contract. Exception: keys consumed by external tooling (RAILWAY_*, NODE_ENV) may be listed with a comment indicating they are not read by the app.
Testing (14 rules)
Testing pyramid — right ratio, not maximum coverage
The goal is the right ratio of test types, not maximum line coverage.
- Unit tests (many): pure functions in isolation, no I/O, no external calls. Fast.
- Integration tests (some): cog interaction with mocked external dependencies. One per major external integration.
- E2E tests (few): happy path only, run nightly rather than on every commit. One per pipeline, covering the full flow.
Normalization test required per cog
A test that verifies the normalization/cleaning logic on representative input. Covers the core transformation the cog exists to do.
Deduplication test required per cog
A test that verifies duplicate inputs are correctly identified and handled — not silently overwritten.
Failure path test required per cog
A test that verifies the cog handles a bad input correctly — logs the error, marks the file, and continues rather than aborting.
Output shape test required per cog
A test that verifies the output (JSON schema, DB row, or file structure) matches the expected contract.
pytest as test runner
pytest is the test runner for all Python projects. Configured in pyproject.toml.
pytest-cov for coverage in CI
Coverage measured on every CI run. Report in terminal. Threshold not enforced by number — enforced by critical path coverage (TEST-001 through TEST-004).
respx/httpx for HTTP mocking — no real external calls
respx/httpx used for mocking HTTP calls in integration tests. No real external calls to Drive, Sheets, or any external API in unit or integration tests.
mypy must run in CI if [tool.mypy] is declared
If a repo declares [tool.mypy] in pyproject.toml, a CI step invoking "uv run mypy src/" (or equivalent) is required. A mypy config that is never run in CI gives false assurance and drifts silently. Exception: mypy may be omitted during initial setup if repo is in a documented "typing: in progress" state. This exception must not persist past first stable release.
FastAPI TestClient for API tests
FastAPI's TestClient via httpx used for all API endpoint tests. Tests run without a live server.
Database fixtures — no production data in tests
Tests use a separate test database or in-memory SQLite with transaction rollback. Never run against the production database.
Contract test for every API endpoint
Every API endpoint has a dedicated test asserting the response envelope shape: { data, meta } on success, { error: { code, message } } on failure. One contract test per endpoint minimum. Contract tests run in CI on every push.
Mock verification required
Every mock in a test must be verified with assert_called() or assert_called_once(). Tests that pass because a mock was never called are false positives.
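A stdlib sketch of the rule; the notification function is hypothetical:

```python
from collections.abc import Callable
from unittest.mock import Mock

def notify_on_failure(failed: list[str], alert: Callable[[str], None]) -> None:
    """Hypothetical code under test: alerts only when something failed."""
    if failed:
        alert(f"{len(failed)} file(s) failed")

# Verify the mock was actually exercised, not just constructed.
alert = Mock()
notify_on_failure(["FAILED_a.csv"], alert)
alert.assert_called_once()  # raises AssertionError if never called

quiet = Mock()
notify_on_failure([], quiet)
assert not quiet.called  # the no-call expectation is also explicit
```

Without `assert_called_once()`, a refactor that stops invoking `alert` entirely would leave this test green, which is exactly the false positive the rule prohibits.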
GAP: Most existing cogs have import-level tests only
Most existing cogs currently have import-level tests only. This is a known gap with active remediation underway. Remediation order: deejay-cog first (new platform foundation), then by activity level. When a cog is remediated, this gap entry is closed and the cog is marked as compliant with TEST-001 through TEST-004.
Documentation (13 rules)
README.md is mandatory
Every repo has a README. It describes: purpose (one paragraph), inputs, outputs, environment variables, how to run locally, how to run tests, and versioning policy.
README describes inputs and outputs
For processors: what files are expected, where they come from, what is produced and where it goes. For APIs: what endpoints exist and what they serve.
CHANGELOG.md required
Tracks meaningful changes per version. Not every commit — only changes that affect behavior, interface, or configuration. Managed by semantic-release. Never edited manually.
.env.example is current
Every environment variable used by the service is documented in .env.example with a description and example value. Kept current — not a one-time artifact.
Design decisions captured
When a significant architectural decision is made, the rationale is written down — in docs/DESIGN.md, a CHANGELOG entry, or a README note. Not just what was decided but why.
README "Running locally" section is complete
Every repo's README includes a "Running locally" section that covers: (1) prerequisites — Python version (≥3.11) and uv; (2) install — `uv sync --all-extras`; (3) pre-commit — `uv run pre-commit install` (run once after cloning) and `uv run pre-commit run --all-files` (run manually at any time); (4) run — the exact command(s) to execute the service or scripts; (5) test — `uv run pytest` and the coverage variant. The section must be kept current. Copy-paste from the README must work on a clean clone with no prior knowledge of the repo.
Docstrings on all public functions and classes
All public functions and classes have docstrings. One sentence minimum describing what the function does, not how.
Pydantic field descriptions required
Pydantic model fields use the description parameter. Models are self-documenting — no separate documentation needed for data shapes.
No dead code
Commented-out code is removed, not left in place. Version control is the history — the codebase is the present.
Split package identity documented at entry point
If a Python library's install name (pyproject.toml [project] name) differs from its import namespace (src/ directory name), both names and their relationship must be documented in __init__.py docstring. README must show both correct install snippet and correct import path.
OpenAPI docs are first-class
FastAPI's /docs is a deliverable, not a side effect. Every endpoint must be complete and accurate before the service is considered done. All endpoints have summary, description, and response_model defined.
Public endpoints documented as intentional
Public (unauthenticated) endpoints include a note in their description confirming they are intentionally public.
Standards document is versioned
Every update to the standards repo increments the version in index.yaml. The AI evaluator always references a specific version. Evaluations without a standards version reference are invalid.
API (10 rules)
All new API services on Railway
All new API services deployed on Railway regardless of language. Python services use FastAPI. TypeScript services use Hono on Node. Railway is the single hosting standard. Cloudflare Workers deprecated for new API work.
PostgreSQL as data store for new services
All new services use PostgreSQL deployed on Railway. Not D1 or SQLite.
ORM required — SQLAlchemy (Python) or Drizzle (TypeScript)
Python: SQLAlchemy async with asyncpg driver, Pydantic-compatible models, Alembic for migrations. TypeScript: Drizzle ORM with postgres.js driver, TypeScript-native schema definitions, Drizzle Kit for migrations. No raw SQL without ORM.
Versioned routes — /v1/<resource>
All routes versioned from day one: /v1/<resource>. No unversioned routes in production.
Response envelope on all endpoints
All responses wrapped: { data: ..., meta: { count, version } }. Errors: { error: { code, message } }. No bare arrays or inconsistent shapes.
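A minimal sketch of helpers that produce the envelope; the real ones live in the shared libraries, and these names are illustrative:

```python
SCHEMA_VERSION = "1"  # illustrative; real versioning comes from the service

def success(data: list) -> dict:
    """Wrap a successful payload in the standard envelope."""
    return {"data": data, "meta": {"count": len(data), "version": SCHEMA_VERSION}}

def failure(code: str, message: str) -> dict:
    """Wrap an error in the standard error envelope."""
    return {"error": {"code": code, "message": message}}
```

Centralizing envelope construction in one place is what makes the cross-stack parity rule (see Cross-Stack) enforceable: handlers never hand-build `{ data, meta }` inline.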
owner_id on all tables
Every table includes owner_id (Clerk user ID). Not enforced as FK today but present for future multi-tenant use.
Clerk for authentication
Authentication via Clerk across all services. Consistent across the ecosystem.
Public endpoints explicit and intentional
Public (unauthenticated) endpoints are explicitly documented and intentional — not accidental.
No unverified write endpoints reachable from the public internet
Any FastAPI service with write endpoints (POST, PATCH, DELETE) that depend on an unverified header (X-Owner-Id or equivalent) must either be provably isolated on a private network (Railway private networking, no public port), OR have CLERK_AUTH_ENABLED=true and RS256 JWT verification active. A module-level docstring in auth.py documenting the current posture and upgrade path is required regardless of which condition is satisfied.
API auth scheme and HTTP client must match
When kaianolevine-api's auth scheme changes, CommonPythonApiClient._headers() in common-python-utils must change in the same release. A mismatch causes all pipeline writes to return 401. Auth mechanism used by the API and sent by its client must be documented in the same location (auth.py cross-referencing common-python-utils and vice versa).
Pipeline (9 rules)
Prefect for new pipelines
New event-driven pipelines use Prefect. Python-native, built-in observability, retry logic, run history. GitHub Actions is for CI/CD only — not pipeline orchestration.
Idempotent pipeline steps
Every pipeline step can be safely re-run without side effects. Re-running a step produces the same result as running it once.
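One way to make a step trivially idempotent is to keep it pure and deterministic; this sketch (function and record shapes are hypothetical) dedupes and sorts, so running it on its own output changes nothing:

```python
def build_collection(records: list[dict]) -> list[dict]:
    """Deterministic collection rebuild: same input, same output.

    Re-processing an already-seen record is a no-op, so the step can be
    re-run safely after a partial failure.
    """
    seen: set[str] = set()
    out: list[dict] = []
    for rec in sorted(records, key=lambda r: r["id"]):
        if rec["id"] not in seen:
            seen.add(rec["id"])
            out.append(rec)
    return out
```

For steps with side effects (writing files, DB rows), the equivalent property is achieved by deriving output identity from input identity and replacing wholesale rather than appending.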
Separate process and collect steps
Processing new inputs and rebuilding the collection are separate pipeline steps — not combined. Allows independent retry.
Concurrency groups required for shared resources
Pipelines writing to shared resources declare a concurrency group. cancel-in-progress: false for data pipelines.
Archive, never delete raw inputs
Raw inputs are archived after processing, never deleted. Archive subfolder used.
Dual logger pattern in Prefect flows
Flow functions use get_run_logger() for Prefect-visible logs, with a fallback to the standard module logger for local runs: wrap the get_run_logger() call in try/except and fall back to the module-level logger when no flow-run context is available.
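A minimal sketch of the fallback, with an illustrative module logger name:

```python
import logging

log = logging.getLogger("my_cog")  # module-level fallback logger

def get_logger():
    """Prefer Prefect's run logger inside a flow run; fall back locally."""
    try:
        from prefect import get_run_logger  # assumes Prefect is installed
        return get_run_logger()  # raises when called outside a flow run
    except Exception:
        return log
```

Catching broad `Exception` is deliberate here: both "Prefect not installed" and "no active run context" should degrade to the local logger rather than crash a local script.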
Retry logic on external API calls
Tasks that call external APIs use retries=2, retry_delay_seconds=30. Tasks calling Google APIs inherit common-python-utils retry logic.
watcher-cog as canonical Drive trigger
The canonical Drive-event trigger pattern is: file appears in a watched Drive folder → watcher-cog detects the change (1-minute poll via the Drive API) → watcher-cog calls the Prefect API to create a flow run → Prefect executes the cog flow. watcher-cog is an always-on Railway worker service. It is config-driven: adding a new folder-to-cog mapping requires one WatcherConfig entry — no code changes. The previous pattern (Apps Script → repository_dispatch → GitHub Actions) is retired. google-app-script-trigger is archived.
AI evaluation step as final pipeline task
All production pipelines include an AI evaluation step as the final task. The step assesses run conformance against the current standards version and writes findings to the pipeline_evaluations table.
Frontend (10 rules)
Astro for all static sites
All portfolio, technical, and community sites use Astro. Static-first, component islands for interactivity. If the primary product is content delivery, use Astro. If it is an app with persistent UI state, use React.
Vite + React + TypeScript for web apps
Standard stack for all React web apps. Vite for bundling, React 19, TypeScript throughout. No exceptions.
Tailwind CSS for styling
Tailwind is the styling primitive for all React apps and Astro sites. No CSS modules, no styled-components, no other CSS framework.
shadcn/ui for components
shadcn/ui is the component library standard. Components are copied into src/components/ui/ — not installed as a dependency. Built on Radix UI primitives and Tailwind.
React Hook Form + Zod for forms and validation
All forms use React Hook Form. All validation schemas use Zod. These are assumed by shadcn/ui form components and are the ecosystem standard.
Graceful degradation — build succeeds without API
Build succeeds with empty or unavailable API data. Site does not break if API is down.
No hardcoded API URLs
No hardcoded API URLs. PUBLIC_API_URL (Astro) or VITE_API_URL (React) env var used throughout.
Pinned starter version
Astro sites pin their starter version. Upstream preserved in-repo for reference. Do not blindly upgrade starter versions — pin and document the version in use.
Build-time data for static content
Infrequently-changing data (collections, summaries) fetched at build time via Astro data files. Do not make runtime API calls for content that could be fetched at build time.
Runtime queries for interactive demos only
Live API queries reserved for search, filtering, and interactive demo surfaces. Implemented via Astro component islands. Document which endpoints are called at runtime vs build time.
Delivery (20 rules)
Feature flags for major functionality toggles
Feature flags are used to decouple deployment from release. They gate major sections of functionality — not minor implementation details.

Canonical use cases:
- Kill switches: disable a risky integration (e.g. Anthropic API calls in evaluator-cog) without a redeploy.
- Readiness gates: API ships a new endpoint; frontend UI is hidden until the flag is enabled. Allows frontend code to be merged and deployed independently of API readiness.
- Maintenance mode: return 503 gracefully during migrations without touching code.

Flags are stored in the feature_flags table in api-kaianolevine-com's PostgreSQL database and served via GET /v1/feature-flags. The public read endpoint is unauthenticated. Write endpoints (POST, PATCH, DELETE /v1/feature-flags/:key) require Clerk JWT auth with admin role.

Flag naming convention: <service>.<feature>. Examples: evaluator_cog.llm_soft_rules, evaluator_cog.conformance_check, watcher_cog.drive_polling, api.maintenance_mode.

Flag anatomy — each flag must have:
- key: string identifier
- enabled: boolean
- description: why it exists and when it should be deleted
- permanent: boolean — true for infrastructure flags (maintenance_mode), false for rollout/readiness flags (must be deleted post-rollout)

Lifecycle contract:
1. Ship code with the flag check.
2. Deploy, then activate when ready by flipping the flag via the admin panel.
3. For non-permanent flags: ship a follow-up PR removing the flag check once the feature is fully rolled out and stable.
4. Delete the DB row after the cleanup PR merges.

Include a comment in code pointing to the flag key so it is easy to find: # feature flag: evaluator_cog.llm_soft_rules

Flags are NOT used for:
- Fine-grained A/B testing or percentage rollouts
- Per-user targeting
- Replacing Prefect flow configuration
- Gating minor implementation details

Clients check flags at runtime (not build time) via a short-TTL in-memory cache (30–60 seconds per process) to avoid hitting the API on every request. Astro static sites fetch flags client-side on load. Fail-open or fail-closed behavior must be intentional and documented per flag.
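The short-TTL in-memory cache can be sketched as below; `fetch_fn` stands in for the real GET /v1/feature-flags call, and the class name is hypothetical:

```python
import time

class FlagCache:
    """Per-process flag cache with a short TTL (sketch, not the real client)."""

    def __init__(self, fetch_fn, ttl_seconds: float = 45.0):
        self._fetch = fetch_fn          # callable returning {key: enabled}
        self._ttl = ttl_seconds
        self._flags: dict[str, bool] = {}
        self._loaded_at: float | None = None

    def is_enabled(self, key: str, default: bool = False) -> bool:
        now = time.monotonic()
        if self._loaded_at is None or now - self._loaded_at > self._ttl:
            try:
                self._flags = self._fetch()
                self._loaded_at = now
            except Exception:
                # On fetch failure, serve stale flags plus the caller's
                # default; whether that fails open or closed is the
                # per-flag decision the rule requires documenting.
                pass
        return self._flags.get(key, default)
```

The `default` argument is where fail-open vs fail-closed is expressed per call site: kill switches typically default to enabled (fail open), readiness gates to disabled (fail closed).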
Sentry for error tracking — all production services
All production services integrate Sentry for unhandled exception tracking. This includes FastAPI services, Hono services, and Python cogs (always-on worker services). Free tier is sufficient. Sentry is initialised at service entry point before any application logic runs. SENTRY_DSN is set as an environment variable — never hardcoded. The same Sentry project may be shared across related services or a dedicated project used per service. Sentry covers Layer 3 observability (unhandled exceptions). It does not replace structured logging (Layer 2) or liveness monitoring (Layer 1).
Structured logging via shared library
All Python services use the common-python-utils logger. All TypeScript services use the kaiano-ts-utils logger. Never use print() in production code paths. Log output is JSON-formatted in production.

Standard log event shape (all services):
- timestamp: ISO 8601
- service: repo name (e.g. watcher-cog, api-kaianolevine-com)
- level: DEBUG | INFO | WARN | ERROR
- category: infra | pipeline | data | api
- event: snake_case event name (e.g. trigger_fired, file_processed)
- context: key-value pairs specific to the event

Category definitions:
- infra: service lifecycle, triggers fired/not fired, heartbeats, Drive poll results
- pipeline: file processed/skipped/failed, pipeline completed/failed, Prefect flow run created
- data: data quality issues, schema violations, evaluation findings, duplicate detection
- api: HTTP errors (4xx/5xx), slow responses (>2s), external API failures, auth failures

Emoji prefixes are used for human-readable local output only. Structured JSON is the production format.
GitHub Actions version tags must be valid
All "uses: owner/repo@vN" references in .github/workflows/ must reference version tags that exist in the action's release history. Invalid version tags cause CI to fail silently. Current pinned versions (March 2026): actions/checkout@v6, actions/setup-node@v6, astral-sh/setup-uv@v7. Verify the tag exists before committing any version pin.
Conventional Commits format
All commit messages follow: type: description. Types: feat, fix, docs, refactor, chore, test, ci.
BREAKING CHANGE footer for major bumps
Use explicit BREAKING CHANGE in commit body for major version bumps. The feat!: shorthand is unreliable. Correct pattern using two -m flags: git commit -m 'feat: description' -m 'BREAKING CHANGE: explanation'
semantic-release on all repos
Every repo has .releaserc.json and a release job in ci.yml. On merge to main: tests run, semantic-release reads commits, determines bump, updates version file, updates CHANGELOG.md, creates git tag, creates GitHub Release.
Never manually edit version files or CHANGELOG
Never manually edit version in pyproject.toml or package.json. Never manually edit CHANGELOG.md. Both are owned by semantic-release.
fetch-depth: 0 on CI checkout
All CI jobs that run semantic-release must use fetch-depth: 0 on the checkout step. Without this semantic-release cannot read full git history and will not release correctly.
Plugins installed explicitly before semantic-release
semantic-release plugins are installed via npm install --no-save in the release job before running npx semantic-release. Not via package.json. Required because npx only installs the core package.
Prefect Cloud for pipeline observability
All pipeline flows connect to Prefect Cloud (free tier). Run history, step logs, and flow state are visible at app.prefect.cloud. GitHub Actions logs are for deep debugging only — not the primary observability surface.
GitHub Actions is CI/CD only — not a trigger relay
GitHub Actions handles lint, tests, deploys, and semantic-release. It does not trigger pipeline flows. Drive-event triggers go through watcher-cog. Scheduled and manual triggers go through Prefect deployments. repository_dispatch as a trigger relay for cogs is a retired pattern.
Healthchecks.io for always-on worker services
Always-on Railway worker services (services with no HTTP port that run continuously) integrate Healthchecks.io as a dead man's switch. The service pings HEALTHCHECKS_URL on every work cycle. If the ping goes silent beyond the grace period, Healthchecks.io sends an email alert. Recommended settings: period 1 minute, grace 5 minutes. HEALTHCHECKS_URL is set as an environment variable — never hardcoded. Free tier is sufficient.
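A stdlib-only sketch of the per-cycle ping; the function name is illustrative:

```python
import os
import urllib.request

def ping_healthcheck(timeout: float = 10.0) -> bool:
    """Ping the dead man's switch once per work cycle.

    Never raises: a monitoring failure must not take down the worker.
    """
    url = os.environ.get("HEALTHCHECKS_URL")  # never hardcoded
    if not url:
        return False  # e.g. local development, where no check is configured
    try:
        urllib.request.urlopen(url, timeout=timeout)
        return True
    except Exception:
        return False
```

The call goes at the end of each successful work cycle, so a hung or crashed loop stops pinging and trips the alert after the grace period.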
Log levels used consistently and intentionally
Log levels follow a strict contract across all services:
- DEBUG: internal state useful for local debugging only. Never emitted in production by default.
- INFO: meaningful lifecycle events — service started, file processed, trigger fired, pipeline completed. Reviewable weekly. Default production level.
- WARN: something unexpected happened but the service recovered. Skipped files, retried calls, degraded behavior. Reviewed promptly.
- ERROR: something failed and requires attention. Unhandled exceptions, failed triggers, data integrity violations. Reviewed immediately.

Never use ERROR for expected failure modes (e.g. file not found when that is a valid outcome). Never use INFO for high-frequency noise that would obscure meaningful events.
Structured log event shape — all services
All meaningful log events emitted by production services include these fields:
- timestamp: ISO 8601
- service: repo name (e.g. watcher-cog, api-kaianolevine-com)
- level: DEBUG | INFO | WARN | ERROR
- category: infra | pipeline | data | api
- event: snake_case event name (e.g. trigger_fired, file_processed)
- context: key-value pairs specific to the event

Category definitions:
- infra: service lifecycle, triggers fired/not fired, heartbeats, Drive poll results
- pipeline: file processed/skipped/failed, pipeline completed/failed, Prefect flow run created
- data: data quality issues, schema violations, evaluation findings, duplicate detection
- api: HTTP errors (4xx/5xx), slow responses (>2s), external API failures, auth failures

Log output is JSON in production, human-readable with emoji prefixes in local development. The shared library (common-python-utils for Python, kaiano-ts-utils for TypeScript) provides the logger — services do not configure logging inline.
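The standard event shape can be sketched as a single helper; in the real ecosystem this lives in common-python-utils, so the function below is illustrative only:

```python
import json
import logging
from datetime import datetime, timezone

def log_event(logger, level, *, service, category, event, **context):
    """Emit one standard-shape event as a single JSON line (sketch)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "service": service,       # repo name, e.g. watcher-cog
        "level": logging.getLevelName(level),
        "category": category,     # infra | pipeline | data | api
        "event": event,           # snake_case, e.g. file_processed
        "context": context,       # event-specific key-value pairs
    }
    logger.log(level, json.dumps(record))
    return record
```

Because every event shares one shape, Railway's log viewer (or Better Stack) can filter by `category` and `event` across all services uniformly.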
Three-layer observability stack — all production services
Every production service implements all three observability layers:
- Layer 1 — Liveness: Healthchecks.io for always-on workers. Railway auto-restart for all services.
- Layer 2 — Structured logs: common-python-utils or kaiano-ts-utils logger emitting JSON events with the standard shape. Queryable in the Railway log viewer or Better Stack.
- Layer 3 — Exceptions: Sentry capturing all unhandled exceptions with full stack trace and context.

These three layers are non-overlapping and non-redundant:
- Layer 1 catches: process died, service unreachable
- Layer 2 catches: business logic errors, skipped work, slow operations
- Layer 3 catches: unhandled exceptions and crashes

A service missing any layer has a blind spot in production. Pipelines additionally require Prefect run history as a fourth layer covering orchestration-level observability.
Doppler as canonical secret store
Doppler is the single source of truth for all infrastructure secrets across the MiniAppPolis ecosystem. This includes API keys, database URLs, service tokens, internal API keys, Sentry DSNs, Healthchecks URLs, and Clerk credentials.

Doppler syncs automatically to:
- Railway: all API services and cogs receive secrets via the Doppler → Railway native sync
- GitHub Actions: CI secrets synced via the Doppler → GitHub integration
- Cloudflare Pages: secrets synced via the Doppler CLI in CI/CD

Secret management workflow:
- All secret changes are made in Doppler only
- Doppler pushes changes to downstream platforms automatically
- Never manually set secrets in Railway, GitHub, or Cloudflare if Doppler is managing that service
- .env.example lists all required secret keys with no values — this file is the human-readable contract of what a service needs

Doppler project structure: one Doppler project per service or cog. Environments: development, staging, production (minimum).

The only known carve-out is Prefect Cloud. Prefect uses its own encrypted Blocks for flow-level secrets. Prefect Blocks are managed directly in Prefect Cloud and are not synced from Doppler. When a secret used in a Prefect Block is rotated, Prefect must be updated manually. This is a known limitation — document it per-secret in Doppler's description field.
Internal service-to-service auth via per-caller API keys
Internal calls between Railway services and cogs use per-caller API keys passed via the X-Internal-API-Key header. Each caller (each cog or service making internal API calls) has its own key — not a single shared secret across all callers.

Per-caller keys allow:
- Independent rotation without touching all callers
- Caller identity logging on every internal request
- Revocation of a single caller's access without affecting others

All internal API keys are stored in Doppler and injected at runtime. The receiving service (api-kaianolevine-com) validates the key and logs the caller identity on every internal request. common-python-utils provides get_internal_headers(), which returns the correct header dict for outgoing internal requests. All Python cogs use this utility — never construct the header inline. Clerk JWT auth is for user-facing requests only. Internal service-to-service calls never use Clerk JWTs.
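The shared helper might look roughly like this; the exact signature and the `INTERNAL_API_KEY` env var name are assumptions, not the real common-python-utils implementation:

```python
import os

def get_internal_headers() -> dict[str, str]:
    """Sketch of the shared-library helper for internal requests.

    The per-caller key is injected by Doppler at runtime via an
    environment variable (name illustrative), never hardcoded.
    """
    key = os.environ.get("INTERNAL_API_KEY")
    if not key:
        raise RuntimeError("INTERNAL_API_KEY is not configured")
    return {"X-Internal-API-Key": key}
```

Failing loudly on a missing key is deliberate: a cog that silently sends unauthenticated internal requests would produce confusing 401s at the receiving service instead of a clear startup error.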
Feature flag admin panel in kaianolevine.com
The admin control panel for feature flags lives at an /admin route in the kaianolevine.com software portfolio site. It is gated behind Clerk authentication with admin role verification.

Capabilities:
- View all flags and their current state
- Toggle flags on/off
- Create new flags (key, description, permanent boolean)
- Delete flags (permanent flags require confirmation)

The admin panel calls authenticated write endpoints on api-kaianolevine-com. The public GET /v1/feature-flags endpoint remains unauthenticated for use by Astro sites and cogs. Each site (kaianolevine.com, wcs.kaianolevine.com, deejaytools.com) has its own Clerk application and independent user pool. Admin access to the flag panel is scoped to the kaianolevine.com Clerk app only.
Pull before branching
Always run git pull origin main before creating a new feature branch. Prevents merge conflicts with semantic-release commits (chore(release): X.Y.Z [skip ci]) that land on main after every release.
Evaluation (7 rules)
Evaluation findings stored, not ephemeral
Evaluation results are written to the database alongside pipeline outputs. They are reviewable, queryable, and feed the portfolio surface. Results logged only are not acceptable for production pipelines.
Every evaluation references a standards version
Every evaluation call includes a reference to the current version of the standards document being evaluated against. Without the rubric, evaluation is undefined.
Findings are specific and actionable
Findings reference a specific standard by ID, describe what was observed, and suggest a concrete remediation. Vague findings are not acceptable.
Structural conformance dimension
Evaluation dimension: does the repo/workflow follow the patterns in the standards? Covers src layout, naming conventions, error handling, tooling choices.
Pipeline consistency dimension
Evaluation dimension: did this run behave the same way as previous runs? Covers unexpected timing changes, new error types, missing steps.
Standards currency dimension
Evaluation dimension: has this repo been evaluated against the current version of the standards document? Is it behind? Flag if the last evaluation used a standards version more than one minor version behind current.
Deterministic conformance checks — partial coverage
The conformance flow in evaluator-cog implements ~12 deterministic checks covering file presence, pyproject.toml, CI YAML, AST scanning, and test structure. Approximately 38 of the 50 checkable rules are not yet covered deterministically and fall to LLM assessment only. Remediation: backfill deterministic checks one domain at a time, starting with delivery.yaml (CD-004 action version pinning) and documentation.yaml (DOC-013 README completeness).
Cross-Stack (2 rules)
Use shared library — do not reimplement shared behaviors
Python services declare common-python-utils as a dependency. TypeScript services declare kaiano-ts-utils as a dependency. Neither stack reimplements logging, auth verification, or response helpers provided by the shared library.
Cross-stack response shape parity
API response shapes are identical across Python and TypeScript services:
- Success: { data: ..., meta: { count, version } }
- Error: { error: { code, message } }

Defined in kaiano-ts-utils for TypeScript. Enforced by response_model in FastAPI.