Engineering Standards

The machine-readable spec that drives automated conformance checks across every service. Fetched live from ecosystem-standards at build time.

v1.6.0 · 110 rules · 51 checkable
Principles · 9 rules
PRIN-001 · ERROR · requirement

Reuse over rebuild

Work in one layer should have a clear path to surfacing in another. Cogs feed APIs. APIs feed sites. Shared libraries reduce duplication across all layers.

PRIN-002 · ERROR · requirement · checkable

Pipeline resilience — continue on item failure

Pipelines continue on per-file errors. One bad input does not abort a run. Failures are logged, prefixed, and left for retry — never silently dropped.
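
A minimal sketch of the continue-on-failure loop this rule describes. The names run_batch and process_one are illustrative assumptions, not part of the standard or of common-python-utils.

```python
import logging
from typing import Callable, Iterable

log = logging.getLogger("pipeline")


def run_batch(
    items: Iterable[str], process_one: Callable[[str], None]
) -> tuple[list[str], list[str]]:
    """Process every item; collect failures instead of aborting the run."""
    succeeded: list[str] = []
    failed: list[str] = []
    for item in items:
        try:
            process_one(item)
            succeeded.append(item)
        except Exception:
            # Log with traceback and keep going: one bad input must not stop the batch.
            log.exception("❌ failed to process %s", item)
            failed.append(item)
    return succeeded, failed


def _demo_process(name: str) -> None:
    # Hypothetical processor that rejects any file whose name starts with "bad".
    if name.startswith("bad"):
        raise ValueError(f"cannot parse {name}")


ok, bad = run_batch(["a.csv", "bad.csv", "c.csv"], _demo_process)
```

The failed list is returned rather than discarded so the caller can mark those inputs for retry.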

PRIN-003 · ERROR · requirement

AI evaluates process, not data

AI is infrastructure, not a feature. It runs during the pipeline to evaluate health and conformance — not to enrich output data for end users.

PRIN-004 · INFO · requirement

New stack, opportunistic migration

New things are built with the current best stack. Existing things migrate when the opportunity arises — not on a forced schedule.

PRIN-005 · ERROR · requirement · checkable

Observability is not optional

If you cannot see what a system is doing in production, you cannot deploy it confidently. Logging, error tracking, and run history are requirements, not enhancements. Every production service must have all three.

PRIN-006 · WARN · requirement

Deploy anytime, release when ready

Code and features are decoupled. Deployments are safe at any time. Feature flags control what users see. A flag should be the first tool considered when a feature spans multiple services or when the frontend needs to ship before the API is ready. See CD-001 for the full feature flag standard.

PRIN-007 · INFO · requirement

Portfolio artifacts are working systems

Portfolio pieces are live, queryable, demonstrable systems — not case studies, READMEs, or screenshots.

PRIN-008 · ERROR · requirement

Every bug fix includes a regression test

If a bug was found in production, a test is written that would have caught it. This is non-negotiable.

PRIN-009 · WARN · requirement · checkable

Standards are a living document

This document is updated as the ecosystem evolves. Outdated standards are worse than no standards — they create confusion and erode trust in the process. Standards must be reviewed at least every 90 days.

Python · 16 rules
PY-001 · WARN · requirement · checkable

uv for dependency management

uv replaces Poetry. Faster installs, simpler lockfile, better resolution. All new cogs use uv. Existing cogs migrate when touched.

PY-002 · WARN · requirement · checkable

ruff for linting and formatting

ruff replaces black + isort + flake8. Single tool, significantly faster. Configured in pyproject.toml.

PY-003 · WARN · requirement · checkable

Python 3.11+ minimum version

Minimum version for new work is Python 3.11. Python 3.13 preferred. Type hints used throughout.

PY-004 · WARN · requirement

Pydantic for all external data validation

All external data (CSV rows, API responses, Drive file metadata) validated through Pydantic models before processing. Replaces ad-hoc dict access.

PY-009 · INFO · convention · checkable

hatchling as build backend

hatchling is the ecosystem standard build backend for all Python packages. Already used by common-python-utils. Requires no [tool.setuptools] config. New repos must use hatchling from day one. Existing repos migrate as a chore commit.

PY-010 · INFO · convention · checkable

ruff line length is 88

All Python repos set line-length = 88 in [tool.ruff] in pyproject.toml. This matches the default used by both ruff and Black and ensures cross-repo consistency.

PY-005 · ERROR · requirement · checkable

src layout required

All packages use src/<package_name>/ + tests/<package_name>/. No flat layouts.

PY-006 · ERROR · requirement · checkable

common-python-utils declared as dependency

All cogs declare common-python-utils as a dependency. Shared behaviors live there, not duplicated per-repo. Flag if a cog re-implements logging, API clients, or metadata helpers.

PY-007 · WARN · requirement · checkable

pyproject.toml as single source of truth

pyproject.toml is the only packaging and tool configuration file. No setup.py, no requirements.txt.

PY-008 · WARN · requirement · checkable

pre-commit configured

ruff and basic hooks configured via .pre-commit-config.yaml. Runs on every commit.

PY-011 · WARN · convention

Naming conventions — Python

- Package name: snake_case matching the repo name (hyphens become underscores).
- Module files: snake_case verb phrases (process_new_files.py).
- Functions: snake_case descriptive verbs (generate_dj_set_collection()).
- Constants: UPPER_SNAKE_CASE in the config module.
- Pydantic models: PascalCase nouns (DjSetRecord, TrackRow).
- Log messages: emoji prefix for lifecycle events (🚀 start, ✅ success, ❌ failure).

PY-012 · WARN · requirement

FAILED_ prefix for failed inputs

Files that fail processing are renamed FAILED_<original> in the source folder for manual retry. Failed files must not be silently deleted or left unmarked.
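
A minimal sketch of the rename, assuming the source folder is a local directory. The helper name mark_failed and the choice not to re-prefix an already marked file are assumptions made for illustration.

```python
from pathlib import Path

FAILED_PREFIX = "FAILED_"


def mark_failed(path: Path) -> Path:
    """Rename a file to FAILED_<original> in place and return the new path."""
    if path.name.startswith(FAILED_PREFIX):
        # Already marked; avoid producing FAILED_FAILED_<original>.
        return path
    target = path.with_name(FAILED_PREFIX + path.name)
    path.rename(target)
    return target
```

The file stays in the same folder, so a manual retry only needs the prefix removed.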

PY-013 · WARN · requirement

possible_duplicate_ prefix for duplicates

Duplicate files are renamed rather than overwritten or deleted. Human review required.

PY-014 · WARN · requirement

finally for temp file cleanup

Temp files created during processing are always cleaned up in a finally block regardless of success or failure.
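
A short sketch of the cleanup pattern, using only the standard library. The function name and the work callback are hypothetical.

```python
import os
import tempfile
from typing import Callable


def process_with_temp_file(data: bytes, work: Callable[[str], None]) -> None:
    """Write data to a temp file, run work(path), and always remove the file."""
    fd, tmp_path = tempfile.mkstemp(suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        work(tmp_path)
    finally:
        # Runs on success and on failure, so the temp file never outlives the call.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
```

Any exception raised by work still propagates to the caller; the finally block only guarantees that the file is gone first.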

CFG-001 · WARN · requirement · checkable

No getattr() access for undeclared Settings fields

pydantic-settings with extra="ignore" silently drops env vars not declared as fields. Using getattr(settings, "KEY", default) on an undeclared key is always equivalent to using the hardcoded default. This pattern is prohibited. Any key accessed via settings.KEY or getattr(settings, "KEY", ...) must be declared as a typed field on the Settings class. Deferred fields must be documented in a commented-out stub block in config.py.
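
One way a deterministic check for this rule might look, sketched with the standard ast module. It is not the evaluator's actual implementation: the function name is hypothetical, the set of declared fields is assumed to be supplied by the caller, and only the getattr(settings, "KEY", ...) form is covered (plain settings.KEY access is not).

```python
import ast


def undeclared_getattr_keys(
    source: str, declared: set[str], settings_name: str = "settings"
) -> list[str]:
    """Return keys read via getattr(settings, "KEY", ...) that are not declared fields."""
    found: list[str] = []
    for node in ast.walk(ast.parse(source)):
        is_getattr_call = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "getattr"
        )
        if not is_getattr_call or len(node.args) < 2:
            continue
        target, key = node.args[0], node.args[1]
        is_settings = isinstance(target, ast.Name) and target.id == settings_name
        is_literal = isinstance(key, ast.Constant) and isinstance(key.value, str)
        if is_settings and is_literal and key.value not in declared:
            found.append(key.value)
    return found


# The sample is only parsed, never executed, so `settings` need not exist.
sample = '''
timeout = getattr(settings, "HTTP_TIMEOUT", 30)
url = getattr(settings, "API_URL", None)
'''
violations = undeclared_getattr_keys(sample, declared={"API_URL"})
```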

CFG-002 · WARN · convention · checkable

Every key in .env.example must be declared in Settings

Any key in .env.example that is not a declared field on Settings creates a misleading contract. Exception: keys consumed by external tooling (RAILWAY_*, NODE_ENV) may be listed with a comment indicating they are not read by the app.
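
A sketch of how the comparison could be automated. The helper names are hypothetical, the declared field names are assumed to be passed in upper case, and the exception list contains only the two examples named in the rule.

```python
# Assumption: taken from the exception named in the rule, not an exhaustive list.
EXTERNAL_PREFIXES = ("RAILWAY_",)
EXTERNAL_KEYS = {"NODE_ENV"}


def env_example_keys(text: str) -> list[str]:
    """Extract KEY names from KEY=value lines, skipping blanks and comments."""
    keys = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        keys.append(line.split("=", 1)[0].strip())
    return keys


def undeclared_env_keys(text: str, declared: set[str]) -> list[str]:
    """Keys in .env.example that Settings does not declare, minus external-tooling keys."""
    return [
        key
        for key in env_example_keys(text)
        if key not in declared
        and key not in EXTERNAL_KEYS
        and not key.startswith(EXTERNAL_PREFIXES)
    ]
```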

Testing · 14 rules
TEST-000 · INFO · requirement

Testing pyramid — right ratio, not maximum coverage

The goal is the right ratio of test types, not maximum line coverage.
- Unit tests (many): pure functions in isolation, no I/O, no external calls. Fast.
- Integration tests (some): cog interaction with mocked external dependencies. One per major external integration.
- E2E tests (few): happy path only, run nightly rather than on every commit. One per pipeline covering the full flow.

TEST-001 · WARN · requirement · checkable

Normalization test required per cog

A test that verifies the normalization/cleaning logic on representative input. Covers the core transformation the cog exists to do.

TEST-002 · WARN · requirement · checkable

Deduplication test required per cog

A test that verifies duplicate inputs are correctly identified and handled — not silently overwritten.

TEST-003 · ERROR · requirement · checkable

Failure path test required per cog

A test that verifies the cog handles a bad input correctly — logs the error, marks the file, and continues rather than aborting.

TEST-004 · WARN · requirement · checkable

Output shape test required per cog

A test that verifies the output (JSON schema, DB row, or file structure) matches the expected contract.

TEST-005 · WARN · requirement · checkable

pytest as test runner

pytest is the test runner for all Python projects. Configured in pyproject.toml.

TEST-006 · WARN · requirement · checkable

pytest-cov for coverage in CI

Coverage measured on every CI run. Report in terminal. Threshold not enforced by number — enforced by critical path coverage (TEST-001 through TEST-004).

TEST-007 · ERROR · requirement

respx/httpx for HTTP mocking — no real external calls

respx is used to mock httpx calls in integration tests. No real external calls to Drive, Sheets, or any external API are made in unit or integration tests.

TEST-012 · WARN · requirement · checkable

mypy must run in CI if [tool.mypy] is declared

If a repo declares [tool.mypy] in pyproject.toml, a CI step invoking "uv run mypy src/" (or equivalent) is required. A mypy config that is never run in CI gives false assurance and drifts silently. Exception: mypy may be omitted during initial setup if repo is in a documented "typing: in progress" state. This exception must not persist past first stable release.

TEST-008 · WARN · requirement

FastAPI TestClient for API tests

FastAPI's TestClient (built on httpx) is used for all API endpoint tests. Tests run without a live server.

TEST-009 · ERROR · requirement

Database fixtures — no production data in tests

Tests use a separate test database or in-memory SQLite with transaction rollback. Never run against the production database.
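
A minimal sketch of the in-memory SQLite option with rollback, using the standard sqlite3 module. The tracks table and helper names are hypothetical; a real suite would normally wrap this in a pytest fixture.

```python
import sqlite3


def make_test_db() -> sqlite3.Connection:
    """Create an in-memory database with a committed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tracks (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
    conn.commit()
    return conn


def count_tracks(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM tracks").fetchone()[0]


conn = make_test_db()
try:
    # The insert opens a transaction implicitly; nothing is committed.
    conn.execute("INSERT INTO tracks (title) VALUES (?)", ("Test Track",))
    rows_during_test = count_tracks(conn)
finally:
    # Discard everything the test wrote so the next test starts clean.
    conn.rollback()
rows_after_test = count_tracks(conn)
```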

TEST-010 · ERROR · requirement · checkable

Contract test for every API endpoint

Every API endpoint has a dedicated test asserting the response envelope shape: { data, meta } on success, { error: { code, message } } on failure. One contract test per endpoint minimum. Contract tests run in CI on every push.
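
The envelope assertions in a contract test could be factored into helpers like the following sketch. The function names are hypothetical, and treating a success body that also carries an error key as invalid is an assumption.

```python
from typing import Any


def is_success_envelope(body: Any) -> bool:
    """True if body looks like { data, meta } with no error key."""
    return (
        isinstance(body, dict)
        and "data" in body
        and isinstance(body.get("meta"), dict)
        and "error" not in body
    )


def is_error_envelope(body: Any) -> bool:
    """True if body looks like { error: { code, message } }."""
    if not isinstance(body, dict):
        return False
    error = body.get("error")
    return isinstance(error, dict) and {"code", "message"}.issubset(error)
```

A contract test would then call the endpoint through the test client and assert is_success_envelope(response.json()) on the success path and is_error_envelope(...) on the failure path.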

TEST-011 · ERROR · requirement

Mock verification required

Every mock in a test must be verified with assert_called() or assert_called_once(). Tests that pass because a mock was never called are false positives.
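
A short illustration with unittest.mock. The function under test is hypothetical. The sketch uses assert_called_once_with and assert_not_called, which are stricter relatives of the two methods named in the rule.

```python
from unittest.mock import Mock


def notify_on_failure(failed_files: list[str], send_alert) -> None:
    """Call send_alert once with the failed files, but only if there are any."""
    if failed_files:
        send_alert(failed_files)


# Failure case: the mock must be called exactly once with the expected argument.
alert = Mock()
notify_on_failure(["FAILED_set.csv"], alert)
alert.assert_called_once_with(["FAILED_set.csv"])

# Clean-run case: asserting the absence of a call is also an explicit verification.
quiet_alert = Mock()
notify_on_failure([], quiet_alert)
quiet_alert.assert_not_called()
```

Without the assert lines, both cases would pass even if notify_on_failure never touched the mock.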

TEST-GAP-001 · INFO · gap · checkable

GAP: Most existing cogs have import-level tests only

Most existing cogs currently have import-level tests only. This is a known gap with active remediation underway. Remediation order: deejay-cog first (new platform foundation), then by activity level. When a cog is remediated, this gap entry is closed and the cog is marked as compliant with TEST-001 through TEST-004.

Documentation · 13 rules
DOC-001 · ERROR · requirement · checkable

README.md is mandatory

Every repo has a README. It describes: purpose (one paragraph), inputs, outputs, environment variables, how to run locally, how to run tests, and versioning policy.

DOC-002 · WARN · requirement

README describes inputs and outputs

For processors: what files are expected, where they come from, what is produced and where it goes. For APIs: what endpoints exist and what they serve.

DOC-003 · WARN · requirement · checkable

CHANGELOG.md required

Tracks meaningful changes per version. Not every commit — only changes that affect behavior, interface, or configuration. Managed by semantic-release. Never edited manually.

DOC-004 · WARN · requirement · checkable

.env.example is current

Every environment variable used by the service is documented in .env.example with a description and example value. Kept current — not a one-time artifact.

DOC-005 · WARN · requirement

Design decisions captured

When a significant architectural decision is made, the rationale is written down — in docs/DESIGN.md, a CHANGELOG entry, or a README note. Not just what was decided but why.

DOC-013 · WARN · requirement · checkable

README "Running locally" section is complete

Every repo's README includes a "Running locally" section that covers:
1. Prerequisites: Python version (≥3.11) and uv.
2. Install: `uv sync --all-extras`.
3. Pre-commit: `uv run pre-commit install` (run once after cloning) and `uv run pre-commit run --all-files` (run manually at any time).
4. Run: the exact command(s) to execute the service or scripts.
5. Test: `uv run pytest` and the coverage variant.

The section must be kept current. Copy-paste from the README must work on a clean clone with no prior knowledge of the repo.

DOC-006 · WARN · requirement · checkable

Docstrings on all public functions and classes

All public functions and classes have docstrings. One sentence minimum describing what the function does, not how.

DOC-007 · WARN · requirement

Pydantic field descriptions required

Pydantic model fields use the description parameter. Models are self-documenting — no separate documentation needed for data shapes.

DOC-008 · WARN · requirement · checkable

No dead code

Commented-out code is removed, not left in place. Version control is the history — the codebase is the present.

DOC-009 · WARN · requirement · checkable

Split package identity documented at entry point

If a Python library's install name (pyproject.toml [project] name) differs from its import namespace (src/ directory name), both names and their relationship must be documented in __init__.py docstring. README must show both correct install snippet and correct import path.

DOC-010 · ERROR · requirement · checkable

OpenAPI docs are first-class

FastAPI's /docs is a deliverable, not a side effect. Every endpoint must be complete and accurate before the service is considered done. All endpoints have summary, description, and response_model defined.

DOC-011 · WARN · requirement

Public endpoints documented as intentional

Public (unauthenticated) endpoints include a note in their description confirming they are intentionally public.

DOC-012 · ERROR · requirement · checkable

Standards document is versioned

Every update to the standards repo increments the version in index.yaml. The AI evaluator always references a specific version. Evaluations without a standards version reference are invalid.

API · 10 rules
API-001 · ERROR · requirement

All new API services on Railway

All new API services deployed on Railway regardless of language. Python services use FastAPI. TypeScript services use Hono on Node. Railway is the single hosting standard. Cloudflare Workers deprecated for new API work.

API-002 · WARN · requirement

PostgreSQL as data store for new services

All new services use PostgreSQL deployed on Railway. Not D1 or SQLite.

API-003 · WARN · requirement

ORM required — SQLAlchemy (Python) or Drizzle (TypeScript)

Python: SQLAlchemy async with asyncpg driver, Pydantic-compatible models, Alembic for migrations. TypeScript: Drizzle ORM with postgres.js driver, TypeScript-native schema definitions, Drizzle Kit for migrations. No raw SQL without ORM.

API-004 · ERROR · requirement · checkable

Versioned routes — /v1/<resource>

All routes versioned from day one: /v1/<resource>. No unversioned routes in production.

API-005 · ERROR · requirement · checkable

Response envelope on all endpoints

All responses wrapped: { data: ..., meta: { count, version } }. Errors: { error: { code, message } }. No bare arrays or inconsistent shapes.

API-006 · WARN · requirement

owner_id on all tables

Every table includes owner_id (Clerk user ID). Not enforced as FK today but present for future multi-tenant use.

API-007 · WARN · requirement

Clerk for authentication

Authentication via Clerk across all services. Consistent across the ecosystem.

API-008 · ERROR · requirement

Public endpoints explicit and intentional

Public (unauthenticated) endpoints are explicitly documented and intentional — not accidental.

AUTH-001 · ERROR · requirement · checkable

No unverified write endpoints reachable from the public internet

Any FastAPI service with write endpoints (POST, PATCH, DELETE) that depend on an unverified header (X-Owner-Id or equivalent) must either be provably isolated on a private network (Railway private networking, no public port), OR have CLERK_AUTH_ENABLED=true and RS256 JWT verification active. A module-level docstring in auth.py documenting the current posture and upgrade path is required regardless of which condition is satisfied.

AUTH-002 · ERROR · requirement · checkable

API auth scheme and HTTP client must match

When kaianolevine-api's auth scheme changes, CommonPythonApiClient._headers() in common-python-utils must change in the same release. A mismatch causes all pipeline writes to return 401. Auth mechanism used by the API and sent by its client must be documented in the same location (auth.py cross-referencing common-python-utils and vice versa).

Pipeline · 9 rules
PIPE-001 · WARN · requirement

Prefect for new pipelines

New event-driven pipelines use Prefect. Python-native, built-in observability, retry logic, run history. GitHub Actions is for CI/CD only — not pipeline orchestration.

PIPE-002 · ERROR · requirement

Idempotent pipeline steps

Every pipeline step can be safely re-run without side effects. Re-running a step produces the same result as running it once.
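
As an illustration, a step that overwrites a deterministically named output is idempotent, whereas one that appends to it is not. The function and file names below are hypothetical.

```python
import json
from pathlib import Path


def write_summary(records: list[dict], out_dir: Path) -> Path:
    """Write a summary file whose name and content depend only on the input."""
    out_path = out_dir / "summary.json"
    # Sort records and keys so input order does not change the output.
    payload = json.dumps(sorted(records, key=lambda r: r["id"]), sort_keys=True)
    # Overwrite rather than append, so a re-run replaces the file with identical content.
    out_path.write_text(payload, encoding="utf-8")
    return out_path
```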

PIPE-003 · WARN · requirement

Separate process and collect steps

Processing new inputs and rebuilding the collection are separate pipeline steps — not combined. Allows independent retry.

PIPE-004 · ERROR · requirement

Concurrency groups required for shared resources

Pipelines writing to shared resources declare a concurrency group. Data pipelines set cancel-in-progress: false so that a newer run does not cancel one already in progress.

PIPE-005 · ERROR · requirement

Archive, never delete raw inputs

Raw inputs are archived after processing, never deleted. An archive subfolder is used.

PIPE-006 · WARN · requirement · checkable

Dual logger pattern in Prefect flows

Flow functions use get_run_logger() for Prefect-visible logs, with a fallback to the standard logger for local runs:

    try:
        logger = get_run_logger()
    except Exception:
        logger = log

PIPE-007 · WARN · requirement

Retry logic on external API calls

Tasks that call external APIs use retries=2, retry_delay_seconds=30. Tasks calling Google APIs inherit common-python-utils retry logic.

PIPE-008 · WARN · requirement

watcher-cog as canonical Drive trigger

The canonical Drive-event trigger pattern is:
1. A file appears in a watched Drive folder.
2. watcher-cog detects the change (1-minute poll via the Drive API).
3. watcher-cog calls the Prefect API to create a flow run.
4. Prefect executes the cog flow.

watcher-cog is an always-on Railway worker service. It is config-driven: adding a new folder-to-cog mapping requires one WatcherConfig entry and no code changes. The previous pattern (Apps Script → repository_dispatch → GitHub Actions) is retired. google-app-script-trigger is archived.

PIPE-009 · WARN · requirement

AI evaluation step as final pipeline task

All production pipelines include an AI evaluation step as the final task. The step assesses run conformance against the current standards version and writes findings to the pipeline_evaluations table.

Frontend · 10 rules
FE-001 · WARN · requirement

Astro for all static sites

All portfolio, technical, and community sites use Astro. Static-first, component islands for interactivity. If the primary product is content delivery, use Astro. If it is an app with persistent UI state, use React.

FE-002 · ERROR · requirement

Vite + React + TypeScript for web apps

Standard stack for all React web apps. Vite for bundling, React 19, TypeScript throughout. No exceptions.

FE-003 · WARN · requirement

Tailwind CSS for styling

Tailwind is the styling primitive for all React apps and Astro sites. No CSS modules, no styled-components, no other CSS framework.

FE-004 · WARN · requirement

shadcn/ui for components

shadcn/ui is the component library standard. Components are copied into src/components/ui/ — not installed as a dependency. Built on Radix UI primitives and Tailwind.

FE-005 · WARN · requirement

React Hook Form + Zod for forms and validation

All forms use React Hook Form. All validation schemas use Zod. These are assumed by shadcn/ui form components and are the ecosystem standard.

FE-006 · ERROR · requirement

Graceful degradation — build succeeds without API

Build succeeds with empty or unavailable API data. Site does not break if API is down.

FE-007 · ERROR · requirement · checkable

No hardcoded API URLs

No hardcoded API URLs. PUBLIC_API_URL (Astro) or VITE_API_URL (React) env var used throughout.

FE-008 · WARN · requirement

Pinned starter version

Astro sites pin their starter version. Upstream preserved in-repo for reference. Do not blindly upgrade starter versions — pin and document the version in use.

FE-009 · WARN · requirement

Build-time data for static content

Infrequently-changing data (collections, summaries) fetched at build time via Astro data files. Do not make runtime API calls for content that could be fetched at build time.

FE-010 · INFO · requirement

Runtime queries for interactive demos only

Live API queries reserved for search, filtering, and interactive demo surfaces. Implemented via Astro component islands. Document which endpoints are called at runtime vs build time.

Delivery · 20 rules
CD-001 · WARN · requirement

Feature flags for major functionality toggles

Feature flags are used to decouple deployment from release. They gate major sections of functionality, not minor implementation details.

The canonical use cases are:
- Kill switches: disable a risky integration (e.g. Anthropic API calls in evaluator-cog) without a redeploy.
- Readiness gates: the API ships a new endpoint; the frontend UI is hidden until the flag is enabled. This allows frontend code to be merged and deployed independently of API readiness.
- Maintenance mode: return 503 gracefully during migrations without touching code.

Flags are stored in the feature_flags table in api-kaianolevine-com's PostgreSQL database and served via GET /v1/feature-flags. The public read endpoint is unauthenticated. Write endpoints (POST, PATCH, DELETE /v1/feature-flags/:key) require Clerk JWT auth with admin role.

Flag naming convention: <service>.<feature>

Examples:
- evaluator_cog.llm_soft_rules
- evaluator_cog.conformance_check
- watcher_cog.drive_polling
- api.maintenance_mode

Flag anatomy. Each flag must have:
- key: string identifier
- enabled: boolean
- description: why it exists and when it should be deleted
- permanent: boolean; true for infrastructure flags (maintenance_mode), false for rollout/readiness flags (which must be deleted post-rollout)

Lifecycle contract:
1. Ship code with the flag check.
2. Deploy, then activate when ready by flipping the flag via the admin panel.
3. For non-permanent flags: ship a follow-up PR removing the flag check once the feature is fully rolled out and stable.
4. Delete the DB row after the cleanup PR merges.

Include a comment in code pointing to the flag key so it is easy to find:

    # feature flag: evaluator_cog.llm_soft_rules

Flags are NOT used for:
- Fine-grained A/B testing or percentage rollouts
- Per-user targeting
- Replacing Prefect flow configuration
- Gating minor implementation details

Clients check flags at runtime (not build time) via a short-TTL in-memory cache (30–60 seconds per process) to avoid hitting the API on every request. Astro static sites fetch flags client-side on load. Fail-open or fail-closed behavior must be intentional and documented per flag.
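
A sketch of the short-TTL client cache described above. The FlagCache class is hypothetical, and the fetch callable is assumed to wrap GET /v1/feature-flags and return a mapping of flag key to enabled state; the real response shape is not specified here. The default argument is where the documented fail-open or fail-closed choice is made.

```python
import time
from typing import Callable


class FlagCache:
    """In-memory flag cache with a short TTL and an explicit failure default."""

    def __init__(
        self,
        fetch: Callable[[], dict[str, bool]],
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._flags: dict[str, bool] = {}
        self._expires_at = float("-inf")

    def is_enabled(self, key: str, default: bool) -> bool:
        """Return the flag state, or `default` when the flag is unknown."""
        now = self._clock()
        if now >= self._expires_at:
            # Set the expiry first so a failing API is retried once per TTL, not per call.
            self._expires_at = now + self._ttl
            try:
                self._flags = self._fetch()
            except Exception:
                pass  # keep stale values; unknown keys fall back to `default`
        return self._flags.get(key, default)
```

Passing default=False makes a flag fail closed when the API is unreachable and nothing is cached; default=True makes it fail open.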

CD-002 · WARN · requirement · checkable

Sentry for error tracking — all production services

All production services integrate Sentry for unhandled exception tracking. This includes FastAPI services, Hono services, and Python cogs (always-on worker services). Free tier is sufficient. Sentry is initialised at service entry point before any application logic runs. SENTRY_DSN is set as an environment variable — never hardcoded. The same Sentry project may be shared across related services or a dedicated project used per service. Sentry covers Layer 3 observability (unhandled exceptions). It does not replace structured logging (Layer 2) or liveness monitoring (Layer 1).

CD-003 · WARN · requirement · checkable

Structured logging via shared library

All Python services use the common-python-utils logger. All TypeScript services use the kaiano-ts-utils logger. Never use print() in production code paths. Log output is JSON-formatted in production.

Standard log event shape (all services):
- timestamp: ISO 8601
- service: repo name (e.g. watcher-cog, api-kaianolevine-com)
- level: DEBUG | INFO | WARN | ERROR
- category: infra | pipeline | data | api
- event: snake_case event name (e.g. trigger_fired, file_processed)
- context: key-value pairs specific to the event

Category definitions:
- infra: service lifecycle, triggers fired/not fired, heartbeats, Drive poll results
- pipeline: file processed/skipped/failed, pipeline completed/failed, Prefect flow run created
- data: data quality issues, schema violations, evaluation findings, duplicate detection
- api: HTTP errors (4xx/5xx), slow responses (>2s), external API failures, auth failures

Emoji prefixes are used for human-readable local output only. Structured JSON is the production format.

CD-004 · ERROR · requirement · checkable

GitHub Actions version tags must be valid

All "uses: owner/repo@vN" references in .github/workflows/ must reference version tags that exist in the action's release history. Invalid version tags cause CI to fail silently. Current pinned versions (March 2026): actions/checkout@v6, actions/setup-node@v6, astral-sh/setup-uv@v7. Verify the tag exists before committing any version pin.
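
The first half of such a check, collecting every pinned reference from a workflow file, might be sketched as below. The regex and function name are assumptions; confirming that each tag exists in the action's release history needs a lookup against the hosting platform and is deliberately left out.

```python
import re

# Matches "uses: owner/repo@ref" with or without a leading list dash.
# Local actions (./path) and docker:// references have no owner/repo@ref form and are skipped.
USES_REF = re.compile(r"^\s*-?\s*uses:\s*([\w.-]+/[\w./-]+)@(\S+)", re.MULTILINE)


def action_refs(workflow_text: str) -> list[tuple[str, str]]:
    """Return (action, ref) pairs for every pinned action in a workflow file."""
    return USES_REF.findall(workflow_text)


workflow = """
jobs:
  ci:
    steps:
      - uses: actions/checkout@v6
        with:
          fetch-depth: 0
      - name: Set up uv
        uses: astral-sh/setup-uv@v7
"""
refs = action_refs(workflow)
```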

VER-001 · WARN · requirement

Conventional Commits format

All commit messages follow: type: description. Types: feat, fix, docs, refactor, chore, test, ci.
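
A sketch of a subject-line check for this format. The regex is an assumption: it accepts an optional (scope), as in the chore(release) commits produced by semantic-release, and rejects the feat!: shorthand.

```python
import re

ALLOWED_TYPES = ("feat", "fix", "docs", "refactor", "chore", "test", "ci")

# type, optional (scope), then ": " and a non-empty description.
COMMIT_SUBJECT = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r": (?P<description>\S.*)$"
)


def is_conventional(subject: str) -> bool:
    """Check the first line of a commit message against the allowed types."""
    return COMMIT_SUBJECT.match(subject) is not None
```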

VER-002 · WARN · requirement

BREAKING CHANGE footer for major bumps

Use an explicit BREAKING CHANGE footer in the commit body for major version bumps. The feat!: shorthand is unreliable. Correct pattern using two -m flags:

    git commit -m 'feat: description' -m 'BREAKING CHANGE: explanation'

VER-003 · ERROR · requirement · checkable

semantic-release on all repos

Every repo has .releaserc.json and a release job in ci.yml. On merge to main: tests run, semantic-release reads commits, determines bump, updates version file, updates CHANGELOG.md, creates git tag, creates GitHub Release.

VER-004 · ERROR · requirement

Never manually edit version files or CHANGELOG

Never manually edit version in pyproject.toml or package.json. Never manually edit CHANGELOG.md. Both are owned by semantic-release.

VER-005 · ERROR · requirement · checkable

fetch-depth: 0 on CI checkout

All CI jobs that run semantic-release must use fetch-depth: 0 on the checkout step. Without this, semantic-release cannot read the full git history and will not release correctly.

VER-006 · ERROR · requirement · checkable

Plugins installed explicitly before semantic-release

semantic-release plugins are installed via npm install --no-save in the release job before running npx semantic-release. Not via package.json. Required because npx only installs the core package.

CD-005 · WARN · requirement

Prefect Cloud for pipeline observability

All pipeline flows connect to Prefect Cloud (free tier). Run history, step logs, and flow state are visible at app.prefect.cloud. GitHub Actions logs are for deep debugging only — not the primary observability surface.

CD-006 · WARN · requirement

GitHub Actions is CI/CD only — not a trigger relay

GitHub Actions handles lint, tests, deploys, and semantic-release. It does not trigger pipeline flows. Drive-event triggers go through watcher-cog. Scheduled and manual triggers go through Prefect deployments. repository_dispatch as a trigger relay for cogs is a retired pattern.

CD-007 · WARN · requirement · checkable

Healthchecks.io for always-on worker services

Always-on Railway worker services (services with no HTTP port that run continuously) integrate Healthchecks.io as a dead man's switch. The service pings HEALTHCHECKS_URL on every work cycle. If the ping goes silent beyond the grace period, Healthchecks.io sends an email alert. Recommended settings: period 1 minute, grace 5 minutes. HEALTHCHECKS_URL is set as an environment variable — never hardcoded. Free tier is sufficient.

CD-008 · WARN · requirement

Log levels used consistently and intentionally

Log levels follow a strict contract across all services:
- DEBUG: internal state useful for local debugging only. Never emitted in production by default.
- INFO: meaningful lifecycle events, such as service started, file processed, trigger fired, pipeline completed. Reviewable weekly. Default production level.
- WARN: something unexpected happened but the service recovered. Skipped files, retried calls, degraded behaviour. Reviewed promptly.
- ERROR: something failed and requires attention. Unhandled exceptions, failed triggers, data integrity violations. Reviewed immediately.

Never use ERROR for expected failure modes (e.g. file not found when that is a valid outcome). Never use INFO for high-frequency noise that would obscure meaningful events.

CD-009 · WARN · requirement · checkable

Structured log event shape — all services

All meaningful log events emitted by production services include these fields:
- timestamp: ISO 8601
- service: repo name (e.g. watcher-cog, api-kaianolevine-com)
- level: DEBUG | INFO | WARN | ERROR
- category: infra | pipeline | data | api
- event: snake_case event name (e.g. trigger_fired, file_processed)
- context: key-value pairs specific to the event

Category definitions:
- infra: service lifecycle, triggers fired/not fired, heartbeats, Drive poll results
- pipeline: file processed/skipped/failed, pipeline completed/failed, Prefect flow run created
- data: data quality issues, schema violations, evaluation findings, duplicate detection
- api: HTTP errors (4xx/5xx), slow responses (>2s), external API failures, auth failures

Log output is JSON in production and human-readable with emoji prefixes in local development. The shared library (common-python-utils for Python, kaiano-ts-utils for TypeScript) provides the logger; services do not configure logging inline.
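
To make the event shape concrete, the sketch below builds one JSON line with the six standard fields. It only illustrates the shape: build_event is a hypothetical helper, not the shared-library logger, and services are still expected to log through common-python-utils.

```python
import json
from datetime import datetime, timezone

LEVELS = {"DEBUG", "INFO", "WARN", "ERROR"}
CATEGORIES = {"infra", "pipeline", "data", "api"}


def build_event(service: str, level: str, category: str, event: str, **context) -> str:
    """Serialise one log event with the standard fields as a JSON line."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    return json.dumps(
        {
            "timestamp": datetime.now(timezone.utc).isoformat(),  # ISO 8601, UTC
            "service": service,
            "level": level,
            "category": category,
            "event": event,
            "context": context,
        }
    )


line = build_event("watcher-cog", "INFO", "infra", "trigger_fired", folder_id="abc123")
```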

CD-010 · ERROR · requirement · checkable

Three-layer observability stack — all production services

Every production service implements all three observability layers:
- Layer 1, Liveness: Healthchecks.io for always-on workers. Railway auto-restart for all services.
- Layer 2, Structured logs: common-python-utils or kaiano-ts-utils logger emitting JSON events with the standard shape. Queryable in the Railway log viewer or Better Stack.
- Layer 3, Exceptions: Sentry capturing all unhandled exceptions with full stack trace and context.

These three layers are non-overlapping and non-redundant:
- Layer 1 catches: process died, service unreachable
- Layer 2 catches: business logic errors, skipped work, slow operations
- Layer 3 catches: unhandled exceptions and crashes

A service missing any layer has a blind spot in production. Pipelines additionally require Prefect run history as a fourth layer covering orchestration-level observability.

CD-011 · ERROR · requirement · checkable

Doppler as canonical secret store

Doppler is the single source of truth for all infrastructure secrets across the MiniAppPolis ecosystem. This includes API keys, database URLs, service tokens, internal API keys, Sentry DSNs, Healthchecks URLs, and Clerk credentials.

Doppler syncs automatically to:
- Railway: all API services and cogs receive secrets via the Doppler → Railway native sync
- GitHub Actions: CI secrets synced via the Doppler → GitHub integration
- Cloudflare Pages: secrets synced via the Doppler CLI in CI/CD

Secret management workflow:
- All secret changes are made in Doppler only
- Doppler pushes changes to downstream platforms automatically
- Never manually set secrets in Railway, GitHub, or Cloudflare if Doppler is managing that service
- .env.example lists all required secret keys with no values; this file is the human-readable contract of what a service needs

Doppler project structure: one Doppler project per service or cog. Environments: development, staging, production (minimum).

The only known carve-out is Prefect Cloud. Prefect uses its own encrypted Blocks for flow-level secrets. Prefect Blocks are managed directly in Prefect Cloud and are not synced from Doppler. When a secret used in a Prefect Block is rotated, Prefect must be updated manually. This is a known limitation; document it per-secret in Doppler's description field.

CD-012 · WARN · requirement

Internal service-to-service auth via per-caller API keys

Internal calls between Railway services and cogs use per-caller API keys passed via the X-Internal-API-Key header. Each caller (each cog or service making internal API calls) has its own key, not a single shared secret across all callers.

Per-caller keys allow:
- Independent rotation without touching all callers
- Caller identity logging on every internal request
- Revocation of a single caller's access without affecting others

All internal API keys are stored in Doppler and injected at runtime. The receiving service (api-kaianolevine-com) validates the key and logs the caller identity on every internal request.

common-python-utils provides get_internal_headers(), which returns the correct header dict for outgoing internal requests. All Python cogs use this utility and never construct the header inline.

Clerk JWT auth is for user-facing requests only. Internal service-to-service calls never use Clerk JWTs.
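
On the receiving side, per-caller validation could look roughly like the sketch below. The registry contents and the identify_caller name are hypothetical, the keys are hardcoded only for illustration (the rule has them injected from Doppler at runtime), and the header lookup assumes the dict preserves the exact header casing.

```python
import hmac
from typing import Optional

# Hypothetical caller registry; real keys would be injected at runtime, never hardcoded.
CALLER_KEYS = {
    "watcher-cog": "key-for-watcher",
    "evaluator-cog": "key-for-evaluator",
}


def identify_caller(
    headers: dict[str, str], caller_keys: dict[str, str] = CALLER_KEYS
) -> Optional[str]:
    """Return the caller name for a valid X-Internal-API-Key header, else None."""
    presented = headers.get("X-Internal-API-Key", "")
    for caller, key in caller_keys.items():
        # compare_digest avoids leaking key contents through timing differences.
        if hmac.compare_digest(presented.encode(), key.encode()):
            return caller
    return None
```

Because each key maps to exactly one caller, the returned name can be logged with every internal request, and removing one registry entry revokes that caller alone.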

CD-013 · WARN · requirement

Feature flag admin panel in kaianolevine.com

The admin control panel for feature flags lives at an /admin route in the kaianolevine.com software portfolio site. It is gated behind Clerk authentication with admin role verification.

Capabilities:
- View all flags and their current state
- Toggle flags on/off
- Create new flags (key, description, permanent boolean)
- Delete flags (permanent flags require confirmation)

The admin panel calls authenticated write endpoints on api-kaianolevine-com. The public GET /v1/feature-flags endpoint remains unauthenticated for use by Astro sites and cogs. Each site (kaianolevine.com, wcs.kaianolevine.com, deejaytools.com) has its own Clerk application and independent user pool. Admin access to the flag panel is scoped to the kaianolevine.com Clerk app only.

VER-007 · INFO · convention

Pull before branching

Always run git pull origin main before creating a new feature branch. Prevents merge conflicts with semantic-release commits (chore(release): X.Y.Z [skip ci]) that land on main after every release.

Evaluation · 7 rules
EVAL-001 · WARN · requirement

Evaluation findings stored, not ephemeral

Evaluation results are written to the database alongside pipeline outputs. They are reviewable, queryable, and feed the portfolio surface. Results that are only logged are not acceptable for production pipelines.

EVAL-002 · ERROR · requirement · checkable

Every evaluation references a standards version

Every evaluation call includes a reference to the current version of the standards document being evaluated against. Without the rubric, evaluation is undefined.

EVAL-003 · WARN · requirement

Findings are specific and actionable

Findings reference a specific standard by ID, describe what was observed, and suggest a concrete remediation. Vague findings are not acceptable.

EVAL-004 · INFO · requirement

Structural conformance dimension

Evaluation dimension: does the repo/workflow follow the patterns in the standards? Covers src layout, naming conventions, error handling, tooling choices.

EVAL-005 · INFO · requirement

Pipeline consistency dimension

Evaluation dimension: did this run behave the same way as previous runs? Covers unexpected timing changes, new error types, missing steps.

EVAL-006 · INFO · requirement · checkable

Standards currency dimension

Evaluation dimension: has this repo been evaluated against the current version of the standards document? Is it behind? Flag if the last evaluation used a standards version more than one minor version behind current.
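
The "more than one minor version behind" test can be sketched as follows. The function names are hypothetical, versions are assumed to be plain major.minor.patch strings with an optional leading v, and treating any older major version as stale is an assumption the rule does not spell out.

```python
def parse_version(text: str) -> tuple[int, int, int]:
    """Parse 'v1.6.0' or '1.6.0' into (major, minor, patch)."""
    major, minor, patch = text.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def is_stale(last_evaluated: str, current: str) -> bool:
    """True if the last evaluation is more than one minor version behind current."""
    last, now = parse_version(last_evaluated), parse_version(current)
    if last[0] != now[0]:
        # Assumption: an evaluation against any older major version counts as stale.
        return last[0] < now[0]
    return now[1] - last[1] > 1
```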

EVAL-GAP-001 · INFO · gap

Deterministic conformance checks — partial coverage

The conformance flow in evaluator-cog implements ~12 deterministic checks covering file presence, pyproject.toml, CI YAML, AST scanning, and test structure. Approximately 39 of the 51 checkable rules are not yet covered deterministically and fall to LLM assessment only. Remediation: backfill deterministic checks one domain at a time, starting with delivery.yaml (CD-004 action version pinning) and documentation.yaml (DOC-013 README completeness).

Cross-Stack · 2 rules
XSTACK-001 · ERROR · requirement · checkable

Use shared library — do not reimplement shared behaviors

Python services declare common-python-utils as a dependency. TypeScript services declare kaiano-ts-utils as a dependency. Neither stack reimplements logging, auth verification, or response helpers provided by the shared library.

XSTACK-002 · ERROR · requirement

Cross-stack response shape parity

API response shapes are identical across Python and TypeScript services:
- success: { data: ..., meta: { count, version } }
- error: { error: { code, message } }

Defined in kaiano-ts-utils for TypeScript. Enforced by response_model in FastAPI.