diff --git a/.claude-sync.conf b/.claude-sync.conf
index c5e007ee..bc39a1f0 100644
--- a/.claude-sync.conf
+++ b/.claude-sync.conf
@@ -16,6 +16,7 @@ dev-docs/ARCH-API-VALIDATION.md
 dev-docs/RFC-WS-7-OBSERVABILITY.md
 dev-docs/GLITCHTIP.md
 dev-docs/ARCH-OBSERVABILITY.md
+dev-docs/ARCH-TESTING.md
 dev-docs/runbooks/observability-triage.md
 dev-docs/runbooks/observability-erasure.md
 dev-docs/RFC-WS-6.md
diff --git a/CLAUDE.md b/CLAUDE.md
index 01103de4..4c68aec1 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -160,6 +160,10 @@ PR as the migration.
 - Use factories for all test data
 - After each module: `php artisan test --filter=ModuleName`
 
+Frontend test architecture (5 tiers: Unit / Component / Integration /
+Visual / E2E) is documented in `dev-docs/ARCH-TESTING.md`. Choose the
+right tier per the decision tree there before adding new tests.
+
 ## Frontend rules (strict)
 
 ### Vuexy reference source (mandatory)
diff --git a/dev-docs/ARCH-TESTING.md b/dev-docs/ARCH-TESTING.md
new file mode 100644
index 00000000..97feaf2b
--- /dev/null
+++ b/dev-docs/ARCH-TESTING.md
@@ -0,0 +1,341 @@
+# Crewli — Test Architecture
+
+> Authoritative reference for test-tier choices in the SPA. Read this
+> before adding a new test. Linked from `CLAUDE.md`.
+
+This document describes:
+
+1. The test pyramid Crewli uses, and what each tier is for
+2. When to use which tier (decision tree)
+3. Mock-vs-real-backend rules
+4. Visual baseline workflow
+5. CI integration status
+6. Conventions and anti-patterns
+7. Vuetify-during-PrimeVue-migration: the temporary state in test infra
+8. Host setup requirements
+9. Deferred work (BACKLOG references)
+
+---
+
+## 1. Test pyramid and scope per layer
+
+Crewli runs five test tiers in the SPA. Each has a narrow purpose;
+overlap is wasted work, gaps are silent risk. Pick the tier whose
+purpose matches what you're actually verifying.
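The "pick by purpose" rule can be sketched as a small function that asks the questions of the section 2 decision tree in order. This is only an illustration: the `TestSubject` flags and `chooseTier` are hypothetical names, not part of the Crewli codebase.

```typescript
// Hypothetical sketch: mirrors the section 2 decision tree, not real Crewli code.
type Tier = "unit" | "component" | "integration" | "e2e" | "visual";

// Each flag answers one question from the decision tree (names are illustrative).
interface TestSubject {
  pureLogicNoDom: boolean;
  singleComponent: boolean;
  crossComponentNoBackend: boolean;
  spaBackendContract: boolean;
  visualFidelity: boolean;
}

// Returns the first tier whose question is answered "yes", in tree order.
// Returns undefined when no question matches.
function chooseTier(subject: TestSubject): Tier | undefined {
  if (subject.pureLogicNoDom) return "unit";
  if (subject.singleComponent) return "component";
  if (subject.crossComponentNoBackend) return "integration";
  if (subject.spaBackendContract) return "e2e";
  if (subject.visualFidelity) return "visual";
  return undefined;
}

// Example: a 409 optimistic-locking check is a frontend/backend contract.
const tier = chooseTier({
  pureLogicNoDom: false,
  singleComponent: false,
  crossComponentNoBackend: false,
  spaBackendContract: true,
  visualFidelity: false,
});
console.log(tier); // "e2e"
```

Speed is deliberately not an input here, in line with the "don't pick by speed" rule in section 2.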
+
+### Tier 1 — Unit (Vitest + happy-dom)
+
+**Run via:** `pnpm test` (filtered by `tests/unit/**`)
+**Environment:** Node + happy-dom, single module graph
+**Cost:** ~20 ms per test
+**For:** Pure logic, schema parsing, store reducers, isolated composable
+behaviour. No DOM. Fastest tier; safe for pre-commit if we ever add it.
+
+### Tier 2 — Component (Playwright Component Testing)
+
+**Run via:** `pnpm test:component`
+**Environment:** Real Chromium via `@playwright/experimental-ct-vue`
+**Cost:** ~300 ms per test (incl. Chromium reuse)
+**For:** Single-component verification. DOM rendering, click/keyboard,
+prop propagation, slot rendering, CSS resolution. Mocks API at axios
+layer. Provider stack (Vuetify [TEMP], Pinia, TanStack Query, Router) is
+wired in `apps/app/playwright/index.ts`'s `beforeMount` hook.
+
+### Tier 3 — Integration (Playwright CT, multi-component)
+
+**Run via:** Same `pnpm test:component` runner; placement convention
+distinguishes integration from single-component.
+**Cost:** ~500 ms per test
+**For:** Page-level mounting with mocked API responses. Tests
+cross-component coordination (drag from Wachtrij → canvas, popover
+→ mutation flow). Same provider stack as Tier 2.
+
+### Tier 4 — Visual regression (Playwright CT, `@visual` tag)
+
+**Run via:** `pnpm test:visual` (verify), `pnpm test:visual:update`
+(regenerate baselines)
+**Environment:** Real Chromium driving the canonical prototype HTML
+served by a tiny static-server fixture (`tests/playwright-ct/visual/
+static-server.mjs`).
+**Cost:** ~1.2 s per test
+**For:** Pixel baselines against canonical visual sources. The
+prototype HTML at `resources/Crewli - Artist Timetable Management/
+crewli-timetable.html` is the source of truth for Artist Management
+surfaces. F4 (component migration) extends visual coverage to live
+SPA components against the prototype.
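To give a feel for how strict a Tier 4 pixel baseline is, the arithmetic behind the `maxDiffPixelRatio: 0.001` tolerance (stated in section 4) can be sketched as follows. The 1280×720 screenshot size is an assumption for illustration and the helper names are hypothetical; Playwright performs the real comparison internally.

```typescript
// Hypothetical helpers illustrating the ratio arithmetic only;
// Playwright's own comparison logic is not reproduced here.

// Largest whole number of differing pixels that stays within the ratio.
function maxAllowedDiffPixels(width: number, height: number, ratio: number): number {
  return Math.floor(width * height * ratio);
}

// True when the share of differing pixels does not exceed the ratio.
function withinTolerance(
  diffPixels: number,
  width: number,
  height: number,
  ratio: number,
): boolean {
  return diffPixels / (width * height) <= ratio;
}

const RATIO = 0.001; // value quoted in section 4 (0.1%)
const WIDTH = 1280;  // assumed screenshot width
const HEIGHT = 720;  // assumed screenshot height

console.log(maxAllowedDiffPixels(WIDTH, HEIGHT, RATIO)); // 921
console.log(withinTolerance(921, WIDTH, HEIGHT, RATIO)); // true
console.log(withinTolerance(922, WIDTH, HEIGHT, RATIO)); // false
```

Under that assumed size, a changed patch of about 30×30 pixels would still pass, which is one reason to fix environmental noise before loosening the tolerance.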
+
+### Tier 5 — E2E (Playwright)
+
+**Run via:** `pnpm test:e2e`
+**Environment:** Real Laravel test server (`php artisan serve --port=
+8001`, DB `crewli_test`) + real Chromium browser context.
+**Cost:** ~5 s for the suite (includes migrate:fresh + seed)
+**For:** Contract verification end-to-end. Real network, real auth,
+real DB transactions. Currently only the 409-conflict optimistic-
+locking contract test (TEST-CONTRACT-001). Add tests sparingly — this
+is the most expensive tier.
+
+---
+
+## 2. When to use what — decision tree
+
+```
+Is the thing under test pure logic with no DOM?
+  └─ YES → Unit (Vitest + happy-dom)
+
+Is it a single component? (props, events, slots, CSS, keyboard)
+  └─ YES → Component (Playwright CT)
+
+Is it cross-component coordination, but no real backend?
+  └─ YES → Integration (Playwright CT)
+
+Is it a contract between SPA and backend (request/response shape)?
+  └─ YES → E2E (Playwright + Laravel)
+
+Is it visual fidelity to a canonical baseline?
+  └─ YES → Visual (Playwright CT, @visual tag)
+```
+
+**Don't pick by speed.** Pick by what you're verifying. A unit test
+that mocks the backend cannot catch a contract-drift bug; an e2e test
+for pure logic is wasted CI time.
+
+---
+
+## 3. Mock-vs-real-backend choice rules
+
+### Mock when
+
+- The test verifies SPA behaviour given a known response shape
+- Backend availability would slow the test below the relevant tier's
+  cost budget
+- The path under test is independent of transactional / auth
+  semantics
+
+### Real backend when
+
+- The test verifies the contract between frontend and backend (Zod
+  schema vs. PHP Resource shape)
+- Authentication or authorisation flows are involved
+- Optimistic-locking, idempotency, or other multi-request semantics
+  matter
+
+**Anti-pattern: matching mocks to schemas.** Don't mock with the same
+shape your Zod schema validates — that creates self-confirming bias
+where both sides agree but neither matches reality. This is the
+exact failure mode TEST-CONTRACT-001 was created to catch (timetable-
+stabilization B5).
+
+---
+
+## 4. Visual baseline workflow
+
+### Capturing baselines
+
+```bash
+pnpm test:visual:update
+```
+
+Reviews PNG diffs in PRs. Baselines live at:
+```
+apps/app/tests/playwright-ct/__screenshots__/visual//.png
+```
+Tracked via Git LFS (see `.gitattributes`). Pixel tolerance:
+`maxDiffPixelRatio: 0.001` (0.1%) per `playwright-ct.config.ts`.
+
+### Updating baselines (intentional UX change)
+
+1. Make the UX change (component edit, token edit, …)
+2. Run `pnpm test:visual:update` locally
+3. Review the diff PNG manually — does the new baseline match the
+   intended UX?
+4. Commit baseline + UX change in the **same PR**. Reviewer can
+   compare baseline change against the UX intent.
+5. Never update baselines to "make tests pass" without a UX-justified
+   reason in the PR description.
+
+### Updating baselines (unintentional diff in CI)
+
+1. Determine if the diff is environmental (font hinting, OS rendering,
+   timezone-based date formatting) or a real regression.
+2. Environmental → consider tightening determinism (lock fonts, fake
+   timers, fixed locale) before tweaking tolerance.
+3. Real regression → fix the regression, not the baseline.
+
+### Composite-over-isolated strategy (B3 baselines)
+
+Some surfaces enumerated in RFC §A.3's baseline list are captured as
+composite views rather than individual block-state baselines. Reason:
+the prototype's DOM exposes status only via inline `style.background`,
+no `data-*` attributes. Isolated locators (e.g. by artist name) lock
+the test to specific seed data and silently rot if data changes.
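The brittleness argument above can be made concrete with a small model. Everything in this sketch is hypothetical: `RenderedBlock`, the artist names, the colour value, and the `testId` field are invented for illustration (the prototype exposes no such identifier, which is exactly the problem described).

```typescript
// Hypothetical model of a timetable block as a test would see it.
interface RenderedBlock {
  artistName: string; // comes from seed data, so it can change
  background: string; // inline style: the only status signal in the prototype
  testId?: string;    // stable identifier, absent in the prototype
}

// Locator keyed on a data value: its result depends on the seed.
function findByArtist(blocks: RenderedBlock[], name: string): RenderedBlock | undefined {
  return blocks.find((block) => block.artistName === name);
}

// Locator keyed on a stable id: independent of the seed.
function findByTestId(blocks: RenderedBlock[], id: string): RenderedBlock | undefined {
  return blocks.find((block) => block.testId === id);
}

const originalSeed: RenderedBlock[] = [
  { artistName: "Artist A", background: "#2e7d32", testId: "block-1" },
];

// The same block after a reseed renames the artist.
const changedSeed: RenderedBlock[] = [
  { artistName: "Artist B", background: "#2e7d32", testId: "block-1" },
];

console.log(findByArtist(originalSeed, "Artist A") !== undefined); // true
console.log(findByArtist(changedSeed, "Artist A") !== undefined);  // false
console.log(findByTestId(changedSeed, "block-1") !== undefined);   // true
```

A composite screenshot avoids the per-block lookup altogether, which is the trade-off this strategy accepts until stable identifiers exist.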
+
+The current 5 baselines cover the visual vocabulary:
+
+| File                          | Captures                                                |
+| ----------------------------- | ------------------------------------------------------- |
+| `canvas-friday.png`           | Status colors, b2b indicators, multi-lane stacking      |
+| `canvas-saturday.png`         | Conflict ring, capacity warning                         |
+| `stage-row-multilane.png`     | First row in isolation                                  |
+| `wachtrij-populated.png`      | Sidebar list rendering, status badges, counts           |
+| `popover.png`                 | Block-click popover layout                              |
+
+9 additional surfaces are documented as `test.skip()` in
+`tests/playwright-ct/visual/prototype.spec.ts` with the gap reason.
+F4 component migration adds isolated baselines using stable
+`data-test-id` attributes on Vue components.
+
+---
+
+## 5. CI integration
+
+**Status: deferred.** The repo currently has no CI runner configured.
+Local development workflow:
+
+- Vitest (`pnpm test`) — tier 1, runs on demand
+- Playwright Component (`pnpm test:component`) — tiers 2–4, runs on
+  demand
+- Playwright E2E (`pnpm test:e2e`) — tier 5, runs on demand against a
+  developer-managed Laravel test server
+
+CI design (Gitea Actions vs. GitHub Actions decision, Linux runner
+image with PHP+MySQL+Node+pnpm, screenshot-diff artifact upload,
+label-gated nightly e2e) is captured as `TEST-INFRA-002` in
+`dev-docs/BACKLOG.md`.
+
+When CI lands:
+
+- Pre-commit (lefthook): Vitest unit only. Fast, no Playwright launch.
+- PR-CI: Vitest unit + Playwright component + visual. Slower but full
+  coverage.
+- Nightly / label-gated: Playwright e2e against real Laravel + MySQL.
+  Most expensive tier.
+
+---
+
+## 6. Conventions
+
+- **Test file naming:** `*.spec.ts` for Playwright (CT + e2e),
+  `*.test.ts` for Vitest. The runner config glob keeps them apart.
+- **`@visual` tag:** required on all visual-regression tests so
+  `--grep @visual` filters them.
+- **Provider stack for CT:** wired in `apps/app/playwright/index.ts`'s
+  `beforeMount` hook, not at mount call time. Tests forward
+  per-test overrides via `hooksConfig` (see
+  `tests/playwright-ct/utils/mountWithProviders.ts`).
+- **E2E test isolation:** `globalSetup` runs `migrate:fresh + seed`
+  once per `pnpm test:e2e` invocation. Tests within one run share DB
+  state. Re-run = fresh DB.
+- **Pixel tolerance:** `maxDiffPixelRatio: 0.001` default
+  (`playwright-ct.config.ts`). Per-test exceptions allowed if
+  documented inline.
+- **Auth in e2e tests:** Bearer-via-cookie (`api/.../SetAuthCookie.php`).
+  POST `/api/v1/auth/login` returns `crewli_app_token` httpOnly cookie.
+  No CSRF dance, no Sanctum stateful flow. baseURL must be
+  `localhost:8001` (matching the cookie's `domain=localhost`),
+  **not** `127.0.0.1:8001`.
+
+### Anti-patterns to avoid
+
+1. **Mocking the same data shape that the schema validates** —
+   creates self-confirming bias. Use real backend for contract tests
+   (TEST-CONTRACT-001 catches this class of bug).
+2. **Updating baselines silently** without diff review or a UX-
+   justified PR description.
+3. **Adding Playwright tests for pure logic** that Vitest can cover
+   in 20 ms. Reserve Playwright for tests that need the browser.
+4. **Treating "small" UX changes as not needing visual updates** —
+   there is no small visual change in an enterprise product; the
+   user notices.
+5. **Brittle locators** by data values (artist names, stage names)
+   instead of stable test IDs. F4 will add `data-test-id` to Vue
+   components for this reason.
+
+---
+
+## 7. Vuetify in test infrastructure during the PrimeVue migration
+
+`apps/app/playwright/index.ts`'s `beforeMount` hook registers Vuetify
+as a Vue plugin. This is **intentional temporary state**.
+
+### Why
+
+The current SPA still ships Vuetify. Component-level Playwright CT
+tests must mount components against the same UI framework the live
+app uses, otherwise they would test a non-existent surface. Stripping
+Vuetify from test infra now would make CT tests un-runnable until
+F3 lands PrimeVue.
+
+### When it ends
+
+F3 (PrimeVue foundation, RFC-WS-FRONTEND-PRIMEVUE §6) replaces the
+Vuetify plugin line in `playwright/index.ts` with PrimeVue and
+updates `tests/playwright-ct/components/sanity-vuetify.spec.ts` to
+its PrimeVue equivalent. Estimated effort: ~2 hours (mechanical
+swap, no architecture change).
+
+### Why not abstract
+
+The instinct of "abstract the UI framework provider so we can swap
+without touching test code" is a **deferred-cost trap** here:
+
+1. We are NOT retaining Vuetify post-F3. The abstraction would itself
+   need to be removed in F4 alongside the framework swap.
+2. The swap is mechanical (~2 hours). An abstraction layer would take
+   longer to design well than the swap itself takes.
+3. Reviewers seeing "Vuetify in test infra in a PrimeVue migration
+   sprint" should read this section + the JSDoc on
+   `mountWithProviders.ts` for context.
+
+The forbidden pattern: do not propose "let's make a `UIFrameworkPlugin`
+interface and dependency-inject the provider per test" during F2/F3.
+That's exactly the abstraction this section forbids.
+
+---
+
+## 8. Host setup requirements
+
+For Playwright tests to run, the host must have:
+
+- **Node v22+** with **pnpm 10+** (matching `apps/app/`'s expectations)
+- **Chromium** installed via `pnpm exec playwright install chromium`
+  (downloads to `~/Library/Caches/ms-playwright` on macOS)
+- **Git LFS** installed (`brew install git-lfs` on macOS) and active
+  (`git lfs install --skip-repo` to avoid hook conflict with lefthook;
+  the LFS pre-push step is delegated through `lefthook.yml`)
+- **MySQL 8** running locally via `make services` for e2e tests, with
+  the `crewli_test` database created via `make test-db-create`
+- **PHP 8.2+ + composer** for the Laravel test server in e2e tests
+- **`api/.env`** present with valid `APP_KEY` (e2e `globalSetup`
+  inherits this; only `DB_DATABASE` is overridden to `crewli_test` on
+  the command line)
+
+### Known risks
+
+- **`unpkg.com` dependency** — the prototype HTML loads React + Babel
+  from unpkg.com via `