# Crewli: Test Architecture

> Authoritative reference for test-tier choices in the SPA. Read this
> before adding a new test. Linked from `CLAUDE.md`.

This document describes:

1. The test pyramid Crewli uses, and what each tier is for
2. When to use which tier (decision tree)
3. Mock-vs-real-backend rules
4. Visual baseline workflow
5. CI integration status
6. Conventions and anti-patterns
7. Vuetify-during-PrimeVue-migration: the temporary state in test infra
8. Host setup requirements
9. Deferred work (BACKLOG references)

---

## 1. Test pyramid and scope per layer

Crewli runs five test tiers in the SPA. Each has a narrow purpose; overlap is wasted work, gaps are silent risk. Pick the tier whose purpose matches what you're actually verifying.

### Tier 1: Unit (Vitest + happy-dom)

**Run via:** `pnpm test` (filtered by `tests/unit/**`)

**Environment:** Node + happy-dom, single module graph

**Cost:** ~20 ms per test

**For:** Pure logic, schema parsing, store reducers, isolated composable behaviour. No DOM. Fastest tier; safe for pre-commit if we ever add it.

### Tier 2: Component (Playwright Component Testing)

**Run via:** `pnpm test:component`

**Environment:** Real Chromium via `@playwright/experimental-ct-vue`

**Cost:** ~300 ms per test (incl. Chromium reuse)

**For:** Single-component verification. DOM rendering, click/keyboard, prop propagation, slot rendering, CSS resolution. Mocks API at axios layer. Provider stack (Vuetify [TEMP], Pinia, TanStack Query, Router) is wired in `apps/app/playwright/index.ts`'s `beforeMount` hook.

### Tier 3: Integration (Playwright CT, multi-component)

**Run via:** Same `pnpm test:component` runner; placement convention distinguishes integration from single-component.

**Cost:** ~500 ms per test

**For:** Page-level mounting with mocked API responses. Tests cross-component coordination (drag from Wachtrij → canvas, popover → mutation flow). Same provider stack as Tier 2.
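The per-test costs quoted for Tiers 1 to 3 above compound quickly as a suite grows. The following is a rough sketch of that arithmetic only: the cost figures are the approximate ones from this document, while the test counts and the `estimateSuiteMs` helper are hypothetical and exist purely for illustration.

```typescript
// Approximate per-test costs for Tiers 1-3 as quoted in this document (ms).
const COST_MS = { unit: 20, component: 300, integration: 500 } as const;

type Tier = keyof typeof COST_MS;

// Hypothetical helper: sums count * per-test cost across the three tiers.
function estimateSuiteMs(counts: Record<Tier, number>): number {
  return (Object.keys(COST_MS) as Tier[]).reduce(
    (sum, tier) => sum + counts[tier] * COST_MS[tier],
    0,
  );
}

// Illustrative only: the same 100 pure-logic tests placed in two different tiers.
const asUnitTests = estimateSuiteMs({ unit: 100, component: 0, integration: 0 });
const asComponentTests = estimateSuiteMs({ unit: 0, component: 100, integration: 0 });
// asUnitTests is 2000 ms; asComponentTests is 30000 ms.
```

The gap is a reason to keep pure logic out of Playwright. It is not a reason to choose a tier that cannot verify what the test is meant to verify.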
### Tier 4: Visual regression (Playwright CT, `@visual` tag)

**Run via:** `pnpm test:visual` (verify), `pnpm test:visual:update` (regenerate baselines)

**Environment:** Real Chromium driving the canonical prototype HTML served by a tiny static-server fixture (`tests/playwright-ct/visual/static-server.mjs`).

**Cost:** ~1.2 s per test

**For:** Pixel baselines against canonical visual sources. The prototype HTML at `resources/Crewli - Artist Timetable Management/crewli-timetable.html` is the source of truth for Artist Management surfaces. F4 (component migration) extends visual coverage to live SPA components against the prototype.

### Tier 5: E2E (Playwright)

**Run via:** `pnpm test:e2e`

**Environment:** Real Laravel test server (`php artisan serve --port=8001`, DB `crewli_test`) + real Chromium browser context.

**Cost:** ~5 s for the suite (includes migrate:fresh + seed)

**For:** Contract verification end-to-end. Real network, real auth, real DB transactions. Currently only the 409-conflict optimistic-locking contract test (TEST-CONTRACT-001). Add tests sparingly; this is the most expensive tier.

---

## 2. When to use what: decision tree

```
Is the thing under test pure logic with no DOM?
  └─ YES → Unit (Vitest + happy-dom)

Is it a single component? (props, events, slots, CSS, keyboard)
  └─ YES → Component (Playwright CT)

Is it cross-component coordination, but no real backend?
  └─ YES → Integration (Playwright CT)

Is it a contract between SPA and backend (request/response shape)?
  └─ YES → E2E (Playwright + Laravel)

Is it visual fidelity to a canonical baseline?
  └─ YES → Visual (Playwright CT, @visual tag)
```

**Don't pick by speed.** Pick by what you're verifying. A unit test that mocks the backend cannot catch a contract-drift bug; an e2e test for pure logic is wasted CI time.

---

## 3. Mock-vs-real-backend choice rules

### Mock when

- The test verifies SPA behaviour given a known response shape
- Backend availability would slow the test below the relevant tier's cost budget
- The path under test is independent of transactional / auth semantics

### Real backend when

- The test verifies the contract between frontend and backend (Zod schema vs. PHP Resource shape)
- Authentication or authorisation flows are involved
- Optimistic-locking, idempotency, or other multi-request semantics matter

**Anti-pattern: matching mocks to schemas.** Don't mock with the same shape your Zod schema validates: that creates self-confirming bias where both sides agree but neither matches reality. This is the exact failure mode TEST-CONTRACT-001 was created to catch (timetable-stabilization B5).

---

## 4. Visual baseline workflow

### Capturing baselines

```bash
pnpm test:visual:update
```

Review PNG diffs in PRs. Baselines live at:

```
apps/app/tests/playwright-ct/__screenshots__/visual//.png
```

Tracked via Git LFS (see `.gitattributes`). Pixel tolerance: `maxDiffPixelRatio: 0.001` (0.1%) per `playwright-ct.config.ts`.

### Updating baselines (intentional UX change)

1. Make the UX change (component edit, token edit, …)
2. Run `pnpm test:visual:update` locally
3. Review the diff PNG manually: does the new baseline match the intended UX?
4. Commit baseline + UX change in the **same PR**. Reviewer can compare baseline change against the UX intent.
5. Never update baselines to "make tests pass" without a UX-justified reason in the PR description.

### Updating baselines (unintentional diff in CI)

1. Determine if the diff is environmental (font hinting, OS rendering, timezone-based date formatting) or a real regression.
2. Environmental → consider tightening determinism (lock fonts, fake timers, fixed locale) before tweaking tolerance.
3. Real regression → fix the regression, not the baseline.
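To make the 0.1% tolerance (`maxDiffPixelRatio: 0.001`) mentioned above concrete before anyone considers loosening it, here is a small sketch of the arithmetic. It assumes a 1280×720 screenshot, which is an illustrative size and not taken from the Crewli config, and both helper names are hypothetical. Playwright's real comparison also applies a per-pixel colour threshold, so treat this as a model of the ratio only.

```typescript
// Largest count of differing pixels that still fits within the given ratio.
function maxAllowedDiffPixels(width: number, height: number, ratio: number): number {
  return Math.floor(width * height * ratio);
}

// True when the share of differing pixels does not exceed the ratio.
function withinTolerance(
  diffPixels: number,
  width: number,
  height: number,
  ratio: number,
): boolean {
  return diffPixels / (width * height) <= ratio;
}

// Assumed screenshot size (illustrative): 1280 x 720 = 921,600 pixels.
const allowed = maxAllowedDiffPixels(1280, 720, 0.001); // 921
const passes = withinTolerance(921, 1280, 720, 0.001); // true
const fails = withinTolerance(922, 1280, 720, 0.001); // false
```

Under these assumptions the budget is under a thousand pixels, roughly a 30×30 patch. A few shifted glyphs from font hinting can exhaust it, which is why tightening determinism comes before raising the tolerance.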
### Composite-over-isolated strategy (B3 baselines)

Some surfaces enumerated in RFC §A.3's baseline list are captured as composite views rather than individual block-state baselines. Reason: the prototype's DOM exposes status only via inline `style.background`, no `data-*` attributes. Isolated locators (e.g. by artist name) lock the test to specific seed data and silently rot if data changes.

The current 5 baselines cover the visual vocabulary:

| File                      | Captures                                            |
| ------------------------- | --------------------------------------------------- |
| `canvas-friday.png`       | Status colors, b2b indicators, multi-lane stacking  |
| `canvas-saturday.png`     | Conflict ring, capacity warning                     |
| `stage-row-multilane.png` | First row in isolation                              |
| `wachtrij-populated.png`  | Sidebar list rendering, status badges, counts       |
| `popover.png`             | Block-click popover layout                          |

9 additional surfaces are documented as `test.skip()` in `tests/playwright-ct/visual/prototype.spec.ts` with the gap reason. F4 component migration adds isolated baselines using stable `data-test-id` attributes on Vue components.

---

## 5. CI integration

**Status: deferred.** The repo currently has no CI runner configured. Local development workflow:

- Vitest (`pnpm test`): tier 1, runs on demand
- Playwright Component (`pnpm test:component`): tiers 2–4, runs on demand
- Playwright E2E (`pnpm test:e2e`): tier 5, runs on demand against a developer-managed Laravel test server

CI design (Gitea Actions vs. GitHub Actions decision, Linux runner image with PHP+MySQL+Node+pnpm, screenshot-diff artifact upload, label-gated nightly e2e) is captured as `TEST-INFRA-002` in `dev-docs/BACKLOG.md`.

When CI lands:

- Pre-commit (lefthook): Vitest unit only. Fast, no Playwright launch.
- PR-CI: Vitest unit + Playwright component + visual. Slower but full coverage.
- Nightly / label-gated: Playwright e2e against real Laravel + MySQL. Most expensive tier.

---

## 6. Conventions

- **Test file naming:** `*.spec.ts` for Playwright (CT + e2e), `*.test.ts` for Vitest. The runner config glob keeps them apart.
- **`@visual` tag:** required on all visual-regression tests so `--grep @visual` filters them.
- **Provider stack for CT:** wired in `apps/app/playwright/index.ts`'s `beforeMount` hook, not at mount call time. Tests forward per-test overrides via `hooksConfig` (see `tests/playwright-ct/utils/mountWithProviders.ts`).
- **E2E test isolation:** `globalSetup` runs `migrate:fresh + seed` once per `pnpm test:e2e` invocation. Tests within one run share DB state. Re-run = fresh DB.
- **Pixel tolerance:** `maxDiffPixelRatio: 0.001` default (`playwright-ct.config.ts`). Per-test exceptions allowed if documented inline.
- **Auth in e2e tests:** Bearer-via-cookie (`api/.../SetAuthCookie.php`). POST `/api/v1/auth/login` returns `crewli_app_token` httpOnly cookie. No CSRF dance, no Sanctum stateful flow. baseURL must be `localhost:8001` (matching the cookie's `domain=localhost`), **not** `127.0.0.1:8001`.

### Anti-patterns to avoid

1. **Mocking the same data shape that the schema validates**, which creates self-confirming bias. Use real backend for contract tests (TEST-CONTRACT-001 catches this class of bug).
2. **Updating baselines silently** without diff review or a UX-justified PR description.
3. **Adding Playwright tests for pure logic** that Vitest can cover in 20 ms. Reserve Playwright for tests that need the browser.
4. **Treating "small" UX changes as not needing visual updates.** There is no small visual change in an enterprise product; the user notices.
5. **Brittle locators** by data values (artist names, stage names) instead of stable test IDs. F4 will add `data-test-id` to Vue components for this reason.

---

## 7. Vuetify in test infrastructure during the PrimeVue migration

`apps/app/playwright/index.ts`'s `beforeMount` hook registers Vuetify as a Vue plugin. This is **intentional temporary state**.
### Why

The current SPA still ships Vuetify. Component-level Playwright CT tests must mount components against the same UI framework the live app uses, otherwise they would test a non-existent surface. Stripping Vuetify from test infra now would make CT tests un-runnable until F3 lands PrimeVue.

### When it ends

F3 (PrimeVue foundation, RFC-WS-FRONTEND-PRIMEVUE §6) replaces the Vuetify plugin line in `playwright/index.ts` with PrimeVue and updates `tests/playwright-ct/components/sanity-vuetify.spec.ts` to its PrimeVue equivalent. Estimated effort: ~2 hours (mechanical swap, no architecture change).

### Why not abstract

The instinct of "abstract the UI framework provider so we can swap without touching test code" is a **deferred-cost trap** here:

1. We are NOT retaining Vuetify post-F3. The abstraction would itself need to be removed in F4 alongside the framework swap.
2. The swap is mechanical (~2 hours). An abstraction layer would take longer to design well than the swap itself takes.
3. Reviewers seeing "Vuetify in test infra in a PrimeVue migration sprint" should read this section + the JSDoc on `mountWithProviders.ts` for context.

The forbidden pattern: do not propose "let's make a `UIFrameworkPlugin` interface and dependency-inject the provider per test" during F2/F3. That's exactly the abstraction this section forbids.

---

## 8. Host setup requirements

For Playwright tests to run, the host must have:

- **Node v22+** with **pnpm 10+** (matching `apps/app/`'s expectations)
- **Chromium** installed via `pnpm exec playwright install chromium` (downloads to `~/Library/Caches/ms-playwright` on macOS)
- **Git LFS** installed (`brew install git-lfs` on macOS) and active (`git lfs install --skip-repo` to avoid hook conflict with lefthook; the LFS pre-push step is delegated through `lefthook.yml`)
- **MySQL 8** running locally via `make services` for e2e tests, with the `crewli_test` database created via `make test-db-create`
- **PHP 8.2+ + composer** for the Laravel test server in e2e tests
- **`api/.env`** present with valid `APP_KEY` (e2e `globalSetup` inherits this; only `DB_DATABASE` is overridden to `crewli_test` on the command line)

### Known risks

- **`unpkg.com` dependency**: the prototype HTML loads React + Babel from unpkg.com via `