docs: ARCH-OBSERVABILITY.md (WS-8b)

Replaces the WS-6 skeleton with a full post-implementation reference
for the observability stack. Eleven sections covering scope, component
overview, tag taxonomy (replacing RFC §3.6 as source-of-truth), tag
binding architecture, scrubbing semantics, runtime context split, CSP
whitelist, sourcemap upload, GDPR + privacy, maintenance + extension
guidance, plus cross-references.

Form Builder exception classification from the old skeleton §3 is
preserved in §5.4 — concrete answer for which Crewli exception
classes do or do not go to GlitchTip.

Length: 730 lines of markdown. Closes WS-7 acceptance criterion 8.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 19:46:32 +02:00
parent 5c42f27b26
commit 754222f74d


@@ -1,56 +1,405 @@
-# ARCH-OBSERVABILITY
-> Crewli's observability architecture — logging, monitoring, alerting,
-> metrics.
->
-> Status: SKELETON. Section §3 (`$dontReport`) is concrete; other
-> sections are structured placeholders for WS-7 session 1 decisions.
# ARCH — Observability (v1.0)

> **Source of truth** for Crewli's observability implementation
> (sentry-laravel + @sentry/vue + GlitchTip). This document supersedes
> [`RFC-WS-7-OBSERVABILITY.md`](./RFC-WS-7-OBSERVABILITY.md) for tag
> taxonomy, binding semantics, and operational patterns. The RFC
> remains the historical implementation-spec; ARCH is the
> post-implementation reference for developers maintaining or extending
> the stack.
>
> **Status:** WS-7 implementation complete. Code criteria 3, 4, 5, 6,
> 11, 12, 13 satisfied; documentation criteria 8, 14 satisfied via this
> ARCH plus the runbooks under [`runbooks/`](./runbooks/). Manual
> closure criteria (1, 2, 7, 9, 10) remain on Bert's checklist.
>
> **Version:** 1.0 (initial post-implementation reference, May 2026,
> after PR-1 → PR-4 landed in `feat/ws-7-observability`).
>
> **Pre-WS-7 skeleton:** earlier versions of this document (v0.1,
> April 2026) carried placeholder sections for log levels, metrics,
> alerting and dashboards. Those decisions were taken during WS-7 and
> implemented as code; this document captures the as-built outcome.
> The historical skeleton sections about metrics (§5), alerting (§6),
> and dashboards (§7) are intentionally not carried forward: Crewli
> has settled on errors-only observability via GlitchTip; no Statsd /
> Prometheus / Grafana stack is planned (RFC §2 amendment B).
-## Document history
-- 2026-04-28 — v0.1 — Initial skeleton (WS-6 session 3b). Only §3
-  concrete; remainder placeholdered for WS-7.
-## §1 — Logging strategy
---

## §1 Purpose & scope

Observability in Crewli provides automated error detection and
service-availability monitoring. Stack traces, tags, breadcrumbs, and
release correlation are collected via [GlitchTip](https://glitchtip.com/),
self-hosted at `monitoring.hausdesign.nl`. GlitchTip is binary-compatible
with the Sentry event protocol; we use `sentry/sentry-laravel` on the
backend and `@sentry/vue` on the frontend unmodified.
-[WS-7: define log levels with explicit criteria. Example questions to
-answer in WS-7 session 1:
-- When does code use `Log::error` vs `Log::warning` vs `Log::info`?
-- Are unhandled exceptions automatically `error`?
-- Is `Log::debug` allowed in production, or stripped in deploy?
-- How do structured payload conventions tie to log keys (see §4)?
-]
**In scope:**

- Programmer errors from Laravel controllers and queue jobs (`Throwable`,
  `RuntimeException`, `TypeError`, `QueryException`, etc.)
- Infrastructure failures (database connection drop, Redis unavailable,
  external HTTP timeouts in Crewli's own client code)
- Frontend runtime errors from Vue components and composables
- Unhandled promise rejections in the SPA
- Vue Router navigation context as breadcrumbs
-## §2 — Sentry decisions
-[WS-7: Sentry SDK install + configuration decisions. Skeleton:
**Out of scope** (per [RFC §3.10](./RFC-WS-7-OBSERVABILITY.md)):

- **Performance / tracing / profiling.** Hard-pinned to a 0.0 sample rate
  in both `config/sentry.php` and `apps/app/src/observability/sentry.ts`.
- **Expected business outcomes.** `ValidationException`,
  `AuthenticationException`, `AuthorizationException`, and sub-500
  `HttpException`s are deliberately not captured. They have their own
  audit paths (`form_submission_action_failures`, `activity_log`).
- **Audit trails.** `activity_log`, `impersonation_audit_logs`, and
  `form_webhook_deliveries` remain authoritative for security and
  compliance audit. GlitchTip is for defect detection.
- **Replay / Web Vitals / User Feedback.** GlitchTip does not support
  these; we do not treat them as a roadmap item either.
- **Metrics / dashboards / alerting beyond GlitchTip's email.** No
  Statsd / Prometheus / Grafana. Alerting is initially email-to-Bert via
  GlitchTip's own rule engine; Slack integration is on the BACKLOG.
-- Which environments report to Sentry? (dev / staging / production)
-- Sample rate per environment?
-- Source map upload to Sentry CI?
-- User context injection (auth user ID + organisation ID, opt-in
-  redaction for PII)?
-- Breadcrumbs strategy (which events generate breadcrumbs)?
-- Release tagging convention (commit SHA? semver? both?)?
-]
-## §3 — `$dontReport` exceptions (concrete)
**Boundary with existing systems:**

| System | What it does | What GlitchTip does NOT take over |
|---|---|---|
| Telescope (`/telescope`) | Dev-only debugging dashboard. Local + testing. | GlitchTip is for production incidents; Telescope remains for local debugging. |
| `activity_log` (Spatie) | Audit trail of user actions on tenant data. Authoritative for "who did what, when." | GlitchTip never captures business events such as `form.submitted`. |
| `form_webhook_deliveries` | Webhook delivery audit with retry / dead-letter. | Dead-lettered deliveries do NOT go to GlitchTip; only a programmer error thrown by the dispatcher itself does. |
| `form_submission_action_failures` | Apply-pipeline failures per submission, action, and organisation. Org-admin operational handling via WS-6 admin UI. | GlitchTip sees runtime apply-pipeline exceptions (see §5.4) in parallel: engineering visibility, not an operational fix UI. |
| `impersonation_audit_logs` | Who impersonated whom, and when (security audit). | GlitchTip tags active impersonation as context on captured events; it does not replace the audit. |
| Laravel default log channel | Operational runtime logs (info/warning/error). | Both systems receive the same events; correlation via `request_id`. |
-The following exception classes are **expected business outcomes**,
-not bugs. They are caught and handled in the application; reporting
-them to Sentry would generate noise that drowns the signal.
-When the Sentry SDK lands (WS-7), add the following classes to
-Laravel's `app/Exceptions/Handler.php` `$dontReport` array:
-| Class | Reason |
---

## §2 Component overview

```
┌─────────────────────────────────────────────────────────────────────────┐
│ Crewli production / dev │
│ │
│ ┌─────────────────────┐ ┌──────────────────────────┐ │
│ │ Laravel API │ │ apps/app SPA (Vue 3) │ │
│ │ (api.crewli.app) │ │ (crewli.app) │ │
│ │ │ │ │ │
│ │ ┌───────────────┐ │ │ ┌────────────────────┐ │ │
│ │ │sentry-laravel │ │ │ │@sentry/vue 10.x │ │ │
│ │ │4.25 SDK │ │ │ │ │ │ │
│ │ └───────┬───────┘ │ │ └─────────┬──────────┘ │ │
│ │ │ │ │ │ │ │
│ │ ┌───────▼───────┐ │ │ ┌─────────▼──────────┐ │ │
│ │ │SentryEvent │ │ │ │scrubEvent │ │ │
│ │ │Scrubber (PHP) │ │ │ │(TypeScript) │ │ │
│ │ └───────┬───────┘ │ │ └─────────┬──────────┘ │ │
│ │ │ │ │ │ │ │
│ └──────────┼──────────┘ └────────────┼─────────────┘ │
│ │ │ │
│ │ HTTPS POST /api/<n>/envelope/ │ │
│ └────────────┬────────────────────┘ │
│ ▼ │
│ ┌───────────────────────────────────────┐ │
│ │ GlitchTip │ │
│ │ monitoring.hausdesign.nl (prod) │ │
│ │ localhost:8200 (dev) │ │
│ │ │ │
│ │ ┌─────────┐ ┌─────────┐ ┌───────┐ │ │
│ │ │ web │ │ worker │ │ pg/ │ │ │
│ │ │ (django)│ │ (celery)│ │ redis │ │ │
│ │ └─────────┘ └─────────┘ └───────┘ │ │
│ └───────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────────┘
```
**Two projects in one GlitchTip instance:**

- `crewli-api`: Laravel events (`app=api` tag)
- `crewli-app`: SPA events (`app=app` tag)

One DSN per project; both keys live in the 1Password vault under
`Crewli / GlitchTip / DSNs`. The backend reads `SENTRY_DSN_BACKEND`,
the frontend `VITE_SENTRY_DSN_FRONTEND`.

A CSP `connect-src` whitelist entry for the ingest host is mandatory; without
it the browser silently blocks all `@sentry/vue` egress.
See [§7](#7-csp-whitelist-kritisch).
---
## §3 Tag taxonomy

This table replaces the table in [`RFC-WS-7-OBSERVABILITY.md §3.6`](./RFC-WS-7-OBSERVABILITY.md)
as the source of truth. Whenever a tag is added or its source location
changes, this table is updated; the RFC remains a historical
document.

### 3.1 Backend tags (sentry-laravel)

| Tag | Location | Always / conditional | Source |
|---|---|---|---|
| `app` | initial scope (`config/sentry.php`) of `BindSentryRouteContext` | always | constant `'api'` |
| `release` | sentry-laravel built-in | when `SENTRY_RELEASE` env set | `crewli-api@<short-sha>` injected by `deploy.sh` |
| `environment` | sentry-laravel built-in | always | `APP_ENV` |
| `route_name` | `BindSentryRouteContext` middleware | conditional (named routes only) | `$request->route()->getName()` |
| `http.method` | `BindSentryRouteContext` middleware | always (HTTP requests) | `$request->method()` |
| `actor_scope` | `AuthScopeContextListener` | always (authenticated events) | resolution chain — see §3.3 |
| `actor_type` | `AuthScopeContextListener` | always (authenticated events) | `ActorType::resolve()` — see §3.4 |
| `user_id` | `AuthScopeContextListener` (overridden by `HandleImpersonation`) | always (authenticated) | ULID |
| `username` | `AuthScopeContextListener` user object | always (authenticated) | ULID — RFC §3.8: never email |
| `organisation_id` | `AuthScopeContextListener` | conditional — only when `actor_scope=organisation` | route param / portal-token resolution — see §3.3 |
| `event_id` | `AuthScopeContextListener` (via `{event}` route param) | conditional | route binding |
| `impersonation.active` | `AuthScopeContextListener` baseline + `HandleImpersonation` override | **always** (authenticated) | binary `'true'`/`'false'` — RFC §3.6 invariant |
| `impersonation.impersonator_user_id` | `HandleImpersonation` middleware | conditional — only when impersonating | ULID |
| `impersonation.session_id` | `HandleImpersonation` middleware | conditional — only when impersonating | ULID |
| `queue.attempt` | `TagJobAttemptOnSentry` listener | conditional — within queue jobs | `$event->job->attempts()` |
### 3.2 Frontend tags (@sentry/vue)
| Tag | Location | Always / conditional | Source |
|---|---|---|---|
| `app` | initial scope (`sentry.ts`) | always | constant `'app'` |
| `release` | sentry-vue built-in | when `VITE_SENTRY_RELEASE` set | `crewli-app@<short-sha>` injected by `deploy.sh` build-time |
| `environment` | sentry-vue built-in | always | `import.meta.env.MODE` |
| `route_name` | `installContextBinding` Vue Router guard | always | `route.name ?? 'unnamed'` |
| `actor_scope` | `installContextBinding` | always | one of `organisation` / `platform` / `user` / `portal` / `anonymous` |
| `actor_type` | `installContextBinding` | always | `super_admin` / `organizer_admin` / `org_member` / `portal_token` / `unauthenticated` |
| `user_id` | `installContextBinding` | conditional — **never** when `actor_scope=portal` | `useAuthStore().user.id` |
| `organisation_id` | `installContextBinding` | conditional — only when `actor_scope=organisation` | `useOrganisationStore().activeOrganisationId` |
`http.method` is absent from the frontend table: Vue Router routes are
page-level navigation events, not HTTP requests. The backend table
carries `http.method` per request; the frontend leaves that to the
fetch breadcrumbs that `@sentry/vue` attaches automatically.
### 3.3 `actor_scope` resolution
Both implementations follow the same priority chain. The backend performs
the resolution in `AuthScopeContextListener::resolveTenantContext()`,
the frontend in `contextBinding.ts::bindScope()`.

| Priority | Backend signal | Frontend signal | Resulting `actor_scope` |
|---|---|---|---|
| 1 | `{organisation}` route-param | route binding on `/organisations/:id` | `organisation` |
| 2 | `{event}` route-param | `{event}` route-param | `organisation` (via event.organisation_id) |
| 3 | `portal_event` request attribute (set by `PortalTokenMiddleware`) | `route.meta.public === true && route.meta.context === 'portal'` | backend: `organisation`, frontend: `portal` |
| 4 | `super_admin` role + route name starts with `admin.` | `super_admin` role + `route.path` starts with `/platform` | `platform` (no `organisation_id`) |
| 5 | Authenticated, no org context | Authenticated, no `activeOrganisationId` | `user` (no `organisation_id`) |
| 6 | Unauthenticated | Unauthenticated | (backend: not bound; frontend: `anonymous`) |
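
The frontend column of the priority chain can be sketched as a pure function. This is a minimal illustration under stated assumptions, not the code in `contextBinding.ts`: the `ScopeInput` shape and the name `resolveActorScope` are hypothetical, and treating an authenticated user with an `activeOrganisationId` as `organisation` scope is inferred from §6.1 rather than from the table itself.

```typescript
// Hypothetical input shape; the real guard reads Vue Router and Pinia stores.
type ActorScope = "organisation" | "platform" | "user" | "portal" | "anonymous";

interface ScopeInput {
  path: string;                         // route.path
  organisationParam?: string;           // :id on /organisations/:id
  eventParam?: string;                  // {event} route param
  meta: { public?: boolean; context?: string };
  authenticated: boolean;
  isSuperAdmin: boolean;
  activeOrganisationId: string | null;
}

function resolveActorScope(input: ScopeInput): ActorScope {
  // Priority 1 and 2: an organisation or event route param implies tenant scope.
  if (
    input.authenticated &&
    (input.organisationParam !== undefined || input.eventParam !== undefined)
  ) {
    return "organisation";
  }
  // Priority 3: token-based portal routes.
  if (input.meta.public === true && input.meta.context === "portal") {
    return "portal";
  }
  // Priority 4: super admin on platform paths, no organisation attribution.
  if (input.authenticated && input.isSuperAdmin && input.path.startsWith("/platform")) {
    return "platform";
  }
  // Priority 5: authenticated; organisation scope only with an active org (assumption from §6.1).
  if (input.authenticated) {
    return input.activeOrganisationId !== null ? "organisation" : "user";
  }
  // Priority 6: unauthenticated.
  return "anonymous";
}
```

Keeping the chain as an ordered series of early returns makes the priority explicit: the first matching signal wins, so reordering the checks changes the result.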
**Important note for the frontend portal zone:** `actor_scope=portal`
carries **no** `user_id` or `username`. RFC §3.7 frontend block, point 5,
is explicit: token-based flows (artist advance, public form fill)
get no user context because the identifier (ULID token) is itself
sensitive and the visitor has no permanent relationship with Crewli.

**Multi-tenant invariant:** when `actor_scope=organisation`,
`organisation_id` MUST be present as a valid ULID. When `actor_scope` is
`platform`, `user`, or `anonymous`, `organisation_id` IS absent. The rule
is not "always present" but "always correctly related to
`actor_scope`." Verified in
`AuthScopeContextListenerTest::test_organisation_id_present_when_actor_scope_is_organisation`.
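
The invariant can be expressed as a small predicate over the bound tags. This is a minimal sketch rather than Crewli's test code: the name `violatesTenantInvariant` and the plain tag-record shape are assumptions for illustration, and the pattern checks the general ULID format (26 Crockford base32 characters).

```typescript
// General ULID shape: 26 characters from Crockford's base32 alphabet (no I, L, O, U).
const ULID_PATTERN = /^[0-9A-HJKMNP-TV-Z]{26}$/i;

// Hypothetical helper: returns true when the tag set breaks the multi-tenant invariant.
function violatesTenantInvariant(tags: Record<string, string | undefined>): boolean {
  const scope = tags["actor_scope"];
  const organisationId = tags["organisation_id"];

  if (scope === "organisation") {
    // organisation scope requires a valid ULID.
    return organisationId === undefined || !ULID_PATTERN.test(organisationId);
  }
  if (scope === "platform" || scope === "user" || scope === "anonymous") {
    // These scopes must not carry an organisation_id at all.
    return organisationId !== undefined;
  }
  // Other scopes (e.g. portal) are not covered by the invariant as stated.
  return false;
}
```

A predicate like this could be run inside a `beforeSend` hook in development builds to surface mis-bound events early; whether that is worthwhile is a design choice, not something the source prescribes.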
### 3.4 `actor_type` enum
-|---|---|
-| `\App\Exceptions\FormBuilder\PublishGuardViolationException` | Publish-time validation: schema fails a guard. Returned as 422 with field-level errors. Not a system bug. |
-| `\App\Exceptions\FormBuilder\PurposeRequirementsNotMetException` | Schema lacks required bindings for its purpose. Returned as 422. Not a system bug. |
-| `\App\Exceptions\FormBuilder\IdempotencyConflictException` | Duplicate idempotency key on submission. Returned as 409. Not a system bug. |
-**Out of scope for `$dontReport` (these DO go to Sentry):**
Backend: `App\Enums\Observability\ActorType` (PHP enum). Frontend:
inline string mapping in `contextBinding.ts::resolveActorType()`. Both
yield the same values:

| Value | When |
|---|---|
| `super_admin` | User has Spatie role `super_admin` |
| `organizer_admin` | User has Spatie role `org_admin` |
| `org_member` | Authenticated user, no admin role (covers volunteers; Crewli has no dedicated `volunteer` role today, see [BACKLOG OBS-1](./BACKLOG.md)) |
| `portal_token` | Token-based portal request (`portal_event` attribute / `route.meta.public` + `context=portal`) |
| `unauthenticated` | No auth (e.g. login page, public form fill) |

---
## §4 Tag-binding architecture

Three patterns we chose deliberately during the WS-7 implementation,
documented here so that future extensions stay consistent.

### 4.1 Backend split — middleware × event-listener

Route-scope tags bind **per HTTP request** via middleware. Auth-scope
tags bind **per authentication event** via a listener. Reason:
route context exists only during HTTP handling, whereas auth context is
emitted by every authenticator (Laravel's `SessionGuard`, Sanctum's
bearer-token Guard, future authenticators).

| Concern | Implementation |
|---|---|
| Route-scope (`app`, `route_name`, `http.method`) | `App\Http\Middleware\BindSentryRouteContext` — registered globally on the api group via `$middleware->api(prepend: [...])` in `bootstrap/app.php` |
| Auth-scope (`user_id`, `actor_type`, `actor_scope`, `organisation_id`) | `App\Listeners\Observability\AuthScopeContextListener` — listens to BOTH `Illuminate\Auth\Events\Authenticated` (SessionGuard) AND `Laravel\Sanctum\Events\TokenAuthenticated` (Sanctum) |
| Impersonation override + escalation | `App\Http\Middleware\HandleImpersonation` — re-binds Sentry scope after the user-swap, sets `impersonation.active='true'` plus impersonator/session ids |
| Queue context (`queue.attempt`) | `App\Listeners\Observability\TagJobAttemptOnSentry` — listens to `Illuminate\Queue\Events\JobProcessing` |
**Why a dual-event listener?** Crewli's HTTP flow is bearer-token via
`CookieBearerToken` middleware → `auth:sanctum`, and Sanctum's `Guard`
fires only `TokenAuthenticated`, NOT `Authenticated`. Listening only
to the `Authenticated` event would silently miss every authenticated
HTTP request. Discovered by the live smoke test and fixed in the PR-3
follow-up (commit `adab3be`).
### 4.2 Frontend split — Vue Router guard + Pinia store reads
Vue has no counterpart to Laravel's `Authenticated` event for Sentry to
hook into; the natural tag-binding moments are **route transitions**. A
`router.beforeEach` guard in `apps/app/src/observability/contextBinding.ts`:

1. Calls `Sentry.getCurrentScope().clear()` on every navigation.
   This prevents cross-zone leakage (e.g. the user logs out in the portal
   zone but Sentry still holds the `user_id` from the organizer context).
2. Reads `useAuthStore()` and `useOrganisationStore()` for identity and
   tenant context.
3. Applies the same resolution chain as the backend (§3.3).
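
The three steps above can be sketched with stand-in interfaces. This is an illustrative outline only: `ScopeLike`, `RouterLike`, `Identity`, and `installContextBindingSketch` are hypothetical names, the real guard uses the Vue Router and `@sentry/vue` types, and the scope resolution in step 3 is reduced to two branches for brevity.

```typescript
// Minimal stand-ins so the sketch runs without vue-router or @sentry/vue.
interface ScopeLike {
  clear(): void;
  setTag(key: string, value: string): void;
}
interface RouteLike {
  name?: string | null;
}
type NavigationGuard = (to: RouteLike) => void;
interface RouterLike {
  beforeEach(guard: NavigationGuard): void;
}
interface Identity {
  userId: string | null;
  activeOrganisationId: string | null;
}

function installContextBindingSketch(
  router: RouterLike,
  getScope: () => ScopeLike,
  readIdentity: () => Identity,
): void {
  router.beforeEach((to) => {
    const scope = getScope();
    // Step 1: drop everything bound during the previous route.
    scope.clear();
    // Step 2: read identity and tenant context (the stores in the real code).
    const identity = readIdentity();
    scope.setTag("route_name", to.name ?? "unnamed");
    // Step 3: simplified resolution; the full chain is described in §3.3.
    if (identity.userId === null) {
      scope.setTag("actor_scope", "anonymous");
      return;
    }
    scope.setTag("user_id", identity.userId);
    if (identity.activeOrganisationId !== null) {
      scope.setTag("actor_scope", "organisation");
      scope.setTag("organisation_id", identity.activeOrganisationId);
    } else {
      scope.setTag("actor_scope", "user");
    }
  });
}
```

Clearing before re-binding means each navigation starts from an empty tag set, so a tag can only be present if the current route and identity justify it.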
### 4.3 Default-in-listener / override-in-middleware pattern
For binary tags that must always be present but are escalated by specific
middleware steps, we use a two-phase
pattern:
```
AuthScopeContextListener::bindForUser() → scope.setTag('impersonation.active', 'false')
HandleImpersonation::handle() → scope.setTag('impersonation.active', 'true')
scope.setTag('impersonation.impersonator_user_id', $admin->id)
scope.setTag('impersonation.session_id', $session->id)
```
The listener **always** seeds a baseline (`'false'`). When
impersonation is active, `HandleImpersonation` runs after auth and
overwrites the scope with the target user and the escalation tags. If
future refactors introduce per-`actor_scope` branch shortcuts that skip
the baseline,
`AuthScopeContextListenerTest::test_impersonation_active_default_false_across_every_actor_scope_branch`
catches the regression.

This pattern is reusable for other binary signals; so far it is
applied only to `impersonation.active`.
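
The two phases can be illustrated independently of Laravel. The sketch below models the scope as a plain `Map`; the function names `bindBaseline` and `applyImpersonation` are hypothetical and only mirror the roles of the listener and the middleware described above.

```typescript
type Tags = Map<string, string>;

// Phase 1 (listener role): always seed the baseline for an authenticated user.
function bindBaseline(tags: Tags, userId: string): void {
  tags.set("user_id", userId);
  tags.set("impersonation.active", "false");
}

// Phase 2 (middleware role): runs after auth, only when impersonation is active.
function applyImpersonation(
  tags: Tags,
  targetUserId: string,
  impersonatorUserId: string,
  sessionId: string,
): void {
  tags.set("user_id", targetUserId); // scope now describes the impersonated user
  tags.set("impersonation.active", "true");
  tags.set("impersonation.impersonator_user_id", impersonatorUserId);
  tags.set("impersonation.session_id", sessionId);
}
```

Because phase 1 runs unconditionally, a consumer filtering on `impersonation.active` never has to distinguish "false" from "missing".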
### 4.4 Listener registration discipline
Laravel 12's listener auto-discovery is disabled in
`bootstrap/app.php` via `->withEvents(discover: false)`. Reason:
auto-discovery combined with an explicit `Event::listen()` causes silent
double registration (idempotent today thanks to scope-tag overwrite
semantics, but no longer once a listener performs additive
operations). Guarded by
`tests/Feature/Observability/EventListenerRegistrationTest`.

**For every new observability listener:**

1. Create the listener class in `app/Listeners/Observability/`.
2. Register it **explicitly** in `AppServiceProvider::boot()` using the
   array-callable form `[Class::class, 'method']`. The class-string form
   hides the method binding in `php artisan event:list`.
3. Add a case to
   `EventListenerRegistrationTest::test_*_listener_registered_exactly_once`
   with the correct event class and method name.
---
## §5 Scrubbing semantics
### 5.1 Backend — `App\Services\Observability\SentryEventScrubber`
Registered as the `before_send` hook in `config/sentry.php` via
array-callable static-method notation. Stateless; no
container resolution per event.

**What gets scrubbed:**

1. **Request body keys** (recursive, key-name match, depth-limited):
   `password`, `password_confirmation`, `current_password`, `token`,
   `api_key`, `secret`, `webhook_secret`, `dsn`, `signature`,
   `authorization`, `cookie`, `bearer`, `iban`, `bic`,
   `passport_number`, `bsn`. The value is replaced with `[scrubbed]`.
2. **Request headers** (case-insensitive): `authorization`, `cookie`,
   `set-cookie`, `x-api-key`, `x-impersonation-token`. Replaced with
   `[scrubbed]`.
3. **Form submissions:** every payload key named `form_values` is
   replaced wholesale with `[scrubbed_form_values]`. Reason: Crewli's
   form builder generates dynamic form values in which any key can be
   PII (email, phone, dietary, medical). Selective key matching
   is not safe.
4. **URL query string:** `token=` and `api_key=` are scrubbed.
5. **Cookies wholesale:** `event.request.cookies` is replaced with
   `[scrubbed]`.
6. **Max-depth guard** on recursion: after 10 levels the subtree is
   replaced with `['[max_depth]']` to limit malicious deeply nested
   payloads.
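
Rules 1, 3, and 6 can be sketched as one recursive walk. The sketch is written in TypeScript (the language of the frontend port) and is not the `SentryEventScrubber` source: `scrubValue` is a hypothetical name, exact (case-sensitive) body-key matching is an assumption, and so is the precise level at which the depth guard triggers.

```typescript
// Key list taken from rule 1 above.
const SENSITIVE_KEYS = new Set<string>([
  "password", "password_confirmation", "current_password", "token",
  "api_key", "secret", "webhook_secret", "dsn", "signature",
  "authorization", "cookie", "bearer", "iban", "bic",
  "passport_number", "bsn",
]);
const MAX_DEPTH = 10;

function scrubValue(value: unknown, depth = 0): unknown {
  // Scalars pass through untouched; only containers are walked.
  if (value === null || typeof value !== "object") {
    return value;
  }
  // Rule 6: stop descending and replace the subtree with a marker.
  if (depth >= MAX_DEPTH) {
    return ["[max_depth]"];
  }
  if (Array.isArray(value)) {
    return value.map((item) => scrubValue(item, depth + 1));
  }
  const result: Record<string, unknown> = {};
  for (const [key, child] of Object.entries(value as Record<string, unknown>)) {
    if (key === "form_values") {
      result[key] = "[scrubbed_form_values]"; // Rule 3: wholesale replacement.
    } else if (SENSITIVE_KEYS.has(key)) {
      result[key] = "[scrubbed]";             // Rule 1: key-name match.
    } else {
      result[key] = scrubValue(child, depth + 1);
    }
  }
  return result;
}
```

Matching on key names rather than on values keeps the walk cheap and predictable, at the cost of missing sensitive data stored under unlisted keys; that trade-off is why `form_values` is replaced as a whole.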
**Sub-500 HttpException filter:** when
`$hint?->exception instanceof HttpException && $hint->exception->getStatusCode() < 500`,
the scrubber returns `null` and the event is not sent. Reason: 404,
403, 422, etc. are expected business outcomes (RFC §3.10), not
programmer errors. `ignore_exceptions` in `config/sentry.php` filters
by class only; status-based filtering has to happen here.
### 5.2 Frontend — `apps/app/src/observability/scrubber.ts`
TypeScript port of the backend scrubber with identical semantics, plus:

7. **Storage context strip:** `event.contexts.storage` is stripped.
   Sentry does not add this by default; the strip is defensive. RFC §3.7
   frontend point 2: localStorage / sessionStorage **never** in event
   context (Crewli's portal state in sessionStorage MUST NOT leak).
8. **`event.user.cookies` strip:** if Sentry's BrowserSession
   integration injects `document.cookie` exposure via the user context,
   it is removed.
9. **Cookies wholesale (typed shape):** `event.request.cookies` is
   typed `Record<string, string>` in `@sentry/vue`. It is replaced with
   `{ scrubbed: '[scrubbed]' }` rather than a string, which preserves
   the typed shape.
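
The three frontend-only rules can be sketched against a reduced event shape. `EventLike` is a hypothetical stand-in for the SDK's event type and `stripBrowserState` is an illustrative name; the real `scrubber.ts` may structure this differently.

```typescript
// Reduced, hypothetical event shape covering only the fields touched by rules 7 to 9.
interface EventLike {
  contexts?: Record<string, unknown>;
  user?: Record<string, unknown>;
  request?: { cookies?: Record<string, string> };
}

function stripBrowserState(event: EventLike): EventLike {
  const scrubbed: EventLike = { ...event };

  // Rule 7: never ship localStorage / sessionStorage context.
  if (event.contexts) {
    const contexts = { ...event.contexts };
    delete contexts["storage"];
    scrubbed.contexts = contexts;
  }

  // Rule 8: drop cookie exposure on the user object.
  if (event.user) {
    const user = { ...event.user };
    delete user["cookies"];
    scrubbed.user = user;
  }

  // Rule 9: replace cookies wholesale while keeping the Record<string, string> shape.
  if (event.request?.cookies) {
    scrubbed.request = { ...event.request, cookies: { scrubbed: "[scrubbed]" } };
  }

  return scrubbed;
}
```

Returning a shallow copy instead of mutating the input keeps the function easy to test; a hook that mutates in place would be equally valid.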
### 5.3 Boundary: business outcomes vs programmer/infra errors
| Exception class | Backend behaviour | Reason |
|---|---|---|
| `Throwable`, `RuntimeException`, `TypeError` | Captured | Programmer error |
| `QueryException`, `PDOException` | Captured | Infra error |
| `ValidationException` | NOT captured (`ignore_exceptions`) | Expected user-input error |
| `AuthenticationException` | NOT captured (`ignore_exceptions`) | Expected user-state error |
| `AuthorizationException` | NOT captured (`ignore_exceptions`) | Expected user-permission error |
| `HttpException` status `< 500` | NOT captured (scrubber returns `null`) | Expected 4xx outcome |
| `HttpException` status `>= 500` | Captured | Genuine server error |

`Integration::handles($exceptions)` in `bootstrap/app.php` is **not
auto-registered** by sentry-laravel 4.x. Without this line, `report($e)`
runs only through Laravel's default reporter (logs to channel)
and never reaches Sentry. Covered by
`tests/Feature/Observability/ExceptionReportingTest`. See also
[BACKLOG OBS-6](./BACKLOG.md).

**For every new `$exceptions->render(...)` handler in
`bootstrap/app.php`:** Laravel's flow is `report()` → `render()`. If
the handler consumes a Throwable and returns a Response, the framework
flow takes care of `report()` automatically. **Render handlers MUST
NOT** hand-roll `report()` or short-circuit early; see
[BACKLOG OBS-7](./BACKLOG.md) for the expansion plan.
### 5.4 Form Builder runtime exceptions (concrete classification)
Form Builder is Crewli's largest runtime domain, with its own
exception hierarchy (see [`ARCH-FORM-BUILDER.md`](./ARCH-FORM-BUILDER.md)).
The classification between "expected business outcome" and "programmer /
infra error" for these classes is laid down concretely:

**Sent to GlitchTip (programmer/infra errors):**
- `App\Exceptions\FormBuilder\PersonProvisioningException` — runtime
  failure during the apply pipeline. Caught by
@@ -59,63 +408,323 @@ Laravel's `app/Exceptions/Handler.php` `$dontReport` array:
  visibility into recurring patterns across orgs.
- `App\Exceptions\FormBuilder\PurposeSubjectResolutionException`
  runtime resolution failure (no portal token, no auth user, etc.).
-  Same dual-handling rationale: action-failures table for
-  org-admin operational handling; Sentry for engineering visibility.
  Same dual-handling rationale: action-failures table for org-admin
  operational handling; GlitchTip for engineering visibility.
- `App\Exceptions\FormBuilder\FormBindingApplicatorException`
-  runtime applicator failure (no_transaction, no_schema,
-  unknown_purpose). These should never happen in production; if they
-  do, they're systemic bugs — Sentry is the correct destination.
  runtime applicator failure (`no_transaction`, `no_schema`,
  `unknown_purpose`). These should never happen in production; if
  they do, they're systemic bugs — GlitchTip is the correct
  destination.
-The dual recording (Sentry + `form_submission_action_failures` table)
-is intentional: org admins fix specific failures via the WS-6 admin
-UI; engineering identifies systemic issues across all orgs via
-Sentry's aggregation.
-## §4 — Structured logging conventions
-[WS-7: log key naming convention. Skeleton:
**Not sent to GlitchTip (expected business outcomes):**

- `App\Exceptions\FormBuilder\PublishGuardViolationException`:
  publish-time validation where the schema fails a guard. Returned as
  422 with field-level errors. Not a system bug.
- `App\Exceptions\FormBuilder\PurposeRequirementsNotMetException`:
  schema lacks required bindings for its purpose. Returned as 422.
  Not a system bug.
- `App\Exceptions\FormBuilder\IdempotencyConflictException`:
  duplicate idempotency key on submission. Returned as 409. Not a
  system bug.

Dual handling for the first group is intentional: org admins fix
specific failures via the WS-6 admin UI; engineering identifies
systemic issues across all orgs via GlitchTip's aggregation. The
"not sent to GlitchTip" group is covered by `ignore_exceptions` (for
`PublishGuardViolationException` etc., which implement
`HttpExceptionInterface` via a 422 response); when a new
expected-outcome class is added, it must be excluded explicitly in
`config/sentry.php`.
-- Hierarchical dot-separated namespace tree
-- Existing examples to align with:
-  - `form-builder.apply.transaction_rolled_back`
-  - `form-builder.identity-match.no_person_subject_post_apply`
-  - `form-webhook.delivery.exception`
-Define the tree formally so future code discovers the right namespace
-deterministically.]
-## §5 — Metrics
---

## §6 Runtime context split (frontend)

Five zones, decided per `route.path` and `route.meta`:
-[WS-7: which counters / histograms / gauges? Namespace?
-Statsd / Prometheus / OTel flavour? At minimum, candidate metrics:
-- `form_submissions_total` (counter, tagged by purpose)
-- `form_submission_apply_status` (counter, tagged by status)
-- `form_failures_open` (gauge per org)
-- `retry_attempts_total` (counter, tagged by outcome)
-- `apply_pipeline_duration_seconds` (histogram)
-]
### 6.1 `actor_scope=organisation`

- Organizer routes with an active org context (`useOrganisationStore().activeOrganisationId !== null`)
- Tags: `actor_scope=organisation`, `organisation_id=<ULID>`, plus user context
- Examples: `/organisations/:id/dashboard`, `/events/:id`, `/dashboard`
-## §6 — Alerting rules
-[WS-7: which thresholds trigger alerts? Where (Slack? PagerDuty?
-Email?). At minimum, candidate alerts:
-- "Open failures > X for > Y hours"
-- "Apply pipeline error rate > X% in 1h window"
-- "no_transaction guard fired" (immediate alert; should never happen
-  in production)
-- "Webhook dead-letter rate > X%"
-]
-## §7 — Dashboards
### 6.2 `actor_scope=platform`

- `super_admin` on `/platform/*` paths
- Tags: `actor_scope=platform`, NO `organisation_id`
- **Forced org attribution would be misleading.** Platform-mode
  events implicitly span all organisations.

### 6.3 `actor_scope=user`

- Authenticated user on routes without org scope (`/account-settings`,
  `/portal/profiel`)
- Tags: `actor_scope=user`, NO `organisation_id`
- Reason: Crewli's User↔Organisation relation is many-to-many; there is
  no reliable single-org hint without route context.
-[WS-7: Grafana / Cloudwatch / similar. Panel layout, widget types,
-default time ranges. Skeleton later.]
-## Related docs
### 6.4 `actor_scope=portal`

- Token-based portal flows: `route.meta.public === true && route.meta.context === 'portal'`
- Concrete routes: `/portal/advance/:token` (artist advance),
  `/register/:public_token` (public form fill)
- Tags: `actor_scope=portal`, `actor_type=portal_token`
- **No `user_id`, no `username`** (RFC §3.7 frontend point 5).
  The ULID token itself is sensitive; the visitor has no permanent
  relationship with Crewli.
- The backend portal-token request resolves the organisation via the
  matching artist/event row; frontend events correlate via
  `request_id` back to the backend event, which does carry
  `organisation_id`.
-- `RFC-WS-6.md` — WS-6 binding pipeline design (the failures observed
-  and recorded by §3's classes originate here)
-- `ARCH-BINDINGS.md` — apply pipeline architecture
-- `ARCH-FORM-BUILDER.md` — form-builder runtime including webhooks
### 6.5 `actor_scope=anonymous`

- Public routes without auth: `/login`, `/forgot-password`, `/register`,
  `/invitations/:token` (acceptance flow)
- Tags: `actor_scope=anonymous`, `actor_type=unauthenticated`
### 6.6 Cross-zone leakage prevention
`Sentry.getCurrentScope().clear()` is called on every
route transition in `installContextBinding`. Example: a user logs out
in the organizer context and navigates to `/login`. Without the clear,
the next anonymous error event would still carry the `user_id` of the
logged-out user. With the clear, the Sentry scope is reset; the
unauthenticated event gets only the freshly bound anonymous tags.
Test: `contextBinding.spec.ts::test_cross-zone_leak_guard`.
---
## §7 CSP whitelist (kritisch)
Crewli's strict CSP `connect-src` directive must explicitly whitelist
the GlitchTip ingest host. Without this entry the browser silently
blocks every `@sentry/vue` POST with *"Refused to connect
because it violates the following Content Security Policy directive"*
in the DevTools console: the SDK believes it is working, but no events
reach GlitchTip.

| Environment | CSP location | `connect-src` entry |
|---|---|---|
| Dev | `apps/app/index.html` meta tag | `http://localhost:8200` |
| Prod organizer SPA | `deploy/nginx/csp-spa.conf` (both Report-Only and Enforce rules) | `https://monitoring.hausdesign.nl` |
| API JSON responses | `api/config/security.php` (no update) | `default-src 'none'`; no `connect-src` because a JSON context has no fetch origin |

**When a new environment is introduced** (e.g. staging; see
[BACKLOG OBS-9](./BACKLOG.md)), the following MUST happen:

1. The corresponding GlitchTip ingest host is added to the
   right CSP location.
2. `tests/Feature/Security/CspConnectsToObservabilityTest` is
   extended with a staging assertion so that the regression guard
   covers the new environment.
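
A regression guard of this kind boils down to parsing the policy string and checking the `connect-src` source list. The sketch below is a simplified TypeScript illustration, not the PHP test named above: `connectSrcAllows` is a hypothetical helper that does exact-match comparison only and ignores wildcards and the `default-src` fallback that real CSP evaluation applies.

```typescript
// Returns true when the policy's connect-src directive lists the origin verbatim.
// Simplification: no wildcard matching and no fallback to default-src.
function connectSrcAllows(csp: string, origin: string): boolean {
  for (const directive of csp.split(";")) {
    const [name, ...sources] = directive.trim().split(/\s+/);
    if (name !== undefined && name.toLowerCase() === "connect-src") {
      return sources.includes(origin);
    }
  }
  return false;
}
```

For example, `connectSrcAllows("default-src 'self'; connect-src 'self' https://monitoring.hausdesign.nl", "https://monitoring.hausdesign.nl")` evaluates to `true`, while the same policy without the ingest host evaluates to `false`.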
---
## §8 Sourcemap upload (frontend)
Vite produces sourcemaps for every chunk (`build.sourcemap=true` in
`vite.config.ts`). `deploy.sh` uploads them to GlitchTip and removes
them from `dist/` before nginx serves it. RFC §3.5: **never**
publicly mapped sources in production.
```
vite build → apps/app/dist/assets/*.js + *.js.map
sentry-cli sourcemaps upload --org $SENTRY_ORG \
--project crewli-app \
--release $VITE_SENTRY_RELEASE \
--url-prefix "~/assets/" \
apps/app/dist/assets
find apps/app/dist -name '*.map' -type f -delete
nginx serves dist/
```
**Required env vars** (deploy host only, not committed):
| Var | Description |
|---|---|
| `SENTRY_AUTH_TOKEN` | Per-project upload-only token in GlitchTip. Bert provisions it manually in the `crewli-app` project settings. |
| `SENTRY_ORG` | GlitchTip organisation slug. Default in `deploy.sh`: `crewli`. |
| `VITE_SENTRY_DSN_FRONTEND` | Its presence gates the upload: if it is missing, `deploy.sh` skips the upload (soft fail) but still strips the `*.map` files. |
| `VITE_SENTRY_RELEASE` | Injected at build time by `deploy.sh`: `crewli-app@$(git rev-parse --short HEAD)`. |
**Soft-fail:** if the upload fails (GlitchTip unreachable, expired
token), the deploy continues and logs a warning. The `find … -delete`
step **always** runs. Unmapped stack traces in GlitchTip are preferable
to a blocked deploy.
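
The soft-fail contract can be pictured as a `try`/`finally`. `deploy.sh` is a shell script, so the TypeScript below only illustrates the control flow: `uploadThenStrip`, `Step` and `DeployResult` are hypothetical names, and the two callbacks stand in for the `sentry-cli` upload and the `find … -delete` strip.

```typescript
// Hypothetical sketch of the soft-fail contract; not the contents of deploy.sh.
type Step = () => void;

interface DeployResult {
  uploaded: boolean;
  warning?: string;
}

function uploadThenStrip(
  dsn: string | undefined,
  upload: Step,
  stripMaps: Step,
): DeployResult {
  const result: DeployResult = { uploaded: false };
  try {
    if (!dsn) {
      // Missing DSN: skip the upload, but fall through to the strip.
      result.warning = "VITE_SENTRY_DSN_FRONTEND missing: upload skipped";
    } else {
      upload();
      result.uploaded = true;
    }
  } catch (err) {
    // Soft fail: record a warning and let the deploy continue.
    const reason = err instanceof Error ? err.message : String(err);
    result.warning = `sourcemap upload failed: ${reason}`;
  } finally {
    // Always strip *.map files. A failure here is NOT swallowed,
    // because serving sourcemaps publicly must block the deploy.
    stripMaps();
  }
  return result;
}
```

The asymmetry is the point of the design: an upload failure degrades debugging, whereas a skipped strip would expose sources.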
---
## §9 GDPR & privacy
### 9.1 Processing register
Crewli is the **controller** for GlitchTip data (self-hosted on Crewli
infrastructure). There is no processor relationship, so no DPA
extension is needed. Processing register entry: see
[`SECURITY_AUDIT.md`](./SECURITY_AUDIT.md), "WS-7 Observability —
finale audit".
### 9.2 Data after scrubbing
What a GlitchTip event can still contain:
- ULIDs (user_id, organisation_id, event_id, request_id, session_id)
- Stack traces (without locals; `send_default_pii=false`)
- Route names and HTTP methods
- Curated tags (see §3)
- Breadcrumbs (input-text masked, console-integration off in prod)
What it does **not** contain: email addresses, phone numbers, names,
IP addresses, raw form_values, raw cookies, raw headers (Authorization
etc.).
### 9.3 Retention
90 days, after which events are purged by GlitchTip's own
partition-maintenance loop (see the monitoring section of
[`GLITCHTIP.md`](./GLITCHTIP.md)).
Configurable via the GlitchTip admin UI (settings → environment-config).
### 9.4 Right to erasure (Art. 17)
Manual for now. Procedure: see
[`runbooks/observability-erasure.md`](./runbooks/observability-erasure.md).
An automated erasure script remains on the backlog (referenced in the
RFC; not yet a concrete entry in BACKLOG.md).
---
## §10 Maintenance & extension
### 10.1 Adding a new tag
First determine the **source** of the tag. Four patterns:
| Source | Pattern | Example |
|---|---|---|
| HTTP request context (route, method, headers) | Middleware | `BindSentryRouteContext` |
| Auth context (user, role, org) | Listener on `Authenticated` + `TokenAuthenticated` | `AuthScopeContextListener` |
| Domain event (job processing, custom event) | Listener on the domain event | `TagJobAttemptOnSentry` |
| Static / build-time | `config/sentry.php` initial scope | `app=api` |
For every new tag:
1. Add it to the §3 table above.
2. Implement it in the chosen location.
3. For listeners: register explicitly in `AppServiceProvider::boot()`
using the array-callable form, and add a case to
`EventListenerRegistrationTest`.
4. Write a feature test that asserts the tag on a live HTTP flow
(follow the pattern of `AuthScopeBindingHttpFlowTest`).
5. Frontend mirror: add it to
`apps/app/src/observability/contextBinding.ts` and to
`contextBinding.spec.ts`.
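
The frontend mirror in step 5 can be pictured as follows. This is a minimal sketch, not the contents of `contextBinding.ts`: the `TagScope` interface, the `bindAuthTags` function and the choice of `user_id` and `organisation_id` as tag names are assumptions for illustration (§3 holds the actual taxonomy). `setTag(key, value)` mirrors the method of the same name on the Sentry scope.

```typescript
// Hypothetical shapes; the real binding lives in contextBinding.ts.
interface TagScope {
  setTag(key: string, value: string): void;
}

interface AuthContext {
  userId: string; // ULID
  organisationId?: string; // ULID; absent until an organisation is selected
}

function bindAuthTags(scope: TagScope, auth: AuthContext | null): void {
  if (auth === null) return; // unauthenticated: bind nothing
  scope.setTag("user_id", auth.userId);
  if (auth.organisationId !== undefined) {
    scope.setTag("organisation_id", auth.organisationId);
  }
}

// Recording fake: the kind of test double a spec can assert against.
class RecordingScope implements TagScope {
  readonly tags = new Map<string, string>();
  setTag(key: string, value: string): void {
    this.tags.set(key, value);
  }
}

const scope = new RecordingScope();
bindAuthTags(scope, { userId: "01HZX0000000000000000000AA" });
console.log([...scope.tags.keys()]);
```

Binding against a narrow interface rather than the SDK type keeps the spec free of SDK initialisation.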
### 10.2 Adding a new scrubbing rule
1. Backend: add the key to `SENSITIVE_BODY_KEYS` or
`SENSITIVE_HEADERS` in
`app/Services/Observability/SentryEventScrubber.php`.
2. Frontend: make the identical change in
`apps/app/src/observability/scrubber.ts`.
3. Add a test case to both:
`tests/Feature/Observability/PiiScrubbingTest.php` (PHP) and
`apps/app/src/observability/__tests__/scrubber.spec.ts` (TypeScript).
4. Both test files must cover the new key: backend and frontend are
semantically equivalent and must stay that way.
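
To show what adding a key means in practice, here is a minimal sketch of key-based body scrubbing. It is not the contents of `scrubber.ts`: the keys in the set, the `[Filtered]` placeholder, the case-insensitive match and the `scrubBody` name are assumptions; only the constant name `SENSITIVE_BODY_KEYS` comes from this document.

```typescript
// Hypothetical key list; the real one lives in scrubber.ts and
// SentryEventScrubber.php and must be kept identical in both.
const SENSITIVE_BODY_KEYS: ReadonlySet<string> = new Set([
  "password",
  "token",
  "email",
]);
const REDACTED = "[Filtered]"; // placeholder value is an assumption

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Recursively replaces the value of every sensitive key, at any depth.
function scrubBody(value: Json): Json {
  if (Array.isArray(value)) return value.map((item) => scrubBody(item));
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = SENSITIVE_BODY_KEYS.has(key.toLowerCase())
        ? REDACTED
        : scrubBody(inner);
    }
    return out;
  }
  return value; // primitives pass through untouched
}

console.log(JSON.stringify(scrubBody({ name: "shift-1", user: { email: "a@b.c" } })));
```

Under these assumptions a new rule is a one-line change to the set, plus one spec case that feeds the key in nested and in array position.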
### 10.3 A new `$exceptions->render(...)` handler
Per [BACKLOG OBS-7](./BACKLOG.md): new render handlers MUST NOT
short-circuit without `report($e)`. Laravel's flow is `report()` →
`render()` automatically; render handlers that return a Response have
already been through report.
If the new handler consumes a Throwable that would not go through
`Integration::handles()` (e.g. a custom exception with its own
`$exception->report()` method), add a case to
`ExceptionReportingTest` proving that the event is still captured.
### 10.4 A new environment (staging, demo, …)
See [BACKLOG OBS-9](./BACKLOG.md). Requires:
1. GlitchTip project provisioning + DSN stored in 1Password.
2. CSP whitelist update (`apps/app/index.html` for a dev-style env, or
a new nginx config for a prod-style env).
3. Extending `tests/Feature/Security/CspConnectsToObservabilityTest`
with an assertion for the new environment.
4. Adjusting `deploy.sh` if the release-tag format changes (default:
`crewli-app@<short-sha>`).
### 10.5 A new Form Builder exception class
See §5.4. When adding a new FormBuilder exception:
- If it is an **expected business outcome**: add it to
`ignore_exceptions` in `config/sentry.php`, unless the class is
already ignored via the `HttpException` or `ValidationException`
handling. Document it in §5.4.
- If it is a **programmer/infra error**: add nothing; the class
flows automatically via `Integration::handles($exceptions)`.
---
## §11 References
**Implementation:**
- [`api/app/Services/Observability/SentryEventScrubber.php`](../api/app/Services/Observability/SentryEventScrubber.php)
- [`api/app/Listeners/Observability/AuthScopeContextListener.php`](../api/app/Listeners/Observability/AuthScopeContextListener.php)
- [`api/app/Listeners/Observability/TagJobAttemptOnSentry.php`](../api/app/Listeners/Observability/TagJobAttemptOnSentry.php)
- [`api/app/Http/Middleware/BindSentryRouteContext.php`](../api/app/Http/Middleware/BindSentryRouteContext.php)
- [`api/app/Http/Middleware/HandleImpersonation.php`](../api/app/Http/Middleware/HandleImpersonation.php)
- [`api/config/sentry.php`](../api/config/sentry.php)
- [`api/bootstrap/app.php`](../api/bootstrap/app.php)
- [`apps/app/src/observability/sentry.ts`](../apps/app/src/observability/sentry.ts)
- [`apps/app/src/observability/scrubber.ts`](../apps/app/src/observability/scrubber.ts)
- [`apps/app/src/observability/contextBinding.ts`](../apps/app/src/observability/contextBinding.ts)
- [`apps/app/index.html`](../apps/app/index.html)
- [`deploy/nginx/csp-spa.conf`](../deploy/nginx/csp-spa.conf)
- [`deploy.sh`](../deploy.sh)
**Tests (regression guards):**
- `tests/Feature/Observability/PiiScrubbingTest.php`
- `tests/Feature/Observability/AuthScopeContextListenerTest.php`
- `tests/Feature/Observability/AuthScopeBindingHttpFlowTest.php`
- `tests/Feature/Observability/BindSentryRouteContextTest.php`
- `tests/Feature/Observability/ExceptionReportingTest.php`
- `tests/Feature/Observability/RequestIdRoundTripTest.php`
- `tests/Feature/Observability/EventListenerRegistrationTest.php`
- `tests/Feature/Database/ActivityLogIndexesTest.php`
- `tests/Feature/Security/CspHeaderTest.php`
- `tests/Feature/Security/CspConnectsToObservabilityTest.php`
- `apps/app/src/observability/__tests__/scrubber.spec.ts`
- `apps/app/src/observability/__tests__/contextBinding.spec.ts`
**Documents:**
- [`RFC-WS-7-OBSERVABILITY.md`](./RFC-WS-7-OBSERVABILITY.md) —
historical implementation spec
- [`GLITCHTIP.md`](./GLITCHTIP.md) — operational runbook
- [`runbooks/observability-triage.md`](./runbooks/observability-triage.md) —
incoming-issue triage procedure
- [`runbooks/observability-erasure.md`](./runbooks/observability-erasure.md) —
GDPR Art. 17 procedure
- [`SECURITY_AUDIT.md`](./SECURITY_AUDIT.md) — A13-9 (CSP) + WS-7
finale entry (processing register, security controls)
- [`BACKLOG.md`](./BACKLOG.md) — OBS-* entries (active + resolved)
- [`ARCH-FORM-BUILDER.md`](./ARCH-FORM-BUILDER.md) — Form Builder runtime
(consumer of §5.4 exception classification)
- [`ARCH-BINDINGS.md`](./ARCH-BINDINGS.md) — apply pipeline (origin of
the runtime exceptions captured in §5.4)
- [`RFC-WS-6.md`](./RFC-WS-6.md) — WS-6 binding pipeline design