WS-7 Observability — closure #8
WS-7 Observability is complete as implementation work. 4 PRs merged on feat/ws-7-observability:
- (… 5f6fc07)
- (… d4a450d)
Tests: 1551 backend + 252 frontend, all green.
Acceptance criteria 1-14 met:
Architectural patterns recorded:
Observability is live on monitoring.hausdesign.nl with two GlitchTip projects (crewli-api + crewli-app), 90-day retention, email alerting to support@hausdesign.nl, and a daily backup cron.
Refs:
Merge style: Merge commit (--no-ff equivalent in Gitea UI = "Create merge commit").
PR-2 follow-up. The PR-2 backend SDK install passed unit tests because they exercised the scrubber and the BindSentryContext scope writer in isolation, but live exceptions from controllers never reached GlitchTip: they were correctly logged to laravel.log, but the report() call had no Sentry-aware reporter to invoke.

Root cause: sentry-laravel 4.x does NOT auto-register an exception reporter. The host application is required to wire Integration::handles inside withExceptions in bootstrap/app.php (per the package README and Sentry docs). Without it, report() and Laravel's automatic report-before-render flow only hit the default log channel.

Fix: add Integration::handles at the top of withExceptions so sentry-laravel registers a reportable callback that calls captureUnhandledException for every reported throwable.

Filtering remains downstream:
- ignore_exceptions in config/sentry.php drops Validation, Authentication, Authorization (RFC §3.10).
- SentryEventScrubber::scrub returns null for sub-500 HttpException via the before_send hook (RFC §3.7).

Regression coverage: tests/Feature/Observability/ExceptionReportingTest installs a real Sentry client with a recording before_send and exercises the full request-to-capture pipeline through the auth and sentry.context middleware. Five cases: RuntimeException IS captured (with §3.6 tags attached), ValidationException is not, NotFoundHttpException 404 is not, AuthorizationException 403 is not, request-context tags ride along on the captured event.

Test count: 1532 to 1537. Larastan clean. Pint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

PR-2 live smoke test surfaced that super_admin platform-route exceptions arrived without organisation_id, and the original RFC §3.6 invariant (always-present organisation_id on authenticated events) would force misleading attribution if it tried to fill that gap.
Refined invariant: every authenticated event carries actor_scope (organisation/platform/user/anonymous), AND when actor_scope is organisation, organisation_id MUST be a valid ULID. Platform-mode correctly omits organisation_id rather than fabricate one.

Resolution chain in AuthScopeContextListener:
1. {organisation} or {event} URI parameter -> actor_scope=organisation
2. portal_event request attribute -> actor_scope=organisation
3. super_admin on admin.* named route -> actor_scope=platform (Crewli's platform-admin routes use the admin. name prefix)
4. Default authenticated -> actor_scope=user, no org tag (User<->Organisation is many-to-many; no reliable single-org hint)

Eight new test cases in AuthScopeContextListenerTest cover each branch and the conditional invariant, including ULID validity via Symfony\Component\Uid\Ulid::isValid.

Test count 1531 to 1539. Larastan clean. Pint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

PR-2 verified that Spatie's activitylog default migration creates the composite indexes RFC-WS-7 §3.14 / addendum D-06 require, via nullableMorphs('subject') and nullableMorphs('causer'), which emit indexes named `subject` on (subject_type, subject_id) and `causer` on (causer_type, causer_id).

This test queries information_schema.STATISTICS and fails if either composite is missing, regardless of the index name. It guards against silent regression when:
- A future Spatie major release changes nullableMorphs semantics.
- A developer rewrites the activity_log migration without preserving the morph indexes.
- A schema-dump regeneration drops them.

Test count 1539 to 1541. Larastan clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Live HTTP smoke test on the post-architectural-fixes branch surfaced that captured Sentry events carried only route-scope tags (app, route_name, http.method); auth-scope tags (user_id, actor_type, actor_scope) were absent on every request.
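
For reference, the refined invariant from the earlier commit (actor_scope always present; organisation_id a valid ULID when the scope is organisation; no organisation_id in platform or user mode) can be expressed as a small check. This is a minimal TypeScript sketch, not the project's PHP listener or test code: EventTags, isValidUlid and checkScopeInvariant are hypothetical names, the ULID pattern stands in for Symfony's Ulid::isValid, and rejecting organisation_id on anonymous events is an assumption not stated in the source.

```typescript
// Hypothetical tag bag of a captured event; the keys mirror the tag names used in the text.
interface EventTags {
  actor_scope?: string;
  organisation_id?: string;
}

const ACTOR_SCOPES: readonly string[] = ["organisation", "platform", "user", "anonymous"];

// ULID shape: 26 Crockford base32 characters (no I, L, O, U), first character 0-7.
const ULID_PATTERN = /^[0-7][0-9A-HJKMNP-TV-Z]{25}$/i;

function isValidUlid(value: string): boolean {
  return ULID_PATTERN.test(value);
}

// Returns the list of violations; an empty list means the invariant holds.
function checkScopeInvariant(tags: EventTags): string[] {
  const violations: string[] = [];

  if (tags.actor_scope === undefined || ACTOR_SCOPES.indexOf(tags.actor_scope) === -1) {
    violations.push("actor_scope missing or unknown");
    return violations;
  }

  if (tags.actor_scope === "organisation") {
    if (tags.organisation_id === undefined || !isValidUlid(tags.organisation_id)) {
      violations.push("organisation scope requires a valid ULID organisation_id");
    }
  } else if (tags.organisation_id !== undefined) {
    // Platform and user mode omit the tag per the text; anonymous is assumed to do the same.
    violations.push(tags.actor_scope + " scope must not carry organisation_id");
  }

  return violations;
}
```

A platform-mode event without organisation_id passes this check, while the same event with a fabricated organisation_id is flagged, which is the attribution problem the refined invariant is meant to avoid.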
Root cause: Sanctum's Guard fires Laravel\Sanctum\Events\TokenAuthenticated (vendor/laravel/sanctum/src/Guard.php:77) on bearer-token resolution, NOT Illuminate\Auth\Events\Authenticated. The Authenticated event only fires from SessionGuard (vendor/laravel/framework/src/Illuminate/Auth/SessionGuard.php:833), which Crewli does not use: CookieBearerToken middleware injects the httpOnly cookie as Authorization: Bearer, then auth:sanctum invokes Sanctum's Guard. So the listener never ran on Crewli's HTTP path.

Offline tests in AuthScopeContextListenerTest passed because they dispatch event(new Authenticated(...)) directly, bypassing the Guard layer. Sanctum::actingAs() in tests has the same blind spot: it short-circuits the Guard via guard('sanctum')->setUser() and fires neither event.

Fix:
- New handleTokenAuthenticated(TokenAuthenticated $event) method on AuthScopeContextListener extracts the user via $event->token->tokenable and delegates to a private bindForUser() shared with handle().
- AppServiceProvider registers the listener for both Authenticated (covers SessionGuard / login flow / future authenticators) and TokenAuthenticated (covers Crewli's bearer-token Sanctum flow).

Regression coverage: AuthScopeBindingHttpFlowTest exercises the real Sanctum Guard via $user->createToken() + Authorization: Bearer header. Three cases:
- super_admin on a user-scope route: actor_scope=user, all auth tags present.
- super_admin on an admin.* route: actor_scope=platform, no organisation_id (correct platform-mode behaviour).
- org_admin on a route with {organisation} param: actor_scope=organisation, organisation_id valid ULID.

Test count 1541 to 1544. Larastan clean. Pint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three regression tests that empirically prevent the class of errors found in the PR-2 follow-up:
1. test_authenticated_listener_registered_exactly_once
2. test_token_authenticated_listener_registered_exactly_once
3. test_job_processing_tag_listener_registered_exactly_once
   Tests 1-3 catch the OBS-8 pattern (auto-discovery + explicit listen together) as well as registrations accidentally removed by future refactors. Each walks Event::getRawListeners() and fails when count != 1 with a clear message ("auto-discovery re-enabled? OR explicit Event::listen missing?"). Empirically verified: both duplicate and missing registrations are caught.
4. test_impersonation_active_tag_invariant_on_captured_events: RFC §3.6 binary signal invariant on a real HTTP request flow. Catches regressions where the baseline tag binding disappears.

BACKLOG.md OBS-8 entry added and marked as Resolved, with a reference to the three commits of this session plus the architectural pattern (explicit > implicit for observability-critical bindings).

Test count 1545 to 1549. Larastan + Pint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

WS-7 PR-3 commit 2.
- scrubber.spec.ts (18 cases): mirrors backend PiiScrubbingTest semantics. Body/header/query scrubbing, form_values wholesale replacement, all SENSITIVE_BODY_KEYS at top + nested levels, max_depth guard, cookies + storage + user.cookies sanitisation.
- contextBinding.spec.ts (11 cases): exercises the Vue Router beforeEach guard against a real router with mocked Sentry scope (capturing every setTag/setUser call into a per-test buffer).
Cases:
- portal-token zone: actor_scope=portal, no user_id
- platform route + super_admin: actor_scope=platform
- platform route without super_admin: does NOT tag platform
- organizer route with active org: actor_scope=organisation + organisation_id
- organizer route without active org: actor_scope=user, no org tag
- unauthenticated public: actor_scope=anonymous
- actor_type role hierarchy
- RFC §3.8 ULID-only user identity (no email leakage)
- route_name + app=app baseline tags
- cross-zone leak guard: navigating from organizer to portal-token calls scope.clear() and does not bind user

Frontend test count 223 to 252. Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Acceptance criteria 1-14 met; observability fully operational on monitoring.hausdesign.nl. Implementation criteria 3, 4, 5, 6, 8, 11, 12, 13, 14 via 4 PRs on feat/ws-7-observability; operational criteria 1, 2, 7, 9, 10 via deploy checklist.

Rename the 'Observability follow-ups (post WS-7)' section header to '(post WS-7 closure)' for accuracy after PR-3 + PR-4. Closure entry placed at the bottom of 'Opgeloste items (mei 2026)' to respect chronological order (oldest-first): WS-7 on 2026-05-07 follows WS-3 PR-C on 2026-05-06, which follows WS-TOOLING-001 on 2026-05-05.

Refs: dev-docs/ARCH-OBSERVABILITY.md, dev-docs/runbooks/observability-{triage,erasure}.md
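
The frontend context-binding cases listed for contextBinding.spec.ts can be illustrated with a self-contained sketch. It is not the project's guard: ScopeLike is a local stand-in for the three Sentry scope methods the text mentions (setTag, setUser, clear), and Zone, RouteInfo, AuthState, bindContext, RecordingScope and the route names are hypothetical. Three behaviours are assumptions: the scope is cleared on every navigation (the source only states it for the organizer to portal-token transition), a platform route visited without super_admin falls back to actor_scope=user, and identity is bound through setUser with an id only.

```typescript
// Local stand-in for the Sentry scope methods used by the guard.
interface ScopeLike {
  setTag(key: string, value: string): void;
  setUser(user: { id: string } | null): void;
  clear(): void;
}

// Hypothetical shapes; a real app would derive these from route meta and its auth store.
type Zone = "portal-token" | "platform" | "organizer" | "public";
interface RouteInfo { name: string; zone: Zone; }
interface AuthState { userId?: string; roles: string[]; activeOrganisationId?: string; }

function bindContext(route: RouteInfo, auth: AuthState, scope: ScopeLike): void {
  // Assumption: start clean on every navigation so tags cannot leak across zones.
  scope.clear();
  scope.setTag("app", "app");
  scope.setTag("route_name", route.name);

  if (route.zone === "portal-token") {
    // Portal-token zone never binds a user, even if one is logged in.
    scope.setTag("actor_scope", "portal");
    return;
  }
  if (auth.userId === undefined) {
    scope.setTag("actor_scope", "anonymous");
    return;
  }

  // ULID-only identity: no email or name is attached.
  scope.setUser({ id: auth.userId });

  if (route.zone === "platform" && auth.roles.indexOf("super_admin") !== -1) {
    scope.setTag("actor_scope", "platform");
  } else if (route.zone === "organizer" && auth.activeOrganisationId !== undefined) {
    scope.setTag("actor_scope", "organisation");
    scope.setTag("organisation_id", auth.activeOrganisationId);
  } else {
    // Covers organizer routes without an active org and (assumed) non-super_admin platform visits.
    scope.setTag("actor_scope", "user");
  }
}

// Recording scope, similar in spirit to the per-test buffer described in the text.
class RecordingScope implements ScopeLike {
  tags: Record<string, string> = {};
  user: { id: string } | null = null;
  clears = 0;
  setTag(key: string, value: string): void { this.tags[key] = value; }
  setUser(user: { id: string } | null): void { this.user = user; }
  clear(): void { this.tags = {}; this.user = null; this.clears += 1; }
}

// Usage: organizer navigation followed by a portal-token navigation.
const demoScope = new RecordingScope();
const demoAuth: AuthState = {
  userId: "01ARZ3NDEKTSV4RRFFQ69G5FAV",
  roles: ["org_admin"],
  activeOrganisationId: "01BX5ZZKBKACTAV9WEVGEMMVRZ",
};
bindContext({ name: "organizer.dashboard", zone: "organizer" }, demoAuth, demoScope);
bindContext({ name: "portal.shift", zone: "portal-token" }, demoAuth, demoScope);
console.log(demoScope.tags, demoScope.user);
```

After the second call the recorded scope holds only the baseline tags plus actor_scope=portal, and no user or organisation_id survives from the organizer navigation.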