# ARCH-OBSERVABILITY > Crewli's observability architecture — logging, monitoring, alerting, > metrics. > > Status: SKELETON. Section §3 (`$dontReport`) is concrete; other > sections are structured placeholders for WS-7 sessie 1 decisions. ## Document history - 2026-04-28 — v0.1 — Initial skeleton (WS-6 sessie 3b). Only §3 concrete; remainder placeholdered for WS-7. ## §1 — Logging strategy [WS-7: define log levels with explicit criteria. Example questions to answer in WS-7 sessie 1: - When does code use `Log::error` vs `Log::warning` vs `Log::info`? - Are unhandled exceptions automatically `error`? - Is `Log::debug` allowed in production, or stripped in deploy? - How do structured payload conventions tie to log keys (see §4)? ] ## §2 — Sentry decisions [WS-7: Sentry SDK install + configuration decisions. Skeleton: - Which environments report to Sentry? (dev / staging / production) - Sample rate per environment? - Source map upload to Sentry CI? - User context injection (auth user ID + organisation ID, opt-in redaction for PII)? - Breadcrumbs strategy (which events generate breadcrumbs)? - Release tagging convention (commit SHA? semver? both?)? ] ## §3 — `$dontReport` exceptions (concrete) The following exception classes are **expected business outcomes**, not bugs. They are caught and handled in the application; reporting them to Sentry would generate noise that drowns the signal. When the Sentry SDK lands (WS-7), add the following classes to Laravel's `app/Exceptions/Handler.php` `$dontReport` array: | Class | Reason | |---|---| | `\App\Exceptions\FormBuilder\PublishGuardViolationException` | Publish-time validation: schema fails a guard. Returned as 422 with field-level errors. Not a system bug. | | `\App\Exceptions\FormBuilder\PurposeRequirementsNotMetException` | Schema lacks required bindings for its purpose. Returned as 422. Not a system bug. | | `\App\Exceptions\FormBuilder\IdempotencyConflictException` | Duplicate idempotency key on submission. Returned as 409. Not a system bug. | **Out of scope for `$dontReport` (these DO go to Sentry):** - `App\Exceptions\FormBuilder\PersonProvisioningException` — runtime failure during the apply pipeline. Caught by `ApplyBindingsOnFormSubmit` and recorded as `FormSubmissionActionFailure`, but the engineering team needs visibility into recurring patterns across orgs. - `App\Exceptions\FormBuilder\PurposeSubjectResolutionException` — runtime resolution failure (no portal token, no auth user, etc.). Same dual-handling rationale: action-failures table for org-admin operational handling; Sentry for engineering visibility. - `App\Exceptions\FormBuilder\FormBindingApplicatorException` — runtime applicator failure (no_transaction, no_schema, unknown_purpose). These should never happen in production; if they do, they're systemic bugs — Sentry is the correct destination. The dual recording (Sentry + `form_submission_action_failures` table) is intentional: org admins fix specific failures via the WS-6 admin UI; engineering identifies systemic issues across all orgs via Sentry's aggregation. ## §4 — Structured logging conventions [WS-7: log key naming convention. Skeleton: - Hierarchical dot-separated namespace tree - Existing examples to align with: - `form-builder.apply.transaction_rolled_back` - `form-builder.identity-match.no_person_subject_post_apply` - `form-webhook.delivery.exception` Define the tree formally so future code discovers the right namespace deterministically.] ## §5 — Metrics [WS-7: which counters / histograms / gauges? Namespace? Statsd / Prometheus / OTel flavour? At minimum, candidate metrics: - `form_submissions_total` (counter, tagged by purpose) - `form_submission_apply_status` (counter, tagged by status) - `form_failures_open` (gauge per org) - `retry_attempts_total` (counter, tagged by outcome) - `apply_pipeline_duration_seconds` (histogram) ] ## §6 — Alerting rules [WS-7: which thresholds trigger alerts? Where (Slack? PagerDuty? Email?). At minimum, candidate alerts: - "Open failures > X for > Y hours" - "Apply pipeline error rate > X% in 1h window" - "no_transaction guard fired" (immediate alert; should never happen in production) - "Webhook dead-letter rate > X%" ] ## §7 — Dashboards [WS-7: Grafana / Cloudwatch / similar. Panel layout, widget types, default time ranges. Skeleton later.] ## Related docs - `RFC-WS-6.md` — WS-6 binding pipeline design (the failures observed and recorded by §3's classes originate here) - `ARCH-BINDINGS.md` — apply pipeline architecture - `ARCH-FORM-BUILDER.md` — form-builder runtime including webhooks