feat: complete person identity matching system with fuzzy detection, revert, and manual link

Implements the full identity matching engine: email matching (HIGH confidence),
fuzzy name matching with Levenshtein distance (MEDIUM confidence, upgradable to
HIGH with DOB tiebreaker), manual link/unlink, revert confirmed matches, and
automatic detection via PersonObserver. Includes 33 comprehensive tests, frontend
integration with confirm/dismiss/unlink UI, and match indicators in the persons list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 08:44:24 +02:00
parent 7932e53daf
commit eb1a0ac666
30 changed files with 1941 additions and 399 deletions

@@ -348,17 +348,52 @@ Validates:
## Identity Matches
### Endpoints
- `GET /organisations/{org}/identity-matches` — list pending matches for the organisation (paginated, 25 per page)
- `GET /organisations/{org}/persons/{person}/identity-match` — show pending match for a specific person
- `POST /organisations/{org}/identity-matches/{match}/confirm` — confirm a match (links `person.user_id`)
- `POST /organisations/{org}/identity-matches/{match}/dismiss` — dismiss a match (hidden, person stays unlinked)
- `POST /organisations/{org}/identity-matches/{match}/confirm` — confirm a match (links `person.user_id`, dismisses other pending matches, syncs tags)
- `POST /organisations/{org}/identity-matches/{match}/dismiss` — dismiss a match (hidden, person stays unlinked, not re-suggested)
- `POST /organisations/{org}/identity-matches/{match}/revert` — revert a confirmed match (unlinks `person.user_id`, status → `reverted`)
- `POST /organisations/{org}/identity-matches/bulk-confirm` — bulk confirm multiple matches
- `POST /organisations/{org}/events/{event}/persons/{person}/manual-link` — manually link a person to a user account (body: `{ "user_id": "ulid" }`)
- `POST /organisations/{org}/events/{event}/persons/{person}/unlink` — unlink a person from their user account
### Match Types (`IdentityMatchMethod`)
| Value | Description | Confidence |
| ------------ | ------------------------------------ | ---------- |
| `email` | Exact email match within org | `high` |
| `name_fuzzy` | Levenshtein fuzzy name match | `medium` (or `high` if DOB also matches) |
| `manual` | Organiser-initiated manual link | `high` |
### Match Confidence (`IdentityMatchConfidence`)
| Value | Description |
| -------- | -------------------------------------------------------- |
| `high` | High certainty — exact email, or fuzzy name + DOB match |
| `medium` | Moderate certainty — fuzzy name match without DOB |
### Match Status (`IdentityMatchStatus`)
| Value | Description |
| ----------- | ------------------------------------------------- |
| `pending` | Awaiting organiser review |
| `confirmed` | Organiser confirmed — `person.user_id` is linked |
| `dismissed` | Organiser dismissed — not re-suggested |
| `reverted` | Previously confirmed, then unlinked |
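The status values above imply a small state machine. The following is a minimal Python sketch, assuming the only legal moves are the ones named in this document (pending → confirmed, pending → dismissed, confirmed → reverted) and that `dismissed` and `reverted` are terminal. The `ALLOWED` table and `transition` helper are illustrative names, not part of the codebase.

```python
from enum import Enum


class IdentityMatchStatus(str, Enum):
    PENDING = "pending"
    CONFIRMED = "confirmed"
    DISMISSED = "dismissed"
    REVERTED = "reverted"


# Assumed transition table: only the moves described in this section.
# Treating dismissed/reverted as terminal is an assumption of this sketch.
ALLOWED = {
    IdentityMatchStatus.PENDING: {
        IdentityMatchStatus.CONFIRMED,
        IdentityMatchStatus.DISMISSED,
    },
    IdentityMatchStatus.CONFIRMED: {IdentityMatchStatus.REVERTED},
    IdentityMatchStatus.DISMISSED: set(),
    IdentityMatchStatus.REVERTED: set(),
}


def transition(current, target):
    """Return the new status, or raise ValueError if the move is not in ALLOWED."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Under these assumptions, reverting a match that was never confirmed raises `ValueError` instead of silently changing state.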
### Detection
Matches are created automatically:
- When a person is created (via `POST /organisations/{org}/events/{event}/persons`) with an email matching an existing user → pending match created
- When a new user account is created (invitation acceptance) with an email matching unlinked persons → pending matches created
Matches are detected automatically via `PersonObserver`:
- **On Person create**: if person has no `user_id` and has an email or name, `PersonIdentityService::detectMatches()` runs
- **On Person email update**: if person's email changed and person is unlinked, detection re-runs
- **On user creation**: `PersonIdentityService::detectMatchesForUser()` finds all unlinked persons with matching email
Detection strategies (in priority order):
1. **Exact email** within same organisation → `email` / `high`
2. **Fuzzy name** (Levenshtein distance ≤2 for short names, ≤3 for longer) → `name_fuzzy` / `medium`
3. **Fuzzy name + DOB match** → upgrades to `high` confidence
No silent auto-linking. Every identity link requires explicit confirmation.
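The priority order above can be sketched in Python as follows. This is an illustration under stated assumptions, not the `PersonIdentityService` implementation: the cutoff between "short" and "longer" names (`SHORT_NAME_MAX`) is not given in this document and is a hypothetical value, the `email`, `name`, and `date_of_birth` dictionary keys are illustrative, emails and names are compared after trimming and lowercasing, and organisation scoping is left out.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic row-by-row dynamic programme."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


SHORT_NAME_MAX = 8  # hypothetical cutoff; the real threshold is not documented here


def detect(person: dict, user: dict):
    """Return (matched_on, confidence), or None when nothing matches."""
    # 1. Exact email match -> email / high
    p_email = (person.get("email") or "").strip().lower()
    u_email = (user.get("email") or "").strip().lower()
    if p_email and p_email == u_email:
        return ("email", "high")

    # 2. Fuzzy name match -> name_fuzzy / medium
    p_name = (person.get("name") or "").strip().lower()
    u_name = (user.get("name") or "").strip().lower()
    if not p_name or not u_name:
        return None
    limit = 2 if max(len(p_name), len(u_name)) <= SHORT_NAME_MAX else 3
    if levenshtein(p_name, u_name) > limit:
        return None

    # 3. Matching date of birth upgrades the fuzzy match to high
    dob = person.get("date_of_birth")
    if dob and dob == user.get("date_of_birth"):
        return ("name_fuzzy", "high")
    return ("name_fuzzy", "medium")
```

The result is only a suggestion: in line with the rule above, it would be stored as a pending match for an organiser to confirm or dismiss, never applied directly.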
@@ -368,19 +403,24 @@ No silent auto-linking. Every identity link requires explicit confirmation.
Body: `{ "match_ids": ["ulid1", "ulid2", ...] }` (max 100)
Response: `{ "confirmed": 2, "errors": [{ "match_id": "ulid3", "error": "User already has a person record in this event." }] }`
Response: `{ "confirmed": 2, "errors": [{ "match_id": "ulid3", "error": "User already has a person record with this crowd type in this event." }] }`
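Because the endpoint accepts at most 100 IDs per call and reports per-match errors alongside the confirmed count, a client will typically batch its requests and collect the failures. Here is a minimal Python sketch of those two steps. It performs no HTTP itself, and the helper names `build_bodies` and `failed_matches` are illustrative, not part of the API.

```python
import json

MAX_BATCH = 100  # documented limit for match_ids per request


def build_bodies(match_ids):
    """Split match IDs into JSON request bodies of at most MAX_BATCH IDs each."""
    return [
        json.dumps({"match_ids": match_ids[start:start + MAX_BATCH]})
        for start in range(0, len(match_ids), MAX_BATCH)
    ]


def failed_matches(response: dict) -> dict:
    """Map match_id -> error message for every entry in the response's errors list."""
    return {entry["match_id"]: entry["error"] for entry in response.get("errors", [])}
```

A caller would POST each body to the `bulk-confirm` endpoint and pass the decoded response to `failed_matches` to decide which matches still need manual review.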
### PersonResource enrichment
`GET /organisations/{org}/events/{event}/persons` includes `pending_identity_match` inline when a pending match exists:
`GET /organisations/{org}/events/{event}/persons` now includes:
```json
{
"has_user_account": true,
"user_account": { "id": "ulid", "email": "jan@example.nl", "full_name": "Jan de Vries" },
"pending_identity_match": {
"match_id": "ulid",
"matched_user": { "id": "ulid", "first_name": "Jan", "last_name": "", "full_name": "Jan", "email": "jan@example.nl" },
"matched_user": { "id": "ulid", "first_name": "Jan", "last_name": "de Vries", "full_name": "Jan de Vries", "email": "jan@example.nl", "date_of_birth": "1990-01-01" },
"matched_on": "email",
"confidence": "exact"
"matched_on_label": "E-mail match",
"confidence": "high",
"confidence_label": "Hoge zekerheid",
"match_details": { "matched_fields": ["email"], "..." : "..." }
}
}
```

@@ -769,32 +769,47 @@ $effectiveDate = $shift->end_date ?? $shift->timeSlot->date;
## 3.5.5c Person Identity Matching
> **v1.8:** Enterprise-grade identity resolution with three steps: detect → suggest → confirm.
> No silent auto-linking. When a person is created with an email matching an existing user,
> or when a new user account is created with an email matching unlinked persons, the system
> creates pending match records for organisers to review.
> **v1.8+:** Enterprise-grade identity resolution with three steps: detect → suggest → confirm.
> No silent auto-linking. Supports email matching (HIGH confidence), fuzzy name matching
> (MEDIUM confidence, upgradable to HIGH with DOB match), manual linking, and revert/unlink.
> PersonObserver triggers detection automatically on Person create/update.
### `person_identity_matches`
| Column | Type | Notes |
| ---------------------- | ------------------ | ---------------------------------------------------------------------------- |
| `id` | ULID | PK — `HasUlids` trait. Entity with its own lifecycle, not a pure pivot |
| `person_id` | ULID FK | → persons. `constrained()->cascadeOnDelete()` |
| `matched_user_id` | ULID FK | → users. Named `matched_user_id` (not `user_id`) to avoid confusion with `persons.user_id`. `constrained()->cascadeOnDelete()` |
| `matched_on` | string | Enum: `email\|phone\|manual` (`IdentityMatchMethod`) |
| `confidence` | string | Enum: `exact\|fuzzy` (`IdentityMatchConfidence`). `exact` = deterministic match, `fuzzy` = algorithmic |
| `status` | string | Enum: `pending\|confirmed\|dismissed` (`IdentityMatchStatus`), default `pending` |
| `resolved_by_user_id` | ULID FK nullable | → users (who confirmed or dismissed). `constrained()->nullOnDelete()` |
| `resolved_at` | timestamp nullable | When the match was confirmed or dismissed |
| `created_at` | timestamp | |
| Column | Type | Notes |
| ----------------------- | ------------------ | ---------------------------------------------------------------------------- |
| `id` | ULID | PK — `HasUlids` trait. Entity with its own lifecycle, not a pure pivot |
| `person_id` | ULID FK | → persons. `constrained()->cascadeOnDelete()` |
| `matched_user_id` | ULID FK | → users. Named `matched_user_id` (not `user_id`) to avoid confusion with `persons.user_id`. `constrained()->cascadeOnDelete()` |
| `matched_on` | string | Enum: `email\|name_fuzzy\|manual` (`IdentityMatchMethod`) |
| `confidence` | string | Enum: `high\|medium` (`IdentityMatchConfidence`). `high` = exact email or fuzzy+DOB, `medium` = fuzzy name only |
| `status` | string | Enum: `pending\|confirmed\|dismissed\|reverted` (`IdentityMatchStatus`), default `pending` |
| `match_details` | JSON nullable | Snapshot of matched fields, emails, names, DOB at detection time |
| `confirmed_by_user_id` | ULID FK nullable | → users (who confirmed). `constrained()->nullOnDelete()` |
| `confirmed_at` | timestamp nullable | When the match was confirmed |
| `dismissed_by_user_id` | ULID FK nullable | → users (who dismissed). `constrained()->nullOnDelete()` |
| `dismissed_at` | timestamp nullable | When the match was dismissed |
| `reverted_by_user_id` | ULID FK nullable | → users (who reverted/unlinked). `constrained()->nullOnDelete()` |
| `reverted_at` | timestamp nullable | When a confirmed match was reverted |
| `resolved_by_user_id` | ULID FK nullable | → users (legacy, set on confirm/dismiss). `constrained()->nullOnDelete()` |
| `resolved_at` | timestamp nullable | When the match was resolved (legacy) |
| `created_at` | timestamp | |
**Design notes:**
- No `updated_at`: status transitions are captured by `resolved_at`. Model sets `const UPDATED_AT = null;`.
- Single `resolved_by`/`resolved_at` pair: status enum is exclusive (pending → confirmed OR pending → dismissed). Spatie activity log records the full audit trail.
- No `updated_at`: status transitions tracked via specific `*_at` columns. Model sets `const UPDATED_AT = null;`.
- Specific `confirmed_by`/`dismissed_by`/`reverted_by` columns track each action separately, enabling a match lifecycle of: pending → confirmed → reverted.
- `resolved_by`/`resolved_at` retained for backward compatibility (set on confirm/dismiss).
- Detection strategies: (1) Exact email within org → HIGH, (2) Fuzzy name (Levenshtein ≤2/3) → MEDIUM, (3) Fuzzy name + DOB match → HIGH.
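The audit-column notes above can be summarised as a mapping from action to the columns it writes. The Python sketch below assumes that each action sets its own `*_by_user_id`/`*_at` pair and that only confirm and dismiss also set the legacy `resolved_*` pair, as stated in the notes. Whether a revert clears any earlier columns is not documented here, so the sketch leaves them untouched. `audit_columns` is an illustrative helper, not a function from the codebase.

```python
from datetime import datetime, timezone

# Action name -> column prefix, following the table above.
AUDIT_PREFIX = {"confirm": "confirmed", "dismiss": "dismissed", "revert": "reverted"}
# Actions that also populate the legacy resolved_* pair.
LEGACY_ACTIONS = {"confirm", "dismiss"}


def audit_columns(action, actor_user_id, now=None):
    """Return the column values an action would write on person_identity_matches."""
    if action not in AUDIT_PREFIX:
        raise ValueError(f"unknown action: {action}")
    now = now or datetime.now(timezone.utc)
    prefix = AUDIT_PREFIX[action]
    columns = {f"{prefix}_by_user_id": actor_user_id, f"{prefix}_at": now}
    if action in LEGACY_ACTIONS:
        columns["resolved_by_user_id"] = actor_user_id
        columns["resolved_at"] = now
    return columns
```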
**Unique constraint:** `UNIQUE(person_id, matched_user_id)` — prevent duplicate match records
**Indexes:** `(person_id, status)`, `(matched_user_id, status)`, `(status)`
**Foreign keys:** `person_id` → persons (cascade delete), `matched_user_id` → users (cascade delete), `resolved_by_user_id` → users (null on delete)
**Foreign keys:** `person_id` → persons (cascade delete), `matched_user_id` → users (cascade delete), all `*_user_id` → users (null on delete)
### `users.date_of_birth`
| Column | Type | Notes |
| --------------- | ------------- | ---------------------------------------- |
| `date_of_birth` | date nullable | Added after `last_name`. Used as DOB tiebreaker for fuzzy name matching |
---