fix(form-builder): canonicalize JSON for byte-stable storage (WS-6)

MySQL 8.0 JSON columns may reorder associative-array keys on
round-trip. For audit-immutable values (schema snapshots, webhook
payloads, activity log diffs) this breaks byte-stability: re-emitting
the same logical content can produce a different byte sequence.
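
A minimal illustration of the problem, using made-up field data rather
than anything from the codebase: two arrays that differ only in key
order compare equal loosely, yet encode to different bytes and hashes.

```php
<?php
declare(strict_types=1);

// Same logical content in two key orders (illustrative data only).
$written  = ['slug' => 'contact_email', 'label' => 'E-mail'];
$readBack = ['label' => 'E-mail', 'slug' => 'contact_email'];

// Loose comparison ignores key order, so the arrays count as equal...
$logicallyEqual = ($written == $readBack);

// ...but the encoded bytes, and anything derived from them, differ.
$bytesEqual  = (json_encode($written) === json_encode($readBack));
$hashesEqual = (hash('sha256', json_encode($written))
    === hash('sha256', json_encode($readBack)));

var_dump($logicallyEqual, $bytesEqual, $hashesEqual);
// bool(true), bool(false), bool(false)
```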

Introduced JsonCanonicalizer (recursive ksort on associative arrays;
numeric-indexed lists preserve order) and applied it at every writer
site that must produce byte-stable JSON:

- FormSubmissionService: canonicalize the schema_snapshot array
  before storage (audit-immutable per ARCH §4.3, RFC-WS-6 v1.1).
- FormField::logFieldChange / FormSchema::logSchemaChange: canonicalize
  activity-log properties before withProperties() so old/new diffs
  read back byte-stable.
- BindingActivityLogger: canonicalize both the pass-level and
  per-binding activity properties.
- FormWebhookDispatcher: canonicalize payload_snapshot before
  storage (delivery-time HMAC re-encodes the same canonical bytes).
- DeliverFormWebhookJob: switched json_encode to
  JsonCanonicalizer::encode for the HMAC-signed body, so the
  signature is byte-stable across re-deliveries and reproducible by
  receivers from the same logical payload.
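
A rough sketch of the idea behind the writer sites listed above. The
class CanonicalJsonSketch is a hypothetical stand-in for
App\Support\Json\JsonCanonicalizer, whose real implementation is not
reproduced here. The encoding flags, the SHA-256 choice for the HMAC,
the payload, and the secret are all assumptions made for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for JsonCanonicalizer; not the real class.
 * The json_encode flags below are an assumption.
 */
final class CanonicalJsonSketch
{
    /** Recursively sort associative-array keys; leave list order alone. */
    public static function canonicalize(mixed $value): mixed
    {
        if (!is_array($value)) {
            return $value;
        }

        foreach ($value as $key => $item) {
            $value[$key] = self::canonicalize($item);
        }

        // Only associative arrays are sorted; lists keep their order.
        if (!array_is_list($value)) {
            ksort($value, SORT_STRING);
        }

        return $value;
    }

    public static function encode(mixed $value): string
    {
        return json_encode(
            self::canonicalize($value),
            JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE
        );
    }
}

// Illustrative payload and secret; neither comes from the codebase.
$secret     = 'example-secret';
$asWritten  = ['event' => 'submitted', 'data' => ['b' => [2, 1], 'a' => 'x']];
$asReadBack = ['data' => ['a' => 'x', 'b' => [2, 1]], 'event' => 'submitted'];

$bodyA = CanonicalJsonSketch::encode($asWritten);
$bodyB = CanonicalJsonSketch::encode($asReadBack);

// Signing the canonical bytes gives the same signature for both orders.
$signatureA = hash_hmac('sha256', $bodyA, $secret);
$signatureB = hash_hmac('sha256', $bodyB, $secret);

echo $bodyA, PHP_EOL;
// {"data":{"a":"x","b":[2,1]},"event":"submitted"}
var_dump($bodyA === $bodyB, hash_equals($signatureA, $signatureB));
// bool(true), bool(true)
```

The list [2, 1] keeps its order because list order is data; only
object-like key order is normalized. array_is_list requires PHP 8.1.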

Sites NOT canonicalized (deliberate):
- form_schemas.settings — opaque UI config; key order has no
  semantic meaning, no byte-stability requirement.
- form_schemas.translations / form_fields.translations — read by
  display layer; key order doesn't matter.
- form_templates.schema_snapshot — user-supplied input via store/
  update; user is the source of truth, not audit-immutable in the
  same way as form_submissions.schema_snapshot.

Reverted the 7 assertEquals workarounds from session 2.6:
- ConditionalLogicActivityLogPayloadTest
- ConditionalLogicBackfillTest::test_rollback_reconstructs_canonical_json
- FormFieldBindingMigrationTest::test_rollback_reconstructs_json_and_drops_table
- FormFieldOptionServiceAndScopeTest::test_replace_options_emits_activity_log_on_field_only
- FormFieldOptionsActivityLogTest::test_field_updated_payload_contains_options_diff_when_options_change
- FormFieldOptionsBackfillTest::test_forward_migration_backfills_rows_strips_translations_and_rewrites_snapshot
- FormFieldOptionsSnapshotAndStrictRequestTest::test_submission_snapshot_embeds_rich_shape_options

Each now uses assertSame on JsonCanonicalizer::encode of both sides,
a byte-exact comparison that stays meaningful regardless of MySQL
JSON storage behavior.
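
To show what the stricter comparison buys, here is a small sketch in
plain PHP (no PHPUnit). canonicalJson is a hypothetical helper standing
in for JsonCanonicalizer::encode, and the diff data is invented. Loose
equality, roughly what assertEquals relies on for arrays, tolerates both
key reordering and type drift; comparing canonical JSON strings
tolerates only the reordering.

```php
<?php
declare(strict_types=1);

// Hypothetical helper standing in for JsonCanonicalizer::encode.
function canonicalJson(mixed $value): string
{
    $sort = static function (mixed $v) use (&$sort): mixed {
        if (!is_array($v)) {
            return $v;
        }
        $v = array_map($sort, $v); // keys are preserved for a single array
        if (!array_is_list($v)) {
            ksort($v, SORT_STRING);
        }
        return $v;
    };

    return json_encode($sort($value), JSON_THROW_ON_ERROR);
}

// Invented activity-log diff, as written and as read back reordered.
$expected  = ['old' => ['min' => 18, 'label' => 'Age'], 'new' => ['label' => 'Age', 'min' => 21]];
$reordered = ['new' => ['min' => 21, 'label' => 'Age'], 'old' => ['label' => 'Age', 'min' => 18]];
// Same shape, but one integer came back as a numeric string.
$typeDrift = ['old' => ['min' => '18', 'label' => 'Age'], 'new' => ['label' => 'Age', 'min' => 21]];

var_dump($expected == $reordered);   // bool(true): loose, order-insensitive
var_dump($expected === $reordered);  // bool(false): identity is order-sensitive
var_dump(canonicalJson($expected) === canonicalJson($reordered)); // bool(true)

var_dump($expected == $typeDrift);   // bool(true): loose equality hides the drift
var_dump(canonicalJson($expected) === canonicalJson($typeDrift)); // bool(false)
```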

New regression test SchemaSnapshotByteStableAcrossReemitsTest
exercises the contract end-to-end: it submits a complex schema
(bindings, validation rules, options, conditional logic), reads
schema_snapshot back via three paths (Eloquent cast, fresh model,
raw bytes), and asserts the canonical encode is identical.

ARCH-FORM-BUILDER.md §4.6.1 gets a "Byte-stability" sub-section
explaining what's canonicalized and why.

Test count: 1388 → 1400 (+11 JsonCanonicalizer unit, +1 snapshot
regression). Larastan clean. Rector dry-run unchanged at 355.

Refs: WS-6 session 2.6 deviation #4 cleanup, RFC-WS-6 v1.1

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 13:51:38 +02:00
parent 0afbd36bf7
commit a791a276fa
17 changed files with 488 additions and 82 deletions

@@ -0,0 +1,126 @@
<?php

declare(strict_types=1);

namespace Tests\Feature\FormBuilder\Schema;

use App\Enums\FormBuilder\FormFieldType;
use App\Enums\FormBuilder\FormFieldValidationRuleType;
use App\Enums\FormBuilder\FormPurpose;
use App\Models\FormBuilder\FormField;
use App\Models\FormBuilder\FormSchema;
use App\Models\FormBuilder\FormSubmission;
use App\Models\Organisation;
use App\Services\FormBuilder\FormSubmissionService;
use App\Support\Json\JsonCanonicalizer;
use Illuminate\Foundation\Testing\RefreshDatabase;
use Illuminate\Support\Facades\DB;
use Tests\TestCase;

/**
 * Regression test for MySQL JSON key-order non-determinism.
 *
 * Session 2.6 surfaced that MySQL JSON columns may reorder
 * associative-array keys on round-trip. Session 2.7 introduced
 * JsonCanonicalizer to stabilize the writers. This test asserts
 * the contract end-to-end: a submission's schema_snapshot serialized
 * to canonical JSON must match the canonical serialization of the
 * raw bytes the DB returns on a separate read.
 */
final class SchemaSnapshotByteStableAcrossReemitsTest extends TestCase
{
    use RefreshDatabase;

    public function test_snapshot_bytes_are_stable_across_reads(): void
    {
        $org = Organisation::factory()->create();
        $schema = FormSchema::factory()->create([
            'organisation_id' => $org->id,
            'purpose' => FormPurpose::EVENT_REGISTRATION,
            'snapshot_mode' => 'on_submit',
            'is_published' => true,
            'public_token' => (string) \Illuminate\Support\Str::ulid(),
        ]);

        // Email field with entity binding + validation rule.
        FormField::factory()
            ->withValidationRule(FormFieldValidationRuleType::MaxLength, ['value' => 100])
            ->withEntityBinding('person', 'email')
            ->create([
                'form_schema_id' => $schema->id,
                'field_type' => FormFieldType::EMAIL->value,
                'slug' => 'contact_email',
                'label' => 'E-mail',
            ]);

        // Number field with min/max validation + conditional logic.
        FormField::factory()
            ->withValidationRule(FormFieldValidationRuleType::MinValue, ['value' => 18])
            ->withValidationRule(FormFieldValidationRuleType::MaxValue, ['value' => 99])
            ->withConditionalLogic([
                'operator' => 'all',
                'children' => [
                    ['field_slug' => 'contact_email', 'operator' => 'not_empty'],
                ],
            ])
            ->create([
                'form_schema_id' => $schema->id,
                'field_type' => FormFieldType::NUMBER->value,
                'slug' => 'leeftijd',
                'label' => 'Leeftijd',
            ]);

        // Select field with options.
        FormField::factory()
            ->withOptions(['XS', 'S', 'M', 'L'])
            ->create([
                'form_schema_id' => $schema->id,
                'field_type' => FormFieldType::SELECT->value,
                'slug' => 'shirtmaat',
                'label' => 'Shirtmaat',
            ]);

        // Submit so schema_snapshot materializes.
        $service = resolve(FormSubmissionService::class);
        $draft = $service->createDraft($schema, null, null, []);
        $service->submit($draft, null);

        // First read: through Eloquent cast (decode → assoc array).
        $first = FormSubmission::query()->withoutGlobalScopes()->findOrFail($draft->id);
        $snapshotA = $first->schema_snapshot;

        // Second read: a fresh model instance (no cached attributes).
        $second = FormSubmission::query()->withoutGlobalScopes()->findOrFail($draft->id);
        $snapshotB = $second->schema_snapshot;

        // Third read: raw column bytes via the query builder, decoded once.
        // JSON_THROW_ON_ERROR makes a missing or malformed column fail loudly
        // instead of silently decoding to null.
        $rawJson = (string) DB::table('form_submissions')
            ->where('id', $draft->id)
            ->value('schema_snapshot');
        $snapshotC = json_decode($rawJson, true, 512, JSON_THROW_ON_ERROR);

        // All three read paths must produce byte-identical canonical JSON.
        $this->assertSame(
            JsonCanonicalizer::encode($snapshotA),
            JsonCanonicalizer::encode($snapshotB),
        );
        $this->assertSame(
            JsonCanonicalizer::encode($snapshotA),
            JsonCanonicalizer::encode($snapshotC),
        );

        // Per-field check on top of the whole-snapshot assertions above:
        // comparing each field's canonical encode (options,
        // validation_rules, configs, bindings, conditional_logic)
        // makes a failure name the field that drifted.
        $this->assertNotEmpty($snapshotA['fields']);
        foreach ($snapshotA['fields'] as $idx => $fieldA) {
            $fieldC = $snapshotC['fields'][$idx];
            $this->assertSame(
                JsonCanonicalizer::encode($fieldA),
                JsonCanonicalizer::encode($fieldC),
                "field #{$idx} ({$fieldA['slug']}) drifted across reads",
            );
        }
    }
}