Files

bert.hausmans 37af961b3e fix(lefthook): remove duplicate git-lfs pre-push command

Lefthook v2 runs `git lfs pre-push` internally for pre-push hooks (per
docs/usage/features/git-lfs.md; confirmed in internal/run/controller/
lfs.go where the internal handler invokes `git lfs pre-push <remote>
<url>` with a buffered `cachedStdin`). Our manual `git-lfs:` command
in lefthook.yml was a second invocation against the same remote; the
duplicate is directly visible in `LEFTHOOK_VERBOSE=1` output as
`[git-lfs] executing hook` (internal) followed by `[lefthook] run:
git lfs pre-push` (manual).

The previous fix attempt (piped: true, commit 1b06804) was based on a
wrong understanding of `piped`'s semantics — `piped` controls
fail-fast behavior, not stdin routing or sequencing. Default lefthook
behavior is already sequential per docs/configuration/parallel.md.
That "fix" was placebo; incident 2 (F2 push, zero LFS objects, commit
99eedb6) proved it.

Phase A investigation: documentary + source confirmation that lefthook
owns the LFS pre-push call. Phase B sandbox test against a filesystem
remote confirmed the duplicate execution in logs but did NOT reproduce
the production hang — likely because the duplicate manual call against
a local remote has no LFS server to interact with. A network-y remote
(Gitea over SSH/HTTPS) appears to be part of the trigger. Two
mechanisms remain plausible (H1: PTY-stdin without EOF in
`while read` loop per docs/configuration/use_stdin.md; H4: server-side
LFS interaction on the duplicate call). Both are eliminated by the
same fix: remove the manual command. LFS uploads continue to work via
lefthook's internal handler (verified in sandbox post-fix).

Regression coverage: scripts/test-lefthook-pre-push.sh asserts exactly
one internal LFS invocation, zero manual ones, and `Uploading LFS
objects: 100%` present, against a disposable sandbox.

See dev-docs/ADR-LEFTHOOK-LFS-INTEGRATION.md for full context, both
misconceptions to prevent regression, and the alternative-scenarios
playbook if Phase E ever regresses.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

2026-05-11 00:18:56 +02:00

12 KiB

Raw Permalink Blame History

title: "ADR — Lefthook + Git-LFS integration: own the hook layer, don't duplicate it" status: Accepted date: 2026-05-11 owner: infra supersedes: commit 1b06804 (the piped: true placebo fix)

ADR — Lefthook + Git-LFS integration

Status

Accepted, 2026-05-11.

Context

Crewli's pre-push git hook is managed by lefthook v2 (introduced in WS-3, April 2026, for .claude-sync/ post-commit propagation and non-blocking sync-staleness warnings on push). Git LFS was added in TEST-INFRA-001 (May 2026, commit b8d18e6) to store visual regression baseline PNGs without bloating the git pack.

The interaction between the two tools is therefore two days old. For the first weeks of the project, only one of the two was active at a time — the bug below did not exist because the trigger was not yet present.

Symptom: `git push` hangs in pre-push hook

Two incidents are on record. Both manifest as git push blocking indefinitely inside the pre-push hook with no path to recover other than --no-verify:

Incident	Commit pushed	LFS objects in push	Symptom
1	`b8d18e6` (TEST-INFRA-001)	5 PNGs	`Uploading LFS objects: 100% (5/5), 265 KB, done.` → hang
2	`99eedb6` (F2 docs)	0	Spinner stuck at `waiting: git-lfs`

The previous fix attempt — commit 1b06804, which added piped: true to the pre-push block — appeared to succeed on its own push but failed on the next non-LFS push (incident 2). It was based on a wrong understanding of piped's semantics (see "Two misconceptions" below).

Decision

Lefthook owns the pre-push hook for both sync-check AND LFS. LFS upload is delegated to lefthook's built-in pre-push LFS handler. The lefthook.yml pre-push block contains one user-defined command (sync-check) and nothing else.

Concretely:

# lefthook.yml
pre-push:
  commands:
    sync-check:
      run: bash .githooks/pre-push

No git-lfs: command. No piped: true. No use_stdin: true. No skip_lfs: true.

Rationale

Lefthook already runs git-lfs internally for pre-push

docs/usage/features/git-lfs.md (verbatim):

Lefthook runs LFS hooks internally for the following hooks: post-checkout, post-commit, post-merge, pre-push. Errors are suppressed if git LFS is not required for the project.

Source confirmation in internal/run/controller/lfs.go:

err = c.cmd.RunWithContext(
    ctx,
    append([]string{"git", "lfs", hookName}, args...),
    "",
    c.cachedStdin,   // ← buffered cached stdin with proper EOF
    out, errOut,
)

The internal handler reads git's pre-push stdin once, caches it in a bytes.Buffer, then invokes git lfs pre-push <remote> <url> with that buffer as stdin — which delivers EOF cleanly when consumed.

The manual command was a literal duplicate, directly visible in logs

Sandbox reproduction with the pre-fix lefthook.yml (filesystem remote, LEFTHOOK_VERBOSE=1) shows both invocations in the same push:

│  [git-lfs] executing hook: git lfs pre-push origin /tmp/...     ← internal
│  [git-lfs] stdout: Uploading LFS objects: 100% (1/1), ..., done.
│  [lefthook] job: git lfs pre-push origin /tmp/...               ← duplicate
│  [lefthook] run: git lfs pre-push origin /tmp/...
┃  git-lfs ❯

After the fix (same sandbox, same data), the second invocation is gone and the LFS upload still happens via the internal call:

│  [git-lfs] executing hook: git lfs pre-push origin /tmp/...
│  [git-lfs] stdout: Uploading LFS objects: 100% (1/1), ..., done.
│  [lefthook] run: bash .githooks/pre-push
┃  sync-check ❯

Sandbox falsifiability test: result and limitation

A /tmp/lefthook-lfs-test/ sandbox with a bare-repo file remote was built per the Phase B spec. Both the pre-fix LFS-object push and the pre-fix docs-only push completed in ~0.3 seconds; the hang did not reproduce, despite the duplicate execution being clearly visible in the logs.

This means the duplicate execution is necessary but possibly not sufficient for the production hang. A network remote (Gitea over SSH/HTTPS) appears to be a required additional ingredient — the sandbox's filesystem remote requires no LFS server handshake, so the manual second invocation exits immediately with nothing to verify.

Two mechanisms are plausible; the same fix addresses both:

H1 (Phase A): PTY-stdin without EOF in the manual command's while read loop (docs/configuration/use_stdin.md verbatim: "without this option lefthook will hang: lefthook uses pseudo TTY by default, and it doesn't close stdin when all data is read").
H4 (Phase B): Server-side LFS interaction on the duplicate call — locking, exhausted SSH session, or a stalled HTTPS response on the verify path — that only manifests against a real Gitea LFS endpoint.

Because removal of the manual command is zero-cost (it is redundant by construction; the internal call already does the work), the fix is correct on architectural grounds regardless of which mechanism is the proximate cause.

Two misconceptions to prevent regression

1. The `--skip-repo` comment was technically correct but conceptually misleading

The pre-fix lefthook.yml carried this comment on the manual command:

git-lfs:
  # `git lfs install --skip-repo` only sets up global clean/smudge
  # filters — it does NOT install the per-repo pre-push hook
  # (which would conflict with lefthook). We delegate the LFS
  # upload step here so screenshot baselines tracked via LFS get
  # pushed alongside their commits.
  run: git lfs pre-push {1} {2}

The first two lines are accurate. git lfs install --skip-repo does set the global clean/smudge filters without writing .git/hooks/pre-push. That part of the reasoning is fine and remains the right setup for a lefthook-managed repo.

The third line — "we delegate the LFS upload step here" — is wrong. Lefthook delegates the LFS upload step internally, without needing either the per-repo hook OR the manual command. A future reviewer who reads only the first two lines may find them compelling and revert to the manual command — that would re-introduce the bug.

This ADR is the authoritative correction: no manual command. No per-repo hook. Lefthook's built-in handles it.

2. `piped: true` does NOT control sequencing or stdin routing

The pre-fix lefthook.yml justified piped: true as follows (commit 1b06804):

piped: true forces serial execution. Both sync-check and git-lfs read from stdin (git pipes the push refspec to pre-push hooks); default parallel execution deadlocks with two stdin readers.

This is wrong on two counts:

Default lefthook execution is already sequential. docs/configuration/parallel.md verbatim: "Lefthook runs commands and scripts sequentially by default." piped: true does not change ordering vs. the default.
piped: true has no documented relationship to stdin handling. docs/configuration/piped.md verbatim: "Stop running commands and scripts if one of them fail." That is the entire semantic.

The fix in 1b06804 was a placebo. It worked on its own push by coincidence and failed on the next non-LFS push (incident 2). A future reviewer should not assume piped: true solves any stdin-related concern.

Consequences

LFS uploads continue to happen on git push — automatically, via lefthook's internal pre-push LFS handler. No configuration required.
If you ever need to disable lefthook's LFS handling (e.g., in a sub-repo or CI context that handles LFS separately), set skip_lfs: true at the root of lefthook.yml per docs/usage/features/git-lfs.md. Do not remove the internal handler by adding a manual command — that re-introduces the duplicate.
Adding new pre-push commands that read stdin requires use_stdin: true on exactly one command per docs/configuration/use_stdin.md: "With many commands or scripts having use_stdin: true, only one will receive the data. The others will have nothing." Lefthook's internal LFS handler is already that one consumer for its cached stdin buffer, so any new stdin-reading command would need to coordinate (most likely: rework so only one consumer exists, or accept that the new command does not see the refspec).
The non-blocking pre-push sync-check (.githooks/pre-push) remains the only user-defined command. It does not read stdin, so this constraint does not affect it.

Alternative scenarios (if Phase E regresses)

If the real git push of fix/lefthook-lfs-deadlock hangs after this fix is applied, the architectural change is correct but a different proximate cause is in play. In that case:

Do not revert this ADR. The duplicate execution is wrong regardless.
Capture LEFTHOOK_VERBOSE=1 GIT_TRACE=1 GIT_CURL_VERBOSE=1 output from the hanging push.
Build a network-y sandbox (local Gitea container or an HTTP stub server that responds to LFS verify-objects requests with controlled timing) to reproduce against. The Phase B sandbox (filesystem remote) is insufficient for this class of bug.
The investigation surface widens to git-lfs's interaction with Gitea's LFS endpoint, SSH session lifetimes, and the lefthook process tree around the cachedStdin buffer. Hypothesis H4 above is the entry point.

How to verify

Sandbox procedure (Phase B reproduction; proves duplicate execution is gone)

# Disposable sandbox
rm -rf /tmp/lefthook-lfs-test
mkdir -p /tmp/lefthook-lfs-test
git init --bare /tmp/lefthook-lfs-test/remote.git -q
git init -q /tmp/lefthook-lfs-test/work
cd /tmp/lefthook-lfs-test/work
git config user.email "t@t" && git config user.name "t"
git remote add origin /tmp/lefthook-lfs-test/remote.git

# Mirror Crewli's config
cp <crewli-root>/lefthook.yml .
mkdir .githooks
cp <crewli-root>/.githooks/{pre-push,post-commit} .githooks/
chmod +x .githooks/*

# Install hooks; LFS filter-only
LEFTHOOK_BIN=<path-to>/lefthook lefthook install
git lfs install --skip-repo
git lfs track "*.png"

# Add an LFS-tracked file, commit, push
# (any small PNG works)
git add .gitattributes lefthook.yml .githooks <some>.png
LEFTHOOK=0 git commit -q -m "init"
LEFTHOOK_VERBOSE=1 git push -u origin master

Pass criteria:

Push completes within a few seconds (target: <5s on local FS).
Logs show exactly one [git-lfs] executing hook: git lfs pre-push line.
Logs show no [lefthook] run: git lfs pre-push line (only [lefthook] run: bash .githooks/pre-push).
Uploading LFS objects: 100% (...) appears, proving the internal handler did the work.

Production procedure (Phase E; proves the real hang is gone)

# In the crewli repo, on any feature branch with at least one new commit:
git push -u origin <branch>          # NO --no-verify