crewli/dev-docs/ADR-LEFTHOOK-LFS-INTEGRATION.md

---
title: "ADR — Lefthook + Git-LFS integration: own the hook layer, don't duplicate it"
status: Accepted
date: 2026-05-11
owner: infra
supersedes: commit 1b06804 (the `piped: true` placebo fix)
---

# ADR — Lefthook + Git-LFS integration

## Status

**Accepted, 2026-05-11.**

## Context

Crewli's `pre-push` git hook is managed by **lefthook v2** (introduced
in WS-3, April 2026, for `.claude-sync/` post-commit propagation and
non-blocking sync-staleness warnings on push). **Git LFS** was added
in TEST-INFRA-001 (May 2026, commit `b8d18e6`) to store visual
regression baseline PNGs without bloating the git pack.

The interaction between the two tools is therefore **two days old**.
For the first weeks of the project, only one of the two was active at
a time — the bug below did not exist because the trigger was not yet
present.

### Symptom: `git push` hangs in pre-push hook

Two incidents are on record. Both manifest as `git push` blocking
indefinitely inside the `pre-push` hook with no path to recover other
than `--no-verify`:

| Incident | Commit pushed | LFS objects in push | Symptom |
|---|---|---|---|
| 1 | `b8d18e6` (TEST-INFRA-001) | 5 PNGs | `Uploading LFS objects: 100% (5/5), 265 KB, done.` → hang |
| 2 | `99eedb6` (F2 docs) | 0 | Spinner stuck at `waiting: git-lfs` |

The previous fix attempt — commit `1b06804`, which added
`piped: true` to the pre-push block — appeared to succeed on its own
push but failed on the next non-LFS push (incident 2). It was based
on a wrong understanding of `piped`'s semantics (see "Two misconceptions"
below).

## Decision

**Lefthook owns the `pre-push` hook for both sync-check AND LFS.** LFS
upload is delegated to lefthook's **built-in** pre-push LFS handler.
The `lefthook.yml` `pre-push` block contains **one** user-defined
command (`sync-check`) and nothing else.

Concretely:

```yaml
# lefthook.yml
pre-push:
  commands:
    sync-check:
      run: bash .githooks/pre-push
```

No `git-lfs:` command. No `piped: true`. No `use_stdin: true`. No
`skip_lfs: true`.

## Rationale

### Lefthook already runs git-lfs internally for pre-push

`docs/usage/features/git-lfs.md` (verbatim):

> Lefthook runs LFS hooks internally for the following hooks:
> post-checkout, post-commit, post-merge, **pre-push**.
> Errors are suppressed if git LFS is not required for the project.

Source confirmation in `internal/run/controller/lfs.go`:

```go
err = c.cmd.RunWithContext(
    ctx,
    append([]string{"git", "lfs", hookName}, args...),
    "",
    c.cachedStdin,   // ← buffered cached stdin with proper EOF
    out, errOut,
)
```

The internal handler reads git's pre-push stdin **once**, caches it in
a `bytes.Buffer`, then invokes `git lfs pre-push <remote> <url>` with
that buffer as stdin — which delivers EOF cleanly when consumed.

### The manual command was a literal duplicate, directly visible in logs

Sandbox reproduction with the pre-fix `lefthook.yml` (filesystem
remote, `LEFTHOOK_VERBOSE=1`) shows both invocations in the same
push:

```
│  [git-lfs] executing hook: git lfs pre-push origin /tmp/...     ← internal
│  [git-lfs] stdout: Uploading LFS objects: 100% (1/1), ..., done.
│  [lefthook] job: git lfs pre-push origin /tmp/...               ← duplicate
│  [lefthook] run: git lfs pre-push origin /tmp/...
┃  git-lfs ❯
```

After the fix (same sandbox, same data), the second invocation is
gone and the LFS upload still happens via the internal call:

```
│  [git-lfs] executing hook: git lfs pre-push origin /tmp/...
│  [git-lfs] stdout: Uploading LFS objects: 100% (1/1), ..., done.
│  [lefthook] run: bash .githooks/pre-push
┃  sync-check ❯
```

### Sandbox falsifiability test: result and limitation

A `/tmp/lefthook-lfs-test/` sandbox with a bare-repo file remote was
built per the Phase B spec. Both the pre-fix LFS-object push and the
pre-fix docs-only push **completed in ~0.3 seconds; the hang did not
reproduce**, despite the duplicate execution being clearly visible in
the logs.

This means the duplicate execution is **necessary but possibly not
sufficient** for the production hang. A network remote (Gitea over
SSH/HTTPS) appears to be a required additional ingredient — the
sandbox's filesystem remote requires no LFS server handshake, so the
manual second invocation exits immediately with nothing to verify.

Two mechanisms are plausible; the same fix addresses both:

- **H1 (Phase A):** PTY-stdin without EOF in the manual command's
  `while read` loop (`docs/configuration/use_stdin.md` verbatim:
  *"without this option lefthook will hang: lefthook uses pseudo TTY
  by default, and it doesn't close stdin when all data is read"*).
- **H4 (Phase B):** Server-side LFS interaction on the duplicate
  call — locking, exhausted SSH session, or a stalled HTTPS response
  on the verify path — that only manifests against a real Gitea LFS
  endpoint.

Because removal of the manual command is **zero-cost** (it is
redundant by construction; the internal call already does the work),
the fix is correct on architectural grounds regardless of which
mechanism is the proximate cause.

## Two misconceptions to prevent regression

### 1. The `--skip-repo` comment was technically correct but conceptually misleading

The pre-fix `lefthook.yml` carried this comment on the manual command:

```yaml
git-lfs:
  # `git lfs install --skip-repo` only sets up global clean/smudge
  # filters — it does NOT install the per-repo pre-push hook
  # (which would conflict with lefthook). We delegate the LFS
  # upload step here so screenshot baselines tracked via LFS get
  # pushed alongside their commits.
  run: git lfs pre-push {1} {2}
```

**The first two lines are accurate.** `git lfs install --skip-repo`
does set the global clean/smudge filters without writing
`.git/hooks/pre-push`. That part of the reasoning is fine and remains
the right setup for a lefthook-managed repo.

**The third line — "we delegate the LFS upload step here" — is
wrong.** Lefthook delegates the LFS upload step **internally**,
without needing either the per-repo hook OR the manual command.
A future reviewer who reads only the first two lines may find them
compelling and revert to the manual command — that would re-introduce
the bug.

This ADR is the authoritative correction: **no manual command. No
per-repo hook. Lefthook's built-in handles it.**

### 2. `piped: true` does NOT control sequencing or stdin routing

The pre-fix `lefthook.yml` justified `piped: true` as follows
(commit `1b06804`):

> piped: true forces serial execution. Both sync-check and git-lfs
> read from stdin (git pipes the push refspec to pre-push hooks);
> default parallel execution deadlocks with two stdin readers.

This is **wrong on two counts**:

- Default lefthook execution is **already sequential**.
  `docs/configuration/parallel.md` verbatim: *"Lefthook runs commands
  and scripts sequentially by default."* `piped: true` does not
  change ordering vs. the default.
- `piped: true` has **no documented relationship to stdin handling**.
  `docs/configuration/piped.md` verbatim: *"Stop running commands and
  scripts if one of them fail."* That is the entire semantic.

The fix in `1b06804` was a placebo. It worked on its own push by
coincidence and failed on the next non-LFS push (incident 2). A
future reviewer should not assume `piped: true` solves any
stdin-related concern.

## Consequences

- **LFS uploads continue to happen on `git push`** — automatically,
  via lefthook's internal pre-push LFS handler. No configuration
  required.
- **If you ever need to disable lefthook's LFS handling** (e.g.,
  in a sub-repo or CI context that handles LFS separately), set
  `skip_lfs: true` at the root of `lefthook.yml` per
  `docs/usage/features/git-lfs.md`. Do **not** remove the internal
  handler by adding a manual command — that re-introduces the
  duplicate.
- **Adding new `pre-push` commands that read stdin** requires
  `use_stdin: true` on **exactly one** command per
  `docs/configuration/use_stdin.md`:
  *"With many commands or scripts having `use_stdin: true`, only one
  will receive the data. The others will have nothing."*
  Lefthook's internal LFS handler is already that one consumer for
  its cached stdin buffer, so any new stdin-reading command would
  need to coordinate (most likely: rework so only one consumer
  exists, or accept that the new command does not see the refspec).
- **The non-blocking pre-push sync-check** (`.githooks/pre-push`)
  remains the only user-defined command. It does not read stdin, so
  this constraint does not affect it.

## Alternative scenarios (if Phase E regresses)

If the real `git push` of `fix/lefthook-lfs-deadlock` hangs after
this fix is applied, the architectural change is correct but a
**different proximate cause** is in play. In that case:

1. Do **not** revert this ADR. The duplicate execution is wrong
   regardless.
2. Capture `LEFTHOOK_VERBOSE=1 GIT_TRACE=1 GIT_CURL_VERBOSE=1` output
   from the hanging push.
3. Build a **network-y sandbox** (local Gitea container or an HTTP
   stub server that responds to LFS verify-objects requests with
   controlled timing) to reproduce against. The Phase B sandbox
   (filesystem remote) is insufficient for this class of bug.
4. The investigation surface widens to git-lfs's interaction with
   Gitea's LFS endpoint, SSH session lifetimes, and the lefthook
   process tree around the `cachedStdin` buffer. Hypothesis H4 above
   is the entry point.

## How to verify

### Sandbox procedure (Phase B reproduction; proves duplicate execution is gone)

```bash
# Disposable sandbox
rm -rf /tmp/lefthook-lfs-test
mkdir -p /tmp/lefthook-lfs-test
git init --bare /tmp/lefthook-lfs-test/remote.git -q
git init -q /tmp/lefthook-lfs-test/work
cd /tmp/lefthook-lfs-test/work
git config user.email "t@t" && git config user.name "t"
git remote add origin /tmp/lefthook-lfs-test/remote.git

# Mirror Crewli's config
cp <crewli-root>/lefthook.yml .
mkdir .githooks
cp <crewli-root>/.githooks/{pre-push,post-commit} .githooks/
chmod +x .githooks/*

# Install hooks; LFS filter-only
LEFTHOOK_BIN=<path-to>/lefthook lefthook install
git lfs install --skip-repo
git lfs track "*.png"

# Add an LFS-tracked file, commit, push
# (any small PNG works)
git add .gitattributes lefthook.yml .githooks <some>.png
LEFTHOOK=0 git commit -q -m "init"
LEFTHOOK_VERBOSE=1 git push -u origin master
```

**Pass criteria:**
- Push completes within a few seconds (target: <5s on local FS).
- Logs show exactly **one** `[git-lfs] executing hook: git lfs pre-push`
  line.
- Logs show **no** `[lefthook] run: git lfs pre-push` line (only
  `[lefthook] run: bash .githooks/pre-push`).
- `Uploading LFS objects: 100% (...)` appears, proving the internal
  handler did the work.

### Production procedure (Phase E; proves the real hang is gone)

```bash
# In the crewli repo, on any feature branch with at least one new commit:
git push -u origin <branch>          # NO --no-verify
```

**Pass criteria:**
- Push completes within ~30 seconds (well under the 30s timeout
  observed in the incidents).
- No spinner gets stuck at `waiting: git-lfs`.
- If the branch contains LFS-tracked changes:
  `Uploading LFS objects: 100% (N/N), ...` appears and the push
  completes.

**If verification fails:** see "Alternative scenarios" above. Do not
revert this ADR; investigate via H4 with a network-y sandbox.

## References

- `docs/usage/features/git-lfs.md` (lefthook) — internal LFS hook list
- `docs/configuration/use_stdin.md` (lefthook) — PTY-stdin EOF behavior
- `docs/configuration/piped.md` (lefthook) — actual semantics of `piped`
- `docs/configuration/parallel.md` (lefthook) — default sequential execution
- `internal/run/controller/lfs.go` (lefthook) — `cachedStdin` source
- `internal/git/lfs.go` (lefthook) — pre-push in `lfsHooks` map
- `git-scm.com/docs/githooks` — pre-push stdin spec
- Crewli commits: `b8d18e6` (LFS introduced), `1b06804` (failed first-attempt fix), `99eedb6` (incident 2)