Monorepo Partial Clone Has Stale Objects

Your partial clone of a monorepo is missing blobs or shows outdated files. Find out why, fetch the missing objects, and bring the sparse checkout up to date.

Your team uses a monorepo with thousands of packages. You cloned with --filter=blob:none and --sparse to check out only the packages/billing subtree, but after running git pull, some files in packages/billing show as empty, some show the old content from three months ago, and git status claims the working tree is clean when you can see a stale package.json in your editor. Partial clones and sparse checkouts interact with the object transport layer in ways that can leave the local object store incomplete. This guide shows how to diagnose stale objects, re-fetch the missing blobs, and configure the clone so this does not recur.

Common causes

Ordered by hit rate, highest first.

1. A fetch brought in new commits, but the checkout that downloads their blobs never completed

A --filter=blob:none clone defers blob downloads until Git needs the file contents, usually at checkout or merge. A fetch delivers new commits and trees only. If the merge or checkout half of git pull is skipped or fails, for example because a lazy fetch could not reach the remote, HEAD and the working tree stay on the older commit.

How to spot it: git ls-files --error-unmatch packages/billing/package.json returns without error (Git knows about the file) but cat packages/billing/package.json shows old content. Blobs are immutable, so nothing is re-fetched in place: the file on disk is the blob from the older commit, and the blob for the newer version has not been downloaded yet.
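
Comparing blob IDs can show whether the file on disk matches what a given commit records. The following is a minimal sketch, not an official Git recipe; blob_state is a hypothetical helper name, and it assumes you run it from the repository root on a tracked file.

```shell
# blob_state PATH [REV]
# Compares the blob ID of the file on disk with the blob ID recorded for the
# same path in REV (default: HEAD) and prints "match" or "differs".
# Assumption: run from the repository root, because REV:PATH is root-relative.
# rev-parse resolves the path through tree objects, which a blobless clone
# already has, so this should not need to download the blob itself.
# The function body is a subshell, so "exit" only leaves the function.
blob_state() (
  path="$1"
  rev="${2:-HEAD}"
  on_disk=$(git hash-object -- "$path") || exit 2
  recorded=$(git rev-parse --verify --quiet "$rev:$path") || exit 2
  if [ "$on_disk" = "$recorded" ]; then
    echo match
  else
    echo differs
  fi
)
```

For example, blob_state packages/billing/package.json checks the file against HEAD, and blob_state packages/billing/package.json origin/main checks it against the remote-tracking branch (origin/main is an assumed branch name). A match against HEAD combined with a difference against the remote-tracking branch suggests that HEAD never advanced.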

2. Sparse-checkout cone pattern does not include updated directory

The initial sparse checkout set packages/billing but a recent refactor moved some billing code into packages/shared/billing-utils. The cone pattern still only covers packages/billing so the moved files are never checked out.

How to spot it: git sparse-checkout list shows only packages/billing. Run git log --oneline -- packages/shared/billing-utils — if recent commits appear, the path is active but not in your sparse checkout.
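
The same check can be scripted. This sketch assumes cone mode, where git sparse-checkout list prints one directory per line; dir_in_cone is a hypothetical helper name, and it only answers whether a directory is covered recursively (it ignores the cone-mode rule that files directly inside parent directories are also checked out).

```shell
# dir_in_cone DIR
# Reads cone directories on stdin (the output of `git sparse-checkout list`
# in cone mode) and succeeds if DIR is one of them or lies beneath one.
# The function body is a subshell, so "exit" only leaves the function.
dir_in_cone() (
  target="${1%/}"
  while IFS= read -r cone; do
    cone="${cone%/}"
    # Compare with a trailing slash so "packages/billing" does not match
    # a sibling such as "packages/billing-old".
    case "$target/" in
      "$cone"/*) exit 0 ;;
    esac
  done
  exit 1
)
```

Usage: git sparse-checkout list | dir_in_cone packages/shared/billing-utils || echo "not in cone".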

3. The partial filter server-side cache is stale

On large monorepos hosted on GitHub Enterprise or GitLab, server-side caches used to serve filtered fetches may lag behind the true HEAD. Clones made within minutes of a large force-push may then receive an inconsistent set of filtered objects.

How to spot it: git fsck --connectivity-only reports missing objects shortly after a fresh clone or fetch. If the server is the cause, the issue usually clears on its own within 5-10 minutes as the cache catches up; if it persists, suspect one of the other causes.

4. GIT_NO_LAZY_FETCH environment variable blocks on-demand blob downloads

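Some CI environments set GIT_NO_LAZY_FETCH=1 to prevent unexpected network calls. In a partial clone, this disables the on-demand download that Git performs when a command such as checkout or diff needs a blob that is not yet local. Those commands then fail with a missing-object error, and the affected files are never written or updated.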

How to spot it: echo $GIT_NO_LAZY_FETCH returns 1 in the build environment.
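
A build script can make this check explicit and fail early. This is a sketch under the assumption that Git reads the variable as a boolean, so it treats the common truthy spellings as "blocked"; lazy_fetch_blocked is a hypothetical helper name.

```shell
# lazy_fetch_blocked
# Succeeds (exit 0) if GIT_NO_LAZY_FETCH is set to a common truthy value,
# which is assumed to disable on-demand object downloads in a partial clone.
lazy_fetch_blocked() {
  case "$(printf '%s' "${GIT_NO_LAZY_FETCH:-}" | tr '[:upper:]' '[:lower:]')" in
    1|true|yes|on) return 0 ;;
    *) return 1 ;;
  esac
}
```

Usage: lazy_fetch_blocked && echo "warning: lazy fetch is disabled in this environment" >&2.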

5. Object database damaged by concurrent git gc and git fetch

On a shared CI agent that reuses a partial clone workspace, a git gc --prune=now running concurrently with a git fetch can delete recently downloaded objects before any ref or index entry points at them.

How to spot it: git fsck reports missing objects (a "dangling blob" line, by contrast, means the object is still present but unreferenced). The CI log shows a gc process running in parallel with the fetch.

6. Promisor remote URL changed but clone config not updated

The promisor remote (the source for lazy-fetched blobs) was moved to a new hostname, but the clone's remote.origin.url still points to the old URL. Lazy fetches cannot reach the server, so any command that needs a missing blob fails and the affected files are not written.

How to spot it: git config remote.origin.url returns the old hostname. git fetch returns a connection error.
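
To see which remotes Git will use for lazy fetches, list the ones marked as promisors together with their URLs. A minimal sketch, assuming the flag is stored as remote.<name>.promisor = true (the form git clone --filter writes); show_promisor_remotes is a hypothetical helper name.

```shell
# show_promisor_remotes
# Prints "<remote-name> <url>" for every remote whose promisor flag is "true".
show_promisor_remotes() {
  git config --get-regexp '^remote\..*\.promisor$' |
  while read -r key value; do
    [ "$value" = "true" ] || continue
    # Strip the "remote." prefix and ".promisor" suffix to get the name.
    name=${key#remote.}
    name=${name%.promisor}
    printf '%s %s\n' "$name" "$(git config --get "remote.$name.url")"
  done
}
```

If the printed URL still shows the old hostname, lazy fetches are going to the wrong server.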

Shortest path to fix

Step 1: Diagnose missing and stale blobs

git fsck --connectivity-only 2>&1 | head -20
git status
git diff HEAD

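If git fsck reports missing blob objects, proceed with Step 2. If git status is clean but files look wrong, Git considers the working tree consistent with HEAD, so either HEAD is behind the remote (run git pull, then continue with Step 2) or the files you expect lie outside the sparse-checkout cone (proceed to Step 3).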

Step 2: Fetch and check out the blobs for your sparse paths

git fetch --filter=blob:none origin
git checkout HEAD -- packages/billing

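The checkout command rewrites files under the path that are missing or differ from HEAD, and lazily downloads any blob it needs to do so. It does not move HEAD; if HEAD is behind the remote, run git pull first.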

Step 3: Update the sparse-checkout cone to include new paths

git sparse-checkout set --cone \
  packages/billing \
  packages/shared/billing-utils \
  packages/shared/types
git checkout HEAD -- packages/shared/billing-utils

Step 4: Unset GIT_NO_LAZY_FETCH if it is blocking fetches

unset GIT_NO_LAZY_FETCH
git checkout HEAD -- packages/billing/package.json

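In CI, remove the variable from the environment configuration, or override it for one explicit checkout step before the build so the blobs are already local when the build starts. A git fetch --filter=blob:none is not enough here, because it still skips blobs:

GIT_NO_LAZY_FETCH=0 git checkout HEAD -- packages/billing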

Step 5: Update the promisor remote URL

git remote set-url origin https://new-git-host.example.com/org/monorepo.git
git fetch --filter=blob:none origin
git checkout HEAD -- packages/billing

Step 6: Verify the working tree is consistent

git fsck --connectivity-only
git diff HEAD -- packages/billing
git status

All three should show no errors and a clean working tree.
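
In CI, the three checks can be wrapped so the job fails on the first problem. A sketch only; verify_worktree is a hypothetical helper name, and it relies on the exit codes of git fsck and git diff --quiet rather than on parsing their output.

```shell
# verify_worktree [PATH]
# Runs the Step 6 checks for PATH (default: whole repository).
# Prints "ok" on success; prints a reason to stderr and fails otherwise.
# The function body is a subshell, so "exit" only leaves the function.
verify_worktree() (
  path="${1:-.}"
  if ! git fsck --connectivity-only >/dev/null 2>&1; then
    echo "fsck reported problems" >&2
    exit 1
  fi
  # --quiet makes git diff exit non-zero when there are differences.
  if ! git diff --quiet HEAD -- "$path"; then
    echo "working tree differs from HEAD under $path" >&2
    exit 1
  fi
  if [ -n "$(git status --porcelain -- "$path")" ]; then
    echo "untracked or uncommitted changes under $path" >&2
    exit 1
  fi
  echo ok
)
```

Usage: verify_worktree packages/billing || exit 1.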

Prevention

  • After cloning with --filter=blob:none --sparse, immediately run a test checkout of your target paths to confirm blobs are reachable.
  • Add --recurse-submodules and --also-filter-submodules to partial clone commands in CI to prevent nested incomplete states.
  • Pin the sparse checkout cone paths in a project script (scripts/setup-sparse.sh) so all developers use the same cone definition.
  • Do not use GIT_NO_LAZY_FETCH=1 in environments that use partial clones — the lazy fetch is the mechanism that keeps partial clones functional.
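  • Monitor CI build times for gradual increases caused by accumulated promisor remote calls. If lazy fetches dominate, download the needed blobs in one explicit checkout step or fall back to a full clone. A --filter=tree:0 clone defers trees as well as blobs and usually triggers more on-demand fetches, so reserve it for throwaway builds that only need the tip commit.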
  • Configure a local or on-premises Git proxy cache (e.g., git-cache-proxy) for large teams to share partial clone objects without each developer hitting the upstream server.
  • When force-pushing to a monorepo main branch, wait 10 minutes before triggering CI so the server-side filter cache can catch up.
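
The shared cone script mentioned above could look roughly like this. It is a sketch, not an existing project file: the function name setup_sparse and the variable BILLING_CONE are hypothetical, and the directory list is taken from Step 3. It uses init --cone followed by set so that it also works on Git versions that do not accept set --cone.

```shell
# scripts/setup-sparse.sh (sketch)
# Default cone for the billing team; edit in one place and commit.
BILLING_CONE="packages/billing packages/shared/billing-utils packages/shared/types"

# setup_sparse [DIR...]
# Applies the given cone directories, or the shared default when none are
# given, then prints the resulting cone so it shows up in logs.
setup_sparse() {
  # Word splitting of BILLING_CONE is intentional: one argument per directory.
  [ "$#" -gt 0 ] || set -- $BILLING_CONE
  git sparse-checkout init --cone &&
  git sparse-checkout set "$@" &&
  git sparse-checkout list
}
```

Call setup_sparse at the end of the script (or source the file and call it by hand) right after cloning, and again whenever the directory list changes.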

FAQ

Q: How do I check how many blobs are still missing from my partial clone? A: Run git rev-list --objects --all --missing=print | grep -c '^?'. Each line starting with ? names an object that is not in the local store; in a --filter=blob:none clone these are blobs, and each one is fetched lazily when a command first needs it.

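Q: Can I convert a partial clone to a full clone without re-cloning? A: Yes. One approach on Git 2.36 or later: run git fetch --no-filter --refetch origin to download every object as a fresh full clone would, then git config --unset remote.origin.partialclonefilter so later fetches are unfiltered. Do not use --unshallow here; it applies only to shallow (--depth) clones and fails on a repository with complete history. For very large repos the refetch can take hours.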

Q: Our monorepo has 50,000 files. Is sparse checkout actually faster than a full clone? A: For developer machines, sparse checkout dramatically reduces git status and git diff times within the cone. For CI, --filter=blob:none reduces initial clone time but each build still pays for lazy blob downloads. Benchmark both approaches for your specific workload.

Q: Why does git status show "clean" when files clearly have old content? A: git status compares the working tree and index against HEAD, not against the remote. If HEAD itself was never advanced, the files on disk match the older commit exactly, so Git reports a clean tree even though the content is out of date. Run git log --oneline -3 to confirm HEAD is at the expected commit.
