You deploy weekly. Your sitemap generator stamps lastmod with new Date().toISOString() at build time. From Google’s perspective, every page on your site changed yesterday, which means none of them did — that pattern is the textbook signature of a sitemap to ignore. Googlebot’s recrawl rate for genuinely updated pages drops because Google can no longer use lastmod to prioritize. Worst case, “Discovered — currently not indexed” piles up. The fix is to derive lastmod from actual content change timestamps, not from build time.
Common causes
Ordered by hit rate.
1. lastmod set from new Date() at build time
The sitemap plugin or custom generator runs new Date().toISOString() once per build and uses that value for every entry. Build weekly → every URL “changed” weekly.
How to spot it: Fetch the live sitemap, grep lastmod, and confirm all values are within seconds of each other and match your last deploy.
2. lastmod derived from file mtime, but a build step rewrites every file
A pre-build step (format, transform, minify) rewrites every content file in place, updating mtime. The generator reads mtime as lastmod. Every file looks freshly edited each build.
How to spot it: ls -lt src/content/ shows every file modified within the last build window, even ones you have not touched in months.
3. lastmod derived from CI job timestamp
CI checks out the repo with --depth=1, or builds in a Docker layer that resets mtimes. A shallow clone carries no per-file history, so real commit dates are not available and the generator falls back to “now.”
How to spot it: All lastmod values fall within a 30-second window per deploy. Different deploys produce different uniform values.
4. lastmod correct in source but rewritten by CDN / edge worker
You generate the right lastmod from frontmatter, but an edge worker rewrites sitemap responses (adding cache-busting, normalizing format) and clobbers the value.
How to spot it: Compare the sitemap fetched from origin against the one fetched from the CDN. Differences in lastmod are the smoking gun.
5. lastmod rounded to the day, but every entry shares the same day
The generator floors timestamps to midnight UTC to “normalize.” If most edits happen in one window per deploy day, every entry collapses to the same date, which looks exactly like a build-time stamp.
How to spot it: All lastmod values are the same date, with 00:00:00Z time.
6. Newly added URLs inherit “now” instead of their actual publish date
A new article gets added to the sitemap with lastmod of today, even though it was published two years ago. The same logic incorrectly bumps every old URL on every backfill run.
How to spot it: Old archive URLs that have not changed in years show a recent lastmod.
Before you start
- Capture a snapshot of your live sitemap (or top sitemap index) before making changes. You will compare before and after.
- Look at Search Console → Sitemaps → report details for any “valid but warning” messages — Google will sometimes warn explicitly about lastmod reliability.
- Decide your source of truth for lastmod: typically frontmatter modifiedAt (falling back to publishedAt).
- If you have an existing audit of dates from Article Date and JSON-LD Date Mismatch, align with the same source.
Information to collect
- The current lastmod value distribution: are most values within a 5-minute window, or spread across months?
- The generator code or plugin that emits the sitemap.
- Frontmatter fields available per content type (publishedAt, modifiedAt, updatedAt).
- Recent Search Console “Crawl stats” trend: has the total crawl request count dropped despite content additions?
- Whether your sitemap is split (one per content type) or a single monolith.
Step-by-step fix
Ordered by cost.
Step 1: Inspect the current sitemap
curl -s https://example.com/sitemap.xml | \
grep -oE '<lastmod>[^<]+</lastmod>' | sort | uniq -c | sort -rn | head -10
If the most frequent value covers most of your URLs and matches your last deploy time, the bug is confirmed.
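To put numbers on the distribution, the sketch below summarizes how widely the lastmod values are spread. It is a minimal illustration, not a full XML parser: the function names (extractLastmods, summarizeLastmods) are hypothetical, and it assumes the sitemap XML has already been fetched into a string.

```javascript
// Pull every <lastmod> value out of a sitemap XML string (simple regex scan).
function extractLastmods(xml) {
  return [...xml.matchAll(/<lastmod>\s*([^<]+?)\s*<\/lastmod>/g)].map((m) => m[1]);
}

// Summarize the spread: how many values, how many distinct UTC days,
// and how many days lie between the oldest and newest value.
function summarizeLastmods(xml) {
  const times = extractLastmods(xml)
    .map((value) => Date.parse(value))
    .filter((t) => !Number.isNaN(t)); // skip values that are not parseable dates

  if (times.length === 0) return { count: 0, distinctDays: 0, spanDays: 0 };

  let min = times[0];
  let max = times[0];
  const days = new Set();
  for (const t of times) {
    if (t < min) min = t;
    if (t > max) max = t;
    days.add(new Date(t).toISOString().slice(0, 10)); // YYYY-MM-DD in UTC
  }
  return {
    count: times.length,
    distinctDays: days.size,
    spanDays: (max - min) / 86400000, // milliseconds per day
  };
}
```

A spanDays near zero across a large sitemap points at a build-time stamp; a healthy sitemap for a site with years of content usually spans months or more.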
Step 2: Source lastmod from real content dates
Replace the build-time generator:
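// before: every entry gets the build timestamp
{ loc: url, lastmod: new Date().toISOString() }
// after: use the article's own dates (undefined when neither field is set)
{
  loc: url,
  lastmod: article.modifiedAt ?? article.publishedAt,
}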
Omit lastmod entirely rather than fake it if no real timestamp exists.
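One way to make the "omit rather than fake" rule explicit is to build each entry through a small helper. This is a sketch under assumptions: article is a hypothetical record with optional modifiedAt and publishedAt fields (date strings or Date objects), and toSitemapEntry, renderUrl, and escapeXml are illustrative names, not part of any sitemap plugin.

```javascript
// Build a sitemap entry, adding lastmod only when a real, parseable date exists.
function toSitemapEntry(url, article) {
  const raw = article.modifiedAt ?? article.publishedAt;
  // Guard null/undefined explicitly: new Date(null) would silently become 1970.
  const time = raw === undefined || raw === null ? NaN : new Date(raw).getTime();

  const entry = { loc: url };
  if (!Number.isNaN(time)) entry.lastmod = new Date(time).toISOString();
  return entry;
}

// Escape the characters that are not allowed raw inside XML text.
function escapeXml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Render one <url> element; the <lastmod> tag is left out when there is no value.
function renderUrl(entry) {
  const lastmod = entry.lastmod ? `<lastmod>${entry.lastmod}</lastmod>` : "";
  return `<url><loc>${escapeXml(entry.loc)}</loc>${lastmod}</url>`;
}
```

Keeping the fallback in one function means an article with no dates produces an entry without lastmod, instead of an entry stamped with the build time.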
Step 3: Backfill from git history when frontmatter is missing
git log --diff-filter=AM --follow --format=%aI -- "src/content/articles/en/${slug}.mdx" \
| head -1
That returns the author date (strict ISO 8601, via %aI) of the most recent commit that added or modified the file, following renames. Use it as modifiedAt for articles without explicit dates.
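If the generator runs in Node, the same git command can be called from the build script. The sketch below is one possible wrapper, with assumptions labeled: runGitLog, firstIsoDate, and gitLastmod are hypothetical names, git is assumed to be on PATH, and the runner is injectable so the parsing can be exercised without a repository.

```javascript
// Run the git command shown above for one file and return its stdout.
// Dynamic import keeps the snippet usable from both CommonJS and ES modules.
async function runGitLog(filePath) {
  const { execFileSync } = await import("node:child_process");
  return execFileSync(
    "git",
    ["log", "--diff-filter=AM", "--follow", "--format=%aI", "--", filePath],
    { encoding: "utf8" },
  );
}

// Take the first non-empty line (newest commit) and accept it only if it parses as a date.
function firstIsoDate(stdout) {
  const line = stdout
    .split("\n")
    .map((s) => s.trim())
    .find((s) => s !== "");
  if (!line || Number.isNaN(Date.parse(line))) return undefined;
  return line;
}

// Resolve a lastmod candidate from git, or undefined when git has nothing to say
// (file never committed, not a repository, git missing).
async function gitLastmod(filePath, run = runGitLog) {
  try {
    return firstIsoDate(await run(filePath));
  } catch {
    return undefined;
  }
}
```

Returning undefined on failure lets the caller omit lastmod instead of falling back to the build time. On a shallow clone every file resolves to the same single commit date, so this only helps when CI fetches full history.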
Step 4: Stop pre-build steps from touching mtime
If a formatter rewrites every file every build, switch to in-memory transforms or guard with content hashing:
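import { readFile, writeFile } from "node:fs/promises";
// Read as text: without "utf8", readFile returns a Buffer and the comparison below is always true.
const before = await readFile(path, "utf8");
const after = format(before); // format() stands in for your formatter
if (after !== before) await writeFile(path, after);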
writeFile only fires on real changes. Mtimes stop spuriously bumping.
Step 5: Verify CI preserves git timestamps
If you rely on commit dates, ensure CI clones with full history:
- uses: actions/checkout@v4
  with:
    fetch-depth: 0
Then read commit dates instead of mtime in the sitemap generator.
Step 6: Validate the edge layer is not rewriting
diff <(curl -s https://origin.example.com/sitemap.xml) \
<(curl -s https://example.com/sitemap.xml)
If lastmod differs, the CDN or edge worker is rewriting. Bypass or fix it.
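A raw diff also flags whitespace or attribute changes that do not matter here. To compare only lastmod per URL, something like the following sketch can help. It assumes both responses are already loaded as strings and that the sitemap is a flat urlset; lastmodByLoc and diffLastmod are hypothetical names, and a real XML parser would be more robust than the regex scan used here.

```javascript
// Map each <loc> to its <lastmod> (undefined when the tag is absent).
function lastmodByLoc(xml) {
  const map = new Map();
  for (const [, body] of xml.matchAll(/<url>([\s\S]*?)<\/url>/g)) {
    const loc = body.match(/<loc>\s*([^<]+?)\s*<\/loc>/);
    const lastmod = body.match(/<lastmod>\s*([^<]+?)\s*<\/lastmod>/);
    if (loc) map.set(loc[1], lastmod ? lastmod[1] : undefined);
  }
  return map;
}

// List every URL whose lastmod at the edge differs from the origin value.
// A URL missing at the edge shows up with edge: undefined.
function diffLastmod(originXml, edgeXml) {
  const origin = lastmodByLoc(originXml);
  const edge = lastmodByLoc(edgeXml);
  const diffs = [];
  for (const [loc, originValue] of origin) {
    const edgeValue = edge.get(loc);
    if (edgeValue !== originValue) {
      diffs.push({ loc, origin: originValue, edge: edgeValue });
    }
  }
  return diffs;
}
```

An empty result means the edge layer passes lastmod through untouched, even if other parts of the response differ.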
Step 7: Resubmit and watch crawl response
In Search Console, resubmit the sitemap. Over 2-3 weeks watch the “Crawl stats” graph for a shift toward URLs that actually changed. The number of “Discovered — currently not indexed” should slowly decrease for genuinely updated pages.
Verify
- Sitemap lastmod distribution shows real variance over weeks and months, not a uniform value per deploy.
- New articles get a lastmod matching their publishedAt, not today.
- Search Console → Crawl stats shows recrawl prioritization for genuinely edited URLs.
- “Discovered — currently not indexed” trend reverses for important content.
- Your CMS edits flow into lastmod within one deploy cycle.
Long-term prevention
- Document team-wide that lastmod reflects real content change, never build time.
- CI assertion: fail the build if more than 30% of sitemap entries share the same lastmod minute.
- Pair lastmod source with the same source that drives dateModified in JSON-LD, so there is one source of truth.
- For static archive pages that genuinely never change, omit lastmod rather than emit a stale value.
- Audit edge / CDN response transformations quarterly to catch silent rewrites.
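The CI assertion from the list above could look roughly like this. It is a sketch, not a drop-in check: largestMinuteShare and assertLastmodVariance are hypothetical names, the 0.3 default mirrors the 30% rule, and the sitemap XML is assumed to be available as a string after the build.

```javascript
// Return the share (0..1) of entries that fall into the most common lastmod minute.
function largestMinuteShare(xml) {
  const counts = new Map();
  let total = 0;
  for (const [, value] of xml.matchAll(/<lastmod>\s*([^<]+?)\s*<\/lastmod>/g)) {
    const t = Date.parse(value);
    if (Number.isNaN(t)) continue; // ignore unparseable values
    const minute = new Date(t).toISOString().slice(0, 16); // YYYY-MM-DDTHH:MM in UTC
    counts.set(minute, (counts.get(minute) ?? 0) + 1);
    total += 1;
  }
  if (total === 0) return 0;

  let max = 0;
  for (const count of counts.values()) {
    if (count > max) max = count;
  }
  return max / total;
}

// Throw (and so fail the build) when too many entries share one minute.
function assertLastmodVariance(xml, maxShare = 0.3) {
  const share = largestMinuteShare(xml);
  if (share > maxShare) {
    const percent = (share * 100).toFixed(1);
    throw new Error(`${percent}% of sitemap entries share the same lastmod minute`);
  }
  return share;
}
```

Day-level values all parse to midnight UTC, so a site that emits dates without times and publishes many pages on one day may need a looser threshold or a per-day variant of this check.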
Common pitfalls
- Removing lastmod entirely “to be safe.” Real lastmod values help Google prioritize. Only omit when you have no reliable source.
- Bumping lastmod on every page when you change navigation or the footer. Site-wide template changes are not per-page content changes.
- Updating lastmod to “now” when you fix a typo. Trivial edits should leave lastmod alone; align with the same “meaningful edit” rule you use for dateModified.
- Splitting lastmod and dateModified between two independent generators. They will drift; consolidate.
- Resubmitting the sitemap five times in one day to “force a re-crawl.” That has no positive effect and can flag the sitemap.
FAQ
Q: Does Google actually penalize spammy lastmod patterns?
Google has stated they will ignore unreliable lastmod values for crawl prioritization. The penalty is being treated as if you provided no signal at all — which costs you re-crawl opportunities on real updates.
Q: Should I include lastmod for every URL or only changed ones?
Include it whenever you have a reliable timestamp. Omit when you do not. An honest gap is better than a fabricated value.
Q: My CMS only tracks “edited date.” Is that enough?
Yes, as long as “edited date” only changes on real edits and not on auto-save or template changes. Verify by editing nothing and confirming “edited date” did not change.
Q: How fine-grained should lastmod be? Day, hour, second?
Match the precision of your tracking. Day-level (2026-05-24) is fine for content sites. Second-level is appropriate for fast-moving feeds. Do not invent precision you do not have.
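As a small illustration of matching precision to what you track, the helper below emits either a day-level or a second-level value from a Date. formatLastmod is a hypothetical name, and the sketch assumes UTC output is acceptable for your sitemap.

```javascript
// Format a Date for <lastmod> at the precision you actually track.
// "day"    -> YYYY-MM-DD
// "second" -> YYYY-MM-DDTHH:MM:SSZ (milliseconds dropped)
function formatLastmod(date, precision = "day") {
  const iso = date.toISOString(); // always UTC, with milliseconds
  if (precision === "day") return iso.slice(0, 10);
  if (precision === "second") return iso.slice(0, 19) + "Z";
  throw new RangeError(`Unsupported precision: ${precision}`);
}
```

If your source only stores a calendar date, pass it through at "day" precision rather than padding it with an invented time.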