sitemap.xml Not Found — Integration / Site Fix

/sitemap.xml or /sitemap-index.xml returns 404 — integration off or `site` missing.

You submit https://yourdomain.com/sitemap.xml to Search Console and it comes back “Couldn’t fetch” or “404”. This is one of the most common first-deploy bugs on Astro / Next / Hugo sites. Astro’s @astrojs/sitemap only runs if you explicitly register it in astro.config.mjs, and it requires the site field. Miss either and the build still succeeds: an unregistered integration logs nothing, a missing site logs only a warning, and in both cases no sitemap-index.xml is produced.

This article lists 5 causes by hit rate, with checks you can run directly against dist/.

Common causes

Ordered by hit rate, highest first.

1. @astrojs/sitemap not registered in astro.config

npm install @astrojs/sitemap isn’t enough — you have to add it to the integrations array:

// astro.config.mjs
import { defineConfig } from 'astro/config';
import sitemap from '@astrojs/sitemap';

export default defineConfig({
  site: 'https://yourdomain.com',
  integrations: [sitemap()],
});

Without integrations: [sitemap()], dist/sitemap-index.xml is never generated.

How to spot it:

grep -n "sitemap" astro.config.mjs

If you see the import but no call, it’s not registered.
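If you'd rather script that check than eyeball grep output, here is a rough sketch. checkSitemapRegistration is a hypothetical helper, it assumes the import is named sitemap, and it pattern-matches the config text rather than parsing it, so treat a negative result as a hint to look closer.

```javascript
// Heuristic check, not a parser: looks for the @astrojs/sitemap import and a
// sitemap(...) call inside the integrations array of astro.config.mjs text.
function checkSitemapRegistration(configSource) {
  // Drop whole-line comments so a commented-out import or call does not count.
  const code = configSource.replace(/^\s*\/\/.*$/gm, '');
  const imported = /from\s+['"]@astrojs\/sitemap['"]/.test(code);
  // Capture the integrations array up to its first closing bracket
  // (assumes no nested arrays inside it).
  const integrations = code.match(/integrations\s*:\s*\[([\s\S]*?)\]/);
  const called = integrations !== null && /\bsitemap\s*\(/.test(integrations[1]);
  return { imported, called, registered: imported && called };
}
```

Pass it the contents of astro.config.mjs as a string. A result with imported: true and called: false is exactly the "import but no call" case described above.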

2. site field is empty

@astrojs/sitemap uses site to build absolute URLs for every entry. With site missing, the integration logs a build-time warning (which most people miss) and skips generation entirely. No file ends up in dist/.

How to spot it: Run npm run build and grep the output for sitemap. If you see a warning saying the `site` option is required (the exact wording varies by integration version), that’s it.
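You can also sanity-check the value itself before building. The sketch below is one way to do it. siteProblems is a hypothetical helper, and the rules it applies (absolute URL, http or https, not a local dev host) are assumptions drawn from this article, not the integration's own validation.

```javascript
// Returns a list of problems with a `site` value; an empty list means it looks usable.
function siteProblems(site) {
  if (typeof site !== 'string' || site.trim() === '') {
    return ['site is missing or empty'];
  }
  let url;
  try {
    url = new URL(site); // throws on relative or malformed values
  } catch {
    return ['site is not an absolute URL'];
  }
  const problems = [];
  if (url.protocol !== 'https:' && url.protocol !== 'http:') {
    problems.push('site must use http(s)');
  }
  if (['localhost', '127.0.0.1'].includes(url.hostname)) {
    problems.push('site points at a local dev host');
  }
  return problems;
}
```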

3. Host rewrite catching /sitemap* as unmatched

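Same trap as RSS: Vercel / Netlify SPA fallbacks like /* /index.html can answer /sitemap* with the app shell. These fallbacks normally apply only when no real file matches, so they tend to turn a missing or misnamed sitemap into a 200 HTML page instead of a clean 404. Pure-static Astro deploys usually escape this, but if you’ve hand-edited vercel.json rewrites, double-check.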

How to spot it:

curl -I "https://yourdomain.com/sitemap.xml"
curl -I "https://yourdomain.com/sitemap-index.xml"

200 with content-type: text/html → the SPA fallback took over.
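To find the rule responsible, you can scan vercel.json for catch-all rewrites. This is a minimal sketch: findCatchAllRewrites is a hypothetical helper, and the pattern only recognises two common catch-all spellings, /(.*) and /:name*. It is a heuristic, not Vercel's actual matcher.

```javascript
// Matches "/(.*)" and "/:anything*", two common ways to write a catch-all source.
const CATCH_ALL = /^\/(\(\.\*\)|:[A-Za-z_]\w*\*)$/;

// Returns the rewrite rules in a vercel.json document whose source is a catch-all.
function findCatchAllRewrites(vercelJsonText) {
  const config = JSON.parse(vercelJsonText);
  const rewrites = Array.isArray(config.rewrites) ? config.rewrites : [];
  return rewrites.filter(
    (rule) => typeof rule.source === 'string' && CATCH_ALL.test(rule.source)
  );
}
```

Any rule it returns is a candidate for the fallback that answers /sitemap.xml with HTML.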

4. Wrong URL — it’s sitemap-index.xml, not sitemap.xml

@astrojs/sitemap produces sitemap-index.xml plus one or more numbered files (sitemap-0.xml, sitemap-1.xml, and so on). It does not produce sitemap.xml. If you only submitted /sitemap.xml to Search Console, you’ll get a 404.

How to spot it:

ls dist/sitemap*

If you see sitemap-index.xml and sitemap-0.xml but no plain sitemap.xml, submit the index URL.

5. Base path offset

If your Astro config has base: '/blog', the site is served under /blog, so the sitemap’s public URL is /blog/sitemap-index.xml. Hitting /sitemap.xml or /sitemap-index.xml without the base 404s.

How to spot it:

grep -n "base:" astro.config.mjs
find dist -name "sitemap*"

If base is set, or the file sits under a base-prefixed directory, the sitemap URL needs the same prefix. That’s the bug.
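To work out which URL to request, combine site and base. The sketch below uses a hypothetical helper, sitemapIndexUrl, and assumes the default sitemap-index.xml file name.

```javascript
// Builds the public sitemap index URL from the `site` and `base` config values.
function sitemapIndexUrl(site, base = '/') {
  // Normalise base to "/blog/" form so URL resolution keeps the prefix.
  const trimmed = base.replace(/^\/+|\/+$/g, '');
  const prefix = trimmed === '' ? '/' : `/${trimmed}/`;
  const root = new URL(prefix, site);
  return new URL('sitemap-index.xml', root).href;
}

console.log(sitemapIndexUrl('https://yourdomain.com', '/blog'));
// prints "https://yourdomain.com/blog/sitemap-index.xml"
```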

Shortest path to fix

Ordered by ROI. The first three usually solve 80% of cases.

Step 1: Register @astrojs/sitemap in the config

Minimum working config:

// astro.config.mjs
import { defineConfig } from 'astro/config';
import sitemap from '@astrojs/sitemap';

export default defineConfig({
  site: 'https://yourdomain.com', // required, absolute URL
  integrations: [
    sitemap({
      filter: (page) => !page.includes('/admin/'),
      changefreq: 'weekly',
      priority: 0.7,
    }),
  ],
});

Two things to confirm:

  • site is an absolute https:// URL with no trailing slash
  • sitemap() actually appears (with parentheses!) inside integrations
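One note on the filter line in that config: it is a plain substring test, so it helps to try it on a few URLs first. The page list below is made up for illustration, and the sketch assumes the integration passes absolute page URLs to filter.

```javascript
// Same predicate as the filter option in the config above.
const keepPage = (page) => !page.includes('/admin/');

// Hypothetical page URLs for illustration.
const pages = [
  'https://yourdomain.com/',
  'https://yourdomain.com/blog/first-post/',
  'https://yourdomain.com/admin/',
  'https://yourdomain.com/admin/users/',
  'https://yourdomain.com/admin',
];

// Only the pages that would stay in the sitemap.
const kept = pages.filter(keepPage);
console.log(kept);
```

The last entry survives the filter because it has no trailing slash, so '/admin/' never appears in it. If your URLs can come without trailing slashes, tighten the predicate.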

Step 2: Build locally and verify the artifact

rm -rf dist/
npm run build
ls dist/sitemap*
head -20 dist/sitemap-index.xml

Expected:

dist/sitemap-0.xml
dist/sitemap-index.xml

sitemap-index.xml should reference https://yourdomain.com/sitemap-0.xml. If site was wrong, you’ll see it here — e.g. references to http://localhost:4321/sitemap-0.xml, which Google can’t fetch in production.
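That check can be automated with a few lines. foreignSitemapLocs below is a hypothetical helper that pulls every <loc> out of the index with a regex (good enough for generated sitemap files, not a general XML parser) and returns the ones that do not share the origin of site.

```javascript
// Returns <loc> values in a sitemap index whose origin differs from `site`.
function foreignSitemapLocs(indexXml, site) {
  const origin = new URL(site).origin;
  const locs = [...indexXml.matchAll(/<loc>\s*([^<\s]+)\s*<\/loc>/g)].map((m) => m[1]);
  return locs.filter((loc) => {
    try {
      return new URL(loc).origin !== origin;
    } catch {
      return true; // not an absolute URL at all, also worth flagging
    }
  });
}
```

An empty result means every referenced sitemap lives on the production origin.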

Step 3: Curl-verify in production

curl -I "https://yourdomain.com/sitemap-index.xml"
curl -s "https://yourdomain.com/sitemap-index.xml" | head -10

Pass criteria:

  • Status 200
  • content-type: application/xml (text/xml is also fine)
  • Body contains a <sitemapindex element

200 returning HTML → a rewrite is intercepting. Check vercel.json / netlify.toml for the SPA fallback.
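The pass criteria can be expressed as one small function. This is a sketch with a hypothetical name, checkSitemapResponse. It takes values you have already fetched (status, content-type header, body) and accepts any XML content type.

```javascript
// Applies the pass criteria above to an already-fetched response.
function checkSitemapResponse({ status, contentType = '', body = '' }) {
  if (status !== 200) return `fail: status ${status}`;
  if (/text\/html/i.test(contentType)) {
    return 'fail: HTML response, a rewrite is probably intercepting';
  }
  if (!/[\/+]xml/i.test(contentType)) {
    return `fail: unexpected content-type ${contentType}`;
  }
  if (!body.includes('<sitemapindex')) {
    return 'fail: body has no <sitemapindex element';
  }
  return 'ok';
}
```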

Step 4: Submit the right URL to Search Console

Search Console → Sitemaps, submit:

https://yourdomain.com/sitemap-index.xml

Note the -index. The integration doesn’t emit a plain sitemap.xml, so submit the index URL unless you’ve set up your own rename or redirect. Status should flip to “Success” within minutes to a few hours.

Also reference it from robots.txt so other crawlers find it:

Sitemap: https://yourdomain.com/sitemap-index.xml

That helps Bing, Yandex, etc. auto-discover.
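To confirm the robots.txt line is actually being served, you can extract the Sitemap entries from the file's text. sitemapsInRobots is a hypothetical helper, and it matches the directive name case-insensitively.

```javascript
// Returns every URL declared with a "Sitemap:" line in robots.txt content.
function sitemapsInRobots(robotsTxt) {
  return robotsTxt
    .split(/\r?\n/)
    .map((line) => line.match(/^\s*sitemap\s*:\s*(\S+)/i))
    .filter(Boolean)
    .map((match) => match[1]);
}
```

If the returned list doesn't include your sitemap-index.xml URL, crawlers that rely on robots.txt won't discover it.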

Step 5: CI smoke test after deploy

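#!/usr/bin/env bash
# Post-deploy smoke test: fail the job if the sitemap index is missing or not a sitemap index.
set -euo pipefail
URL="https://yourdomain.com/sitemap-index.xml"
# Fetch only the status code; "|| true" lets us report a failed request instead of aborting silently.
status=$(curl -s -o /dev/null -w "%{http_code}" "$URL" || true)
[[ "$status" == "200" ]] || { echo "Sitemap status: $status"; exit 1; }
# Download the body once, then look for the root element (avoids a broken pipe under pipefail).
body=$(curl -s "$URL")
grep -q "<sitemapindex" <<<"$body" || { echo "Not a sitemap index"; exit 1; }
echo "Sitemap OK"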

Wire it into your deploy-success hook. Any commit that accidentally removes the sitemap() call gets caught immediately.

Prevention

  • Add a CI assertion that dist/sitemap-index.xml exists post-build (or post-deploy smoke test)
  • After Search Console submission, audit the status weekly to catch URL count drops
  • Reference the sitemap from robots.txt so Bing / Yandex auto-discover it
  • Don’t hardcode sitemap paths in templates — trust the integration’s default output
  • Whenever you change site or base, re-run the sitemap test as part of the same diff
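For the URL-count audit in the list above, a rough sketch of the comparison follows. countUrls and urlCountDropped are hypothetical helpers, and the 20% threshold is an arbitrary assumption to tune for your site.

```javascript
// Counts <url> entries in a sitemap file such as sitemap-0.xml.
function countUrls(sitemapXml) {
  return (sitemapXml.match(/<url>/g) || []).length;
}

// True when the count fell by more than maxDropRatio since the previous run.
function urlCountDropped(previous, current, maxDropRatio = 0.2) {
  if (previous === 0) return false; // nothing to compare against
  return (previous - current) / previous > maxDropRatio;
}
```

Store the previous count wherever your CI keeps artifacts and compare on each run.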

Tags: #Hosting #Debug #Troubleshooting #Sitemap #SEO