Search Console suddenly shows hundreds of URLs under “Excluded by ‘noindex’ tag”, and they are pages you want indexed. A site:yourdomain.com/url-path query returns nothing. You view source and there it is: <meta name="robots" content="noindex">. The tag works exactly as advertised: once Google recrawls the page and sees it, the URL is dropped from the index. The bad news: removal is fast. The good news: it is reversible. Once you remove the tag and the page is re-crawled, it can return to the index at normal crawl pace (days to weeks).
Most cases trace to a starter-template default that was never flipped, or a staging-to-production environment leak. This article covers detection and recovery.
Common causes
Ordered by hit rate, highest first.
1. Template scaffolding shipped with noindex default
Many Astro / Next starter templates have a layout like:
<meta name="robots" content={import.meta.env.PROD ? "index,follow" : "noindex"}>
If PROD does not evaluate to true in your build (for example, the site is built or served in development mode), every page emits noindex. Or the dev gets used to “everything is noindexed locally” and ships a hardcoded noindex without removing it.
How to spot it:
curl -s https://yoursite.com/ | grep -i 'meta name="robots"'
If the production homepage emits noindex, that’s it.
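The grep above prints the whole tag line. When you want only the directive value, a minimal sketch like the following can help. The function name robots_meta_content is made up for illustration, and it assumes the robots tag is written with name="robots" (not a crawler-specific name such as googlebot).

```shell
# Print the content value of every robots meta tag found in HTML on stdin.
# Output is lowercased so "NOINDEX" and "noindex" compare equal.
robots_meta_content() (
  q="[\"']"   # matches either quote style
  tr '\n' ' ' \
    | tr '[:upper:]' '[:lower:]' \
    | grep -oE "<meta[^>]*name=${q}robots${q}[^>]*>" \
    | sed -nE "s/.*content=${q}([^\"']*)${q}.*/\1/p"
)

# Usage (hits the network):
#   curl -s https://yoursite.com/ | robots_meta_content
```

No output means no robots meta tag was found, which Google treats as indexable by default.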
2. Staging / preview deployment leaked to production
You configured Vercel previews to noindex. Then you mapped your production domain to a preview by mistake, or the production build inherited the staging env var.
How to spot it: Vercel/Netlify → check which deployment is currently serving your production domain. Compare its env vars to staging.
3. CMS / framework default was changed retroactively
You changed a CMS field default (indexable: false) thinking it was for new posts only, but existing posts inherited the new default.
How to spot it: Search Console → Excluded by noindex. Cluster by date — if a sudden spike on one day correlates with your config change, that’s it.
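To cluster the export by date, something like the sketch below may be enough. It assumes a CSV with a header row where the last column is a date (for example URL,Last crawled); check your actual export before relying on it. The function name count_by_date is hypothetical.

```shell
# Count rows per date from a CSV on stdin.
# Assumption: first row is a header, last column holds the date.
count_by_date() {
  tail -n +2 \
    | tr -d '\r' \
    | awk -F, '{ print $NF }' \
    | sort \
    | uniq -c \
    | sort -rn
}

# Usage:
#   count_by_date < Table.csv | head
```

A single date with a far higher count than its neighbours, landing on or shortly after your config change, points at that change.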
4. Migration script copied noindex from old domain
When migrating, your script duplicated the entire HTML head from old pages, including a noindex that was deliberately set on the old domain.
How to spot it: Check <head> content in your migration source. If old pages had noindex (because the migration was in progress), new pages may have inherited it.
5. CDN / WAF rule injects X-Robots-Tag: noindex
Cloudflare or a WAF rule meant for staging injects an X-Robots-Tag: noindex HTTP header. The HTML doesn’t show it, so view-source looks fine, but Google sees the header.
How to spot it:
curl -sI https://yoursite.com/ | grep -i x-robots-tag
If you see X-Robots-Tag: noindex, the CDN is injecting it.
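The header can also appear more than once, carry a crawler prefix (googlebot: noindex), or use none, which Google treats as noindex, nofollow. A slightly stricter check, sketched here with a hypothetical function name, reads response headers from stdin:

```shell
# Exit 0 if any X-Robots-Tag header on stdin contains noindex or none.
# Strips the carriage returns that HTTP headers carry so end-of-line matching works.
header_blocks_indexing() {
  tr -d '\r' \
    | grep -i '^x-robots-tag:' \
    | grep -qiE '(^|[:, ])(noindex|none)([, ]|$)'
}

# Usage (hits the network; -L follows redirects):
#   curl -sIL https://yoursite.com/ | header_blocks_indexing && echo "header blocks indexing"
```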
6. SSR conditional accidentally noindexes valid pages
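A buggy condition: a check intended as page.category === 'draft' is written as a substring match (for example page.category.includes('draft')) and matches more than intended, such as 'drafts' or any category that merely contains the word. Public pages get noindexed.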
How to spot it: Find the conditional in your layout. Test with edge-case article slugs / categories.
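The difference between an exact match and a substring match is easy to see with a few edge-case values. The sketch below is in shell rather than your layout's language, and the category names are invented for illustration:

```shell
# Exact match: only the literal value "draft" counts as a draft.
is_draft_exact() { [ "$1" = "draft" ]; }

# Substring match: anything containing "draft" counts, which is the bug.
is_draft_substring() {
  case "$1" in
    *draft*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Hypothetical categories; only the first is a real draft.
for category in draft drafts draft-beer-guide nfl-draft news; do
  exact=index
  loose=index
  is_draft_exact "$category" && exact=noindex
  is_draft_substring "$category" && loose=noindex
  printf '%-18s exact=%-8s substring=%s\n' "$category" "$exact" "$loose"
done
```

Every row where the two columns disagree is a public page the loose condition would noindex.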
Shortest path to fix
Step 1: Confirm the scope
Search Console → Pages → “Excluded by ‘noindex’ tag” → Export the list. Count: dozens? hundreds? all?
# Count sitemap URLs whose HTML contains "noindex" (grep -P needs GNU grep)
curl -s https://yoursite.com/sitemap.xml | grep -oP '<loc>\K[^<]+' | while read -r url; do
  if curl -s "$url" | grep -qi 'noindex'; then
    echo "$url"
  fi
done | wc -l
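grep -P is a GNU extension and is missing from the stock grep on macOS. If the loop above fails there, a portable way to pull URLs out of the sitemap could look like this. The function name sitemap_locs is made up, and the sketch assumes each URL sits on the same line as its opening <loc> tag.

```shell
# Print every <loc> value from a sitemap on stdin, one per line.
# Splitting on "<" first means minified sitemaps (many <loc> per line) also work.
sitemap_locs() {
  tr -d '\r' \
    | tr '<' '\n' \
    | sed -n 's/^loc>[[:space:]]*//p' \
    | sed 's/[[:space:]]*$//'
}

# Usage (hits the network):
#   curl -s https://yoursite.com/sitemap.xml | sitemap_locs | while read -r url; do ...; done
```

If your sitemap.xml is a sitemap index, the values printed are child sitemaps, not pages, so run the same function on each of them.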
Step 2: Find the source
Check in order:
- Template: grep -rn 'noindex' src/layouts src/components
- Environment variable: env | grep -i robots
- HTTP header: curl -sI https://yoursite.com/ | grep -i x-robots-tag
- CMS config: check default indexability setting
- CDN: Cloudflare → Rules → look for response header transforms
Step 3: Fix it
Once located:
- Template: hardcode <meta name="robots" content="index, follow"> for production
- Env var: set INDEXABLE=true in prod or fix the conditional
- HTTP header: remove the X-Robots-Tag from CDN config
- CMS: change default to indexable: true, manually flag old non-public pages
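When you fix the conditional, the safer shape is one where an unset or unexpected environment value falls through to index, follow, and noindex is only chosen for explicitly named environments. A minimal sketch of that decision, with a hypothetical function name and hypothetical environment names:

```shell
# Print the robots directive for a given environment name.
# "staging" and "preview" are assumed names; use whatever your platform sets.
robots_directive() {
  case "${1:-}" in
    staging|preview) echo "noindex" ;;
    *)               echo "index, follow" ;;  # unset, empty, or unknown => indexable
  esac
}

# Usage (DEPLOY_ENV is an assumed variable name):
#   robots_directive "$DEPLOY_ENV"
```

The trade-off is that a staging build with a missing variable becomes indexable, which is why blocking staging hosts at the platform level is the better second layer.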
Step 4: Deploy and verify with view-source
After deploy, manually check at least 5 sample URLs:
for url in $(head -5 affected_urls.txt); do
  echo "=== $url ==="
  curl -s "$url" | grep -i 'meta name="robots"'
  curl -sI "$url" | grep -i x-robots-tag
done
No noindex should appear anywhere.
Step 5: Request re-indexing on top URLs
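For your most important 10-20 URLs (highest traffic / business value before the noindex), use Search Console → URL Inspection → “Request Indexing.” This queues the URL for a priority crawl. It does not guarantee indexing, and requests are subject to a daily quota, so spend them on the pages that matter most.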
For larger sets, resubmit your sitemap and wait for natural crawl.
Step 6: Monitor recovery
Over 1-4 weeks, “Excluded by noindex” count should drop in Search Console. Concurrently, Indexed URLs count should rise.
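To see which URLs have actually left the excluded report, you can diff two exports taken a week or two apart. This sketch assumes you have reduced each export to a plain-text file with one URL per line; the file names and the function name are hypothetical.

```shell
# Print URLs that appear in the first file but not in the second.
# Both files: one URL per line. Sorting happens here, so input order does not matter.
urls_only_in_first() (
  a=$(mktemp)
  b=$(mktemp)
  sort -u "$1" > "$a"
  sort -u "$2" > "$b"
  comm -23 "$a" "$b"
  rm -f "$a" "$b"
)

# Usage:
#   urls_only_in_first excluded_week0.txt excluded_week2.txt   # left the report
#   urls_only_in_first excluded_week2.txt excluded_week0.txt   # newly excluded
```

A URL leaving the excluded report is not proof it is indexed; confirm a few with URL Inspection.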
If after 4 weeks some URLs are still missing, check:
- Were they 404 during the noindex period? Google may have de-prioritized them.
- Do they have unique value, or were they thin? Google may not re-index thin pages.
Step 7: Add a CI check
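# In CI after deploy: fail if the homepage carries noindex in the meta tag or the header
if curl -s https://yoursite.com/ | grep -i '<meta[^>]*name="robots"' | grep -qi 'noindex' \
   || curl -sI https://yoursite.com/ | grep -i '^x-robots-tag' | grep -qi 'noindex'; then
  echo "Production has noindex!"; exit 1
fi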
Run on every deploy to prevent regression.
When this is not on you
Recovery time depends on Google’s crawl frequency, which you do not control. The longer pages were noindexed, the slower the recovery tends to be, because Google crawls URLs it has dropped less often. Don’t expect overnight; weeks is normal.
Easy to misdiagnose as
People assume removing noindex is enough — but they also need to ensure the page is re-crawled. Without a fresh crawl, Google’s index stays at “removed” until it next visits.
Prevention
- Never make noindex the production default. Make index, follow the default and use environment-aware logic to opt into noindex only for staging.
- CI check: assert that at least 5 sample production URLs emit index, follow post-deploy.
- Validate <meta name="robots"> in CI on a sample of pages per environment.
- For staging hosts, use X-Robots-Tag: noindex at the platform level (Vercel/Netlify env-based header), not in source code.
- Audit Search Console → Pages monthly for unexpected “Excluded by noindex” spikes.
FAQ
- Will Google re-index automatically? Yes, after the next crawl. URL Inspection’s “Request Indexing” accelerates the most important URLs.
- Can I noindex only the preview deployment? Yes — set noindex via environment-aware logic at the platform level, not in source code.
Related
- Robots meta vs sitemap conflict
- Noindex vs robots.txt
- Robots.txt not working
- Page indexed but not ranking
Tags: #SEO #Troubleshooting #Debug #Structured data #noindex