I Set Noindex But the Page Is Still in Search Results

You added `<meta name="robots" content="noindex">` weeks ago, but the page is still in Google. Here are the most common reasons and how to fix each one.

Three weeks ago you added <meta name="robots" content="noindex"> to a thank-you page, an internal dashboard, or a staging copy that leaked to production. Today you run site:yourdomain.com/that-url and Google still returns it. The meta tag is in View Source. So what is happening?

The single most useful thing to know: Google can only act on noindex after re-crawling the page. If anything is preventing the crawl — robots.txt Disallow, JS-rendered noindex, CDN caching the old HTML — Google never sees the directive and the URL stays indexed. The fix flow below walks through every reason in hit-rate order.

How to identify which case you’re in

Case 1: robots.txt is blocking the crawl

This is the most common failure. Symptom in Search Console: URL Inspection shows “Indexed, though blocked by robots.txt” or “Blocked by robots.txt”.

How to spot it:

curl -s https://yourdomain.com/robots.txt | grep -i your-path
# Disallow: /private/

Why it happens: someone thought “to keep a page out of Google, block it in robots.txt.” That stops the crawl but does not remove the URL from the index — external links keep it there with no snippet. The meta noindex you added is invisible to Google because Google never fetches the page.

Fix: remove the Disallow line for that URL. Google must crawl the page to see the noindex and process the removal. Once the Disallow is gone and Google has re-crawled the page, the URL typically exits the index within 1–4 weeks.

# robots.txt — before
User-agent: *
Disallow: /private/

# robots.txt — after (allow crawl so meta noindex takes effect)
User-agent: *
# An empty Disallow explicitly allows everything
Disallow:

For permanent exclusion: keep the noindex meta and remove the Disallow. On the same URL the two work against each other, because noindex only takes effect when Google can crawl the page.
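To check this without eyeballing the file, here is a minimal sketch using Python's standard urllib.robotparser. It parses robots.txt text you paste in and reports whether a URL is blocked for a given crawler. The function name is_crawl_blocked and the sample URL are made up for illustration. Python's parser only approximates Google's matching (it does not implement Google's wildcard handling, for example), so treat the result as a first check rather than the final word.

```python
from urllib.robotparser import RobotFileParser


def is_crawl_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text disallows `url` for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical before/after files matching the example above.
before = "User-agent: *\nDisallow: /private/\n"
after = "User-agent: *\nDisallow:\n"
url = "https://yourdomain.com/private/thank-you"

print(is_crawl_blocked(before, url))  # True: Google cannot see the noindex
print(is_crawl_blocked(after, url))   # False: the page is crawlable again
```

If the first call returns True for the URL you are trying to deindex, the Disallow is what keeps the noindex from being seen.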

Case 2: Noindex is rendered by client-side JavaScript

How to spot it:

# Server response, raw HTML
curl -s https://yourdomain.com/path | grep -i "name=\"robots\""
# (no result — meta tag missing in initial HTML)

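Then check Chrome DevTools → Elements, which shows the rendered DOM after JS runs. If the meta tag is present there but missing from the curl output, it is being injected client-side. Search Console URL Inspection → “View crawled page” shows the HTML as Google captured it, so it tells you whether Googlebot actually ended up seeing the tag.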

Why it happens: a client-side framework (React, Vue) injects the robots meta after hydration. Googlebot reads the raw HTML first and renders JavaScript in a separate step that can be delayed or fail. Whenever rendering does not happen, Google sees no noindex and keeps the URL indexed.

Fix: render noindex in the SSR HTML. For SPAs, ensure the initial server response includes the meta tag in <head>, not appended later.
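The grep check above can miss tags with unusual attribute order or quoting. As a rough alternative, this sketch parses the raw server response with Python's built-in html.parser and looks for a robots meta tag. The class and function names are illustrative, and the two HTML strings are stand-ins for what curl would return from an SSR page and from a client-rendered shell.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def raw_html_has_noindex(html: str) -> bool:
    """True if the HTML, as served and before any JS runs, carries a noindex directive."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives or "none" in finder.directives


# Stand-ins for the raw server response in each setup.
ssr_html = '<html><head><meta name="robots" content="noindex"></head><body></body></html>'
spa_shell = '<html><head><script src="/app.js"></script></head><body><div id="root"></div></body></html>'

print(raw_html_has_noindex(ssr_html))   # True
print(raw_html_has_noindex(spa_shell))  # False: the tag only exists after JS runs
```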

Case 3: CDN is serving stale HTML

How to spot it:

# Add a cache-busting query
curl -s "https://yourdomain.com/path?cb=$(date +%s)" | grep -i robots
# noindex appears

# Plain request
curl -s "https://yourdomain.com/path" | grep -i robots
# noindex missing

If the cache-busted request shows the noindex and the plain request does not, the CDN is serving a cached copy.
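Here is a small sketch of the same comparison in Python, in case you want to script it. cache_busted builds the throwaway URL without clobbering an existing query string, and diagnose interprets the two results. Both names are invented for this example, and the approach assumes your CDN includes the query string in its cache key, which is configurable and not guaranteed.

```python
import time
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busted(url: str, token: Optional[str] = None) -> str:
    """Append a throwaway cb=<token> query parameter, keeping any existing query."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("cb", token if token is not None else str(int(time.time()))))
    return urlunsplit(parts._replace(query=urlencode(query)))


def diagnose(plain_has_noindex: bool, busted_has_noindex: bool) -> str:
    """Interpret whether noindex showed up in the plain and cache-busted responses."""
    if busted_has_noindex and not plain_has_noindex:
        return "stale CDN cache: purge the URL"
    if plain_has_noindex and busted_has_noindex:
        return "noindex is being served: look at the other cases"
    if not plain_has_noindex and not busted_has_noindex:
        return "origin is not sending noindex: check the template or JS rendering"
    return "inconsistent responses: the cache-busted variant may be cached or routed differently"


print(cache_busted("https://yourdomain.com/path", token="1700000000"))
# https://yourdomain.com/path?cb=1700000000
print(diagnose(plain_has_noindex=False, busted_has_noindex=True))
# stale CDN cache: purge the URL
```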

Why it happens: Cloudflare, Vercel Edge, CloudFront, etc. cache HTML responses. If you added the noindex meta after the page was already cached, the CDN serves the stale version to Googlebot for the cache TTL (often 24h–7d, sometimes longer).

Fix: purge the CDN cache for that URL.

# Cloudflare
curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE/purge_cache" \
  -H "Authorization: Bearer $CF_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"files":["https://yourdomain.com/path"]}'

Vercel: a new deploy invalidates the edge cache. Netlify: a new deploy likewise invalidates the CDN cache. Then curl the URL again to confirm the meta tag is now in the response.

Case 4: Google has not re-crawled yet

How to spot it: in Search Console → URL Inspection, the “Last crawled” date is older than the date you added noindex.

Why it happens: Google’s crawl frequency for low-traffic pages can be weeks to months. The meta is in the HTML but Googlebot hasn’t fetched it since you added it.

Fix: URL Inspection → “Request indexing”. Yes, you can request indexing of a noindex page — it triggers a re-crawl, Google sees the noindex, and the URL exits the index. Typical timeline: 1–7 days for high-traffic sites, 2–4 weeks for low-traffic.
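If you track several URLs, the date comparison is easy to script. This is only a sketch: needs_recrawl is a made-up helper, and the dates are placeholders for the “Last crawled” value you read off URL Inspection and the day you deployed the noindex.

```python
from datetime import date


def needs_recrawl(last_crawled: date, noindex_added: date) -> bool:
    """True if Google's last crawl cannot have seen the noindex.

    A crawl on the same day as the deploy is treated as too early,
    since the date alone does not say which happened first.
    """
    return last_crawled <= noindex_added


# Placeholder dates for illustration only.
print(needs_recrawl(last_crawled=date(2024, 3, 1), noindex_added=date(2024, 3, 10)))   # True
print(needs_recrawl(last_crawled=date(2024, 3, 20), noindex_added=date(2024, 3, 10)))  # False
```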

Case 5: Conflicting signals across meta and X-Robots-Tag

How to spot it:

curl -sI https://yourdomain.com/path | grep -i x-robots
# X-Robots-Tag: index, follow   <-- conflicts with meta noindex

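Why it happens: meta in the HTML says noindex, but an HTTP header says index. Google merges the two and takes the most restrictive, so a meta noindex still wins over a header that says index. The real risk is the reverse setup: if you rely on an X-Robots-Tag: noindex header and a CDN or proxy strips or overwrites it, Google sees no noindex at all. A mismatch is also a hint that some layer is rewriting responses, so confirm the meta tag survives in what is actually served.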
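To make the “most restrictive wins” rule concrete, here is a simplified sketch that merges the meta content and the header value. effective_robots is a hypothetical helper, not Google's algorithm. It ignores user-agent prefixes in X-Robots-Tag and any directive other than index and follow.

```python
def effective_robots(meta_content: str, x_robots_tag: str) -> dict:
    """Merge both sources and keep the most restrictive outcome for each axis."""
    tokens = set()
    for source in (meta_content, x_robots_tag):
        for part in source.split(","):
            token = part.strip().lower()
            if token:
                tokens.add(token)
    # "none" is shorthand for "noindex, nofollow".
    noindex = bool(tokens & {"noindex", "none"})
    nofollow = bool(tokens & {"nofollow", "none"})
    return {"index": not noindex, "follow": not nofollow}


# The conflict from the curl output above: meta says noindex, header says index.
print(effective_robots("noindex", "index, follow"))
# {'index': False, 'follow': True}
```

The page stays out of the index as long as either source reaches Google with noindex, which is why a stripped header or a missing meta tag matters more than the conflict itself.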

Fix: see Meta Robots vs X-Robots-Tag — which one wins for the full conflict resolution. In short: align both, or rely on only one.

Case 6: site: operator quirk vs actual indexing

How to spot it:

# In Google search
site:yourdomain.com/that-url
# Shows the URL

# But URL Inspection in Search Console says:
# "URL is not on Google"

Why it happens: site: results sometimes lag actual indexing state. The URL may already be removed from active search results but still appear in site: queries for a few days.

Fix: rely on URL Inspection, not site:, as the source of truth. If URL Inspection says “URL is not on Google,” the page is no longer in search results — the site: result will catch up.

Shortest fix path

In hit-rate order:

  1. Check robots.txt for a Disallow on the URL → 40% of cases. Remove the Disallow so Google can crawl and see noindex.
  2. curl the URL and confirm the meta tag is in the raw HTML, not added by JS → 25% of cases.
  3. Purge CDN cache for the URL → 15% of cases.
  4. URL Inspection → “Request indexing” → 15% of cases (just slow re-crawl).
  5. For urgent removal: use the Removals tool → temporary 6-month suppression while the above works.
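The same order can be written down as a tiny triage function. This is a sketch, not a tool: the Observations fields are names invented here, and you fill them in by hand from the curl and Search Console checks described in each case.

```python
from dataclasses import dataclass


@dataclass
class Observations:
    blocked_by_robots_txt: bool       # step 1: robots.txt has a matching Disallow
    noindex_in_raw_html: bool         # step 2: cache-busted curl shows the meta tag
    noindex_in_plain_response: bool   # step 3: plain curl (possibly cached) shows it
    crawled_since_noindex: bool       # step 4: "Last crawled" is after the change


def next_step(obs: Observations) -> str:
    """Return the first fix that applies, in the order of the list above."""
    if obs.blocked_by_robots_txt:
        return "Remove the robots.txt Disallow so Google can crawl the page"
    if not obs.noindex_in_raw_html:
        return "Render the noindex meta in the server HTML, not via client JS"
    if not obs.noindex_in_plain_response:
        return "Purge the CDN cache for the URL"
    if not obs.crawled_since_noindex:
        return "Request indexing in URL Inspection to trigger a re-crawl"
    return "Wait for processing and confirm the state in URL Inspection"


print(next_step(Observations(False, True, False, False)))
# Purge the CDN cache for the URL
```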

Using the Removals tool correctly

Search Console → Removals → New Request → URL prefix (or exact URL):

  • Effect: hides the URL from Google search for ~6 months.
  • Important: this is suppression, not removal. The URL still exists in Google’s index. After 6 months, if noindex hasn’t propagated, the URL reappears.
  • Use this only as a stopgap while the underlying noindex propagates.

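Common misuse: people submit a Removals request and then think they are done. Without noindex (or a 404/410 response, or authentication on the page), the URL comes back after 6 months. A robots.txt Disallow does not count, because it blocks the crawl without removing the URL from the index. See URL Removals tool confusion.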

Permanent removal flow (the right sequence)

For a page you want completely out of Google forever:

  1. Ensure the URL returns 200 (not 404/410 yet — Google must be able to crawl).
  2. Add <meta name="robots" content="noindex"> in SSR HTML, or X-Robots-Tag: noindex header.
  3. Ensure robots.txt does NOT Disallow the URL.
  4. URL Inspection → “Request indexing” to trigger crawl.
  5. Wait for Google to re-crawl (URL Inspection’s “Last crawled” date updates).
  6. URL Inspection now reports “Excluded by ‘noindex’ tag” — success.
  7. (Optional) Once Google has confirmed noindex, you can 410 or 404 the URL to fully retire it.

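If you 404/410 before Google sees the noindex, removal depends on a less explicit signal. A 404 or 410 does get a URL dropped once Google re-crawls it, but all Google sees is “this URL stopped responding,” which is not the same as “the owner says don’t index,” and the URL can linger in the index in the meantime. Confirming the noindex first removes that ambiguity.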
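Before you request indexing in step 4, it can help to verify steps 1 to 3 in one go. The sketch below assumes you have already collected the status code, the robots meta content, the X-Robots-Tag header value, and the robots.txt verdict yourself. removal_preflight is an illustrative name, not part of any library.

```python
from typing import List


def _has_noindex(value: str) -> bool:
    """True if a comma-separated directive string contains noindex (or none)."""
    tokens = {part.strip().lower() for part in value.split(",")}
    return bool(tokens & {"noindex", "none"})


def removal_preflight(
    status_code: int,
    meta_robots: str,
    x_robots_tag: str,
    blocked_by_robots_txt: bool,
) -> List[str]:
    """Return the problems that would stop Google from processing the noindex."""
    problems = []
    if status_code != 200:
        problems.append(f"step 1: URL returns {status_code}, expected 200")
    if not (_has_noindex(meta_robots) or _has_noindex(x_robots_tag)):
        problems.append("step 2: no noindex in the meta tag or the X-Robots-Tag header")
    if blocked_by_robots_txt:
        problems.append("step 3: robots.txt disallows the URL")
    return problems


print(removal_preflight(200, "noindex", "", blocked_by_robots_txt=False))
# []
print(removal_preflight(410, "", "", blocked_by_robots_txt=True))
```

An empty list means the page is ready for the re-crawl. Anything else names the step to fix first.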

Prevention

  • Render noindex in SSR HTML — never via client JS.
  • Never combine noindex with robots.txt Disallow on the same URL.
  • After changing robots meta, purge CDN cache for the affected paths.
  • Use X-Robots-Tag for non-HTML responses (PDFs, images) where meta isn’t available.
  • For pages you never want indexed (admin, internal tools), put them behind auth — that’s stronger than any robots directive.
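For the X-Robots-Tag point, here is one possible shape for a Python WSGI app: a small middleware that adds the header to PDF responses. add_x_robots_tag and demo_app are hypothetical names, and in many setups you would set this header in the web server or CDN configuration instead. The sketch just shows the idea with the standard library only.

```python
def add_x_robots_tag(app, extensions=(".pdf",)):
    """Wrap a WSGI app so responses for matching paths carry X-Robots-Tag: noindex."""

    def middleware(environ, start_response):
        path = environ.get("PATH_INFO", "").lower()

        def patched_start_response(status, headers, exc_info=None):
            if path.endswith(extensions):
                # Drop any existing value so the header is not sent twice.
                headers = [h for h in headers if h[0].lower() != "x-robots-tag"]
                headers.append(("X-Robots-Tag", "noindex"))
            return start_response(status, headers, exc_info)

        return app(environ, patched_start_response)

    return middleware


def demo_app(environ, start_response):
    """Stand-in application that serves every path as a PDF."""
    start_response("200 OK", [("Content-Type", "application/pdf")])
    return [b"%PDF-1.4"]


wrapped_app = add_x_robots_tag(demo_app)
```

You can confirm the result the same way as in Case 5, with curl -sI against a PDF URL and a grep for x-robots.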

FAQ

Q: I added noindex weeks ago. How long until the URL disappears from search? A: Typical timeline: 1–4 weeks after Google’s first re-crawl. If you have not seen the URL leave search after 8 weeks, something is blocking the crawl — check robots.txt, JS-rendered meta, or CDN cache.

Q: Can I combine noindex and robots.txt Disallow for extra safety? A: No — that combination breaks noindex. The Disallow prevents the crawl, so Google never sees the meta tag. The URL stays indexed (often with the note “Indexed, though blocked by robots.txt”). Pick one: noindex (crawlable, removed from index) OR robots.txt Disallow (uncrawled, may stay in index from external signals).

Q: Does noindex lose backlink equity? A: A noindex, follow page passes link equity through outbound links but does not itself rank. noindex, nofollow blocks both. Most use cases want noindex, follow — the default when you write noindex.

Q: My URL shows “Excluded by noindex tag” in Search Console but still appears in site: results. Is it actually removed? A: Yes. “Excluded by noindex tag” is the success state. site: operator lag is normal and resolves within days. URL Inspection is the source of truth.

Q: Can the Removals tool permanently remove a URL? A: No — it’s a 6-month suppression. For permanent removal you still need noindex (or auth, or 410). The Removals tool buys time while the underlying mechanism propagates.
