Search Console "Not Indexed": Decode It by Status

The 9 most common "not indexed" statuses in Search Console and the exact fix for each, with current 2026 labels and timelines.

Published: May 17, 2026 Updated: Jun 21, 2026 Author: AI Productivity Guide Team

Fastest fix: “Not indexed” in Search Console’s Pages report (Indexing → Pages) is an umbrella term. The “Why pages aren’t indexed” table underneath it lists a dozen specific statuses, and each one has a completely different fix. Open that table, find which status holds the most URLs, and jump to the matching section below. The two that usually matter most are Crawled - currently not indexed (a content-quality problem) and Discovered - currently not indexed (a crawl-budget / internal-linking problem). The rest are often informational and need no action.

This is the cheat sheet for the 9 most common statuses, each with “how to identify + how to fix + how long it takes.”

Heads up (as of June 2026): Google renamed two labels. The old “Excluded by ‘noindex’ tag” now reads URL marked ‘noindex’, and “Blocked by robots.txt” now reads URL blocked by robots.txt. The fixes are identical; only the wording changed.

Which bucket are you in?

| Status (current label) | Root cause | Need to act? | Time to clear |
|------------------------|------------|--------------|---------------|
| Crawled - currently not indexed | Quality / thin / duplicate | Yes (highest ROI) | 8-12 weeks |
| Discovered - currently not indexed | Crawl budget / weak internal links | Yes | 4-8 weeks |
| Duplicate without user-selected canonical | No canonical declared | Yes | 2-4 weeks |
| Duplicate, Google chose different canonical than user | Weak canonical signal | Maybe | 4-8 weeks |
| Alternate page with proper canonical tag | hreflang / param variant | No (informational) | n/a |
| URL marked ‘noindex’ | `noindex` directive present | Only if accidental | 1-2 weeks |
| URL blocked by robots.txt | Disallow rule blocks crawl | Only if accidental | 1-2 weeks |
| Page with redirect | URL 301/302s elsewhere | Usually no | n/a |
| Soft 404 | Thin/empty page returns `200` | Yes | 2-4 weeks |

Common causes (9 statuses)

1. Discovered - currently not indexed

Meaning: Google knows the URL exists (from your sitemap or a link) but hasn’t sent a crawler yet. Almost always a crawl-budget or internal-linking signal, not a penalty.

Fix:

Add 3-5 internal links to the target URL from pages that are already indexed and get traffic, using descriptive anchor text.
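Free up crawl budget by cutting junk URLs (parameter variants such as ?tag=, internal search results): either block them in robots.txt or canonicalize them, not both, because Google never reads the canonical tag on a page it can’t crawl.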
Lift overall site authority (backlinks + genuine traffic).
Confirm the URL is in a submitted sitemap and returns 200 OK.
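To make the junk-URL cleanup concrete, here is a minimal Python sketch that lists sitemap entries that look like crawl-budget waste. The parameter names in JUNK_PARAMS and the /search prefix are illustrative assumptions, not a list from Google; swap in the patterns your own site produces.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlsplit

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical "junk" patterns for illustration; adjust to your site.
JUNK_PARAMS = {"tag", "q", "s", "sort"}
JUNK_PATH_PREFIXES = ("/search",)


def sitemap_urls(xml_text: str) -> list[str]:
    """Return every <loc> value from a standard urlset sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def is_junk(url: str) -> bool:
    """True if the URL matches one of the assumed junk patterns."""
    parts = urlsplit(url)
    if parts.path.startswith(JUNK_PATH_PREFIXES):
        return True
    params = parse_qs(parts.query, keep_blank_values=True)
    return any(name in JUNK_PARAMS for name in params)


def junk_in_sitemap(xml_text: str) -> list[str]:
    """Sitemap entries that probably should not be submitted at all."""
    return [url for url in sitemap_urls(xml_text) if is_junk(url)]
```

Anything this returns is a candidate for removal from the sitemap first, then for a robots.txt block or a canonical.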

Time to effect: 4-8 weeks

Deep dive: Discovered - currently not indexed

2. Crawled - currently not indexed

Meaning: Google fetched the page and decided it wasn’t worth indexing. This is a quality verdict. Note: a large share of URLs in this bucket were previously indexed and later dropped, so treat it as “earn it back,” not “wait it out.”

Fix:

Thicken the content with information a competing top-10 result doesn’t have: original data, screenshots, a worked example, a comparison.
Rewrite the first paragraph to answer the query immediately and densely; cut the warm-up.
Delete or 301-merge near-duplicate sibling pages so one strong URL replaces several weak ones.
Verify the page renders fast and fully: slow LCP or render timeouts can leave Google with a partially rendered page, which reads as low quality. Check it in PageSpeed Insights and the Core Web Vitals report.
Build a few relevant backlinks.
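One way to find the near-duplicate siblings worth merging is to compare page bodies pairwise. The sketch below uses word-shingle Jaccard similarity, which is a common general-purpose technique and not a description of how Google decides; the 0.8 threshold and the three-word shingle size are assumptions to tune on your own content.

```python
import re
from itertools import combinations


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Overlapping word n-grams from lowercased text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: set, b: set) -> float:
    """Shared shingles divided by all shingles; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def near_duplicates(pages: dict[str, str], threshold: float = 0.8):
    """Return (url_1, url_2, score) for every pair at or above the threshold.

    `pages` maps a URL to its extracted body text.
    """
    sets = {url: shingles(body) for url, body in pages.items()}
    pairs = []
    for url_1, url_2 in combinations(sets, 2):
        score = jaccard(sets[url_1], sets[url_2])
        if score >= threshold:
            pairs.append((url_1, url_2, round(score, 2)))
    return pairs
```

Each pair it reports is a candidate for a 301 merge into the stronger of the two URLs.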

Time to effect: 8-12 weeks

Deep dive: Crawled - currently not indexed

3. Duplicate without user-selected canonical

Meaning: Google found this page near-identical to others, but you never declared a canonical, so Google picked one for you.

Fix:

Add an explicit <link rel="canonical" href="..." /> in the <head> of every page.
Default to self-canonical (each page points to itself) unless it’s genuinely a duplicate.
If two URLs really are the same page, 301 one to the other.
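As a rough pre-launch check, the sketch below parses a page's HTML with Python's standard html.parser and reports whether the canonical is missing, self-referencing, conflicting, or pointing elsewhere. The status strings are made up for this example, and the comparison is an exact string match after resolving relative hrefs, so trailing-slash or case differences count as "points-elsewhere".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if "canonical" in rels and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def canonical_status(page_url: str, html: str) -> str:
    """Classify the canonical declaration found in `html` for `page_url`."""
    parser = CanonicalParser()
    parser.feed(html)
    if not parser.canonicals:
        return "missing"
    targets = {urljoin(page_url, href) for href in parser.canonicals}
    if len(targets) > 1:
        return "conflicting"
    return "self" if targets == {page_url} else "points-elsewhere"
```

Run it over one URL per template: "missing" and "conflicting" are the results that feed this status.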

Time to effect: 2-4 weeks

4. Duplicate, Google chose different canonical than user

Meaning: You set a canonical, but Google disagreed and indexed a different URL.

Fix:

Make your preferred URL the strongest signal: more internal links to it, more backlinks, longer/richer content, and make sure the sitemap lists that URL.
Align every signal to one master: sitemap entry, internal-link targets, and the canonical tag must all point to the same URL.
Or surrender: if Google’s pick is fine, 301 your version to it.
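The "align every signal" rule can be checked mechanically once you have the data exported. This is a minimal sketch under the assumption that you already hold the canonical tag value, the sitemap URL list, and a list of internal-link targets (for example from a site crawler); the function name and message wording are hypothetical.

```python
from collections import Counter
from typing import Optional


def misaligned_signals(
    preferred: str,
    variants: set[str],
    canonical_tag: Optional[str],
    sitemap_urls: list[str],
    internal_link_targets: list[str],
) -> list[str]:
    """List every signal that does not point at the preferred URL.

    `variants` are the duplicate URLs that should consolidate into `preferred`.
    URLs are compared as exact strings.
    """
    problems = []
    if canonical_tag != preferred:
        problems.append(
            f"canonical tag points to {canonical_tag!r}, expected {preferred!r}"
        )
    if preferred not in sitemap_urls:
        problems.append("preferred URL is missing from the sitemap")
    for url in sorted(variants & set(sitemap_urls)):
        problems.append(f"sitemap lists duplicate variant {url}")
    stray_links = Counter(t for t in internal_link_targets if t in variants)
    for url, count in sorted(stray_links.items()):
        problems.append(f"{count} internal link(s) point to variant {url}")
    return problems
```

An empty list means the three signals agree; anything else is a concrete to-do before waiting out the 4-8 weeks.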

Time to effect: 4-8 weeks

Deep dive: Duplicate, Google chose different canonical

5. Alternate page with proper canonical tag (informational)

Meaning: This is a pagination page, a parameter variant, or an hreflang alternate, and Google is respecting your canonical by not indexing the duplicate. This is the expected, healthy state.

Fix: Usually no action needed. Confirm the canonical points where you intend:

curl -sL https://yourdomain.com/that-url | grep -oE '<link rel="canonical" href="[^"]+"'

If it points to the master version you actually want indexed, mark it “OK” and move on.

Deep dive: Alternate page with proper canonical tag

6. URL marked ‘noindex’ (formerly “Excluded by ‘noindex’ tag”)

Meaning: The page’s <head> contains <meta name="robots" content="noindex"> (or an X-Robots-Tag: noindex response header). Google is obeying your instruction.

Check whether it’s intentional:

# Check the HTML for a robots meta tag carrying noindex
curl -sL https://yourdomain.com/page | grep -iE '<meta[^>]*noindex'
# Also check the response headers (often missed); -L follows redirects to the final URL
curl -sIL https://yourdomain.com/page | grep -i x-robots-tag
# A hit is only a lead: confirm the matched value actually contains noindex
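A plain text search can misfire on a page that merely mentions noindex in its copy, or on an X-Robots-Tag header that carries a different directive. For a stricter check, here is a small Python sketch that takes HTML and response headers you have already fetched and reports where a noindex actually comes from. It treats "none" as equivalent to noindex, and the returned labels are just names chosen for this example.

```python
import re
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects (name, content) for robots and googlebot meta tags."""

    def __init__(self):
        super().__init__()
        self.directives: list[tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = (attributes.get("content") or "").lower()
            self.directives.append((name, content))


def noindex_sources(html: str, headers: dict[str, str]) -> list[str]:
    """Return the places (meta tag and/or header) that declare noindex."""
    blocking = {"noindex", "none"}
    found = []
    parser = RobotsMetaParser()
    parser.feed(html)
    for name, content in parser.directives:
        tokens = {token.strip() for token in content.split(",")}
        if tokens & blocking:
            found.append(f"meta {name}")
    for key, value in headers.items():
        if key.lower() != "x-robots-tag":
            continue
        # Split on ":" too, so "googlebot: noindex" is handled.
        tokens = {token.strip() for token in re.split(r"[,:]", value.lower())}
        if tokens & blocking:
            found.append("X-Robots-Tag header")
    return found
```

An empty result on a URL that Search Console still flags usually means the report is stale or the directive is injected by JavaScript, which this static check does not see.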

Fix:

Intentional (admin / preview / draft / thank-you pages): keep the noindex, but make sure those URLs are not in your sitemap — otherwise this status flags them forever.
Accidental: remove the meta tag or header, redeploy, then use URL Inspection → Request indexing on a sample URL and Validate Fix on the issue for the rest.

Time to effect: 1-2 weeks after the noindex is removed

7. URL blocked by robots.txt (formerly “Blocked by robots.txt”)

Meaning: A Disallow rule in robots.txt stops the crawler from fetching the page, so Google can’t read its content. (A blocked page can still appear in results as a bare URL if other sites link to it, because robots.txt controls crawling, not indexing. Use noindex to keep something out of the index.)

Diagnose:

curl -s https://yourdomain.com/robots.txt
# Find the Disallow rule matching the blocked path
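If the file is long, Python's standard urllib.robotparser can answer "is this URL blocked?" for you. The sketch below is an approximation: the standard-library parser does simple prefix matching and does not implement Google's wildcard (* and $) or longest-match rules, and the helper matching_disallow_lines ignores user-agent groups entirely, so treat its output as a pointer rather than a verdict.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """True if the standard-library parser says `user_agent` may not fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


def matching_disallow_lines(robots_txt: str, path: str) -> list[tuple[int, str]]:
    """Return (line number, line) for Disallow rules whose prefix matches `path`.

    Simplification: ignores user-agent groups and wildcard syntax.
    """
    hits = []
    for number, line in enumerate(robots_txt.splitlines(), start=1):
        rule = line.split("#", 1)[0].strip()  # drop trailing comments
        field, _, value = rule.partition(":")
        prefix = value.strip()
        if field.strip().lower() == "disallow" and prefix and path.startswith(prefix):
            hits.append((number, line.strip()))
    return hits
```

Feed it the output of the curl command above plus the path Search Console reports as blocked.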

Fix:

Block is correct: keep it, but remove those URLs from the sitemap.
Block is a mistake: delete the offending Disallow, redeploy, then in Search Console open Settings → robots.txt (the report that replaced the old standalone tester) and confirm Google has fetched the updated file. Run URL Inspection → Test Live URL on the page to confirm it is now allowed, then Request indexing.

Time to effect: 1-2 weeks after the block is lifted

8. Page with redirect

Meaning: The URL 301/302 redirects to another URL, so the redirecting URL itself isn’t indexed (the destination gets indexed instead). Usually expected.

Fix: Normally no action. Only act if the redirect is wrong (e.g., a page that should be live is redirecting), or if a redirected URL is still sitting in your sitemap — remove it from the sitemap so it stops being reported.

9. Soft 404

Meaning: The page returns 200 OK but looks empty, broken, or “not found” to Google (e.g., an out-of-stock product, an empty search-results page, or a JS page that failed to render).

Fix:

If the page is genuinely gone: return a real 404 or 410 status code instead of 200.
If it should exist: add real, substantive content so it no longer looks empty, and verify it renders without relying on a script Google can’t run.
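To triage a batch of URLs before Google does, a crude heuristic can flag likely soft 404s from a status code and HTML you have already fetched. This is a sketch under stated assumptions, not Google's classifier: the 50-word minimum and the phrase list are arbitrary starting points, and the regex-based tag stripping is only good enough for rough counting.

```python
import re

# Assumed phrases that suggest an error page; extend for your own templates.
NOT_FOUND_PHRASES = ("not found", "no results", "out of stock", "no longer available")


def visible_text(html: str) -> str:
    """Very rough text extraction: drop scripts/styles, then all tags."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", text).strip()


def headline_text(html: str) -> str:
    """Lowercased text of the <title> and <h1> elements."""
    parts = re.findall(r"(?is)<(?:title|h1)\b[^>]*>(.*?)</(?:title|h1)>", html)
    return " ".join(re.sub(r"<[^>]+>", " ", part) for part in parts).lower()


def looks_like_soft_404(status_code: int, html: str, min_words: int = 50) -> bool:
    """True when a 200 response is thin or headlines an error message."""
    if status_code != 200:
        return False  # a real 404/410 is not a *soft* 404
    if len(visible_text(html).split()) < min_words:
        return True
    headline = headline_text(html)
    return any(phrase in headline for phrase in NOT_FOUND_PHRASES)
```

Pages it flags need one of the two fixes above: a real 404/410, or real content.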

Time to effect: 2-4 weeks

Shortest path to fix

Step 1: Group by status, prioritize by impact

Open Search Console → Indexing → Pages → “Why pages aren’t indexed.” Note each status’s URL count:

Crawled - currently not indexed: 320         <- quality issue, highest ROI
Discovered - currently not indexed: 180      <- crawl budget / authority
Alternate page with proper canonical: 95     <- informational, skip
Duplicate, Google chose different canonical: 12  <- weak canonical signal
URL marked 'noindex': 4                       <- check if intentional
URL blocked by robots.txt: 2                   <- same

Sort by (URL count) times (how much you care about those URLs). Hit the most valuable bucket first; ignore the informational ones.
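The count-times-value sort is simple enough to do in a spreadsheet, but here it is as a short Python sketch. The counts are the example numbers from the list above; the weights are invented for illustration (0 for informational statuses, up to 1 for URLs you care about most) and should be replaced with your own judgment.

```python
def prioritize(
    counts: dict[str, int], value_weights: dict[str, float]
) -> list[tuple[str, float]]:
    """Rank statuses by URL count multiplied by how much those URLs matter.

    Statuses without a weight score 0 and sink to the bottom.
    """
    scored = [
        (status, count * value_weights.get(status, 0.0))
        for status, count in counts.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


counts = {
    "Crawled - currently not indexed": 320,
    "Discovered - currently not indexed": 180,
    "Alternate page with proper canonical": 95,
    "Duplicate, Google chose different canonical": 12,
    "URL marked 'noindex'": 4,
    "URL blocked by robots.txt": 2,
}

# Hypothetical weights, not a recommendation.
weights = {
    "Crawled - currently not indexed": 1.0,
    "Discovered - currently not indexed": 0.8,
    "Alternate page with proper canonical": 0.0,
    "Duplicate, Google chose different canonical": 0.5,
    "URL marked 'noindex'": 1.0,
    "URL blocked by robots.txt": 1.0,
}

ranking = prioritize(counts, weights)
```

With these weights the informational bucket drops to last place even though it holds 95 URLs, which is the point of weighting by value instead of sorting by raw count.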

Step 2: One status at a time so you can attribute results

Don’t change canonical, content, and robots.txt in the same week — you won’t know what worked. Suggested order:

Week 1: clear accidental noindex and robots.txt blocks (fast, high certainty).
Weeks 2-3: handle Duplicate statuses (canonical alignment + 301s).
Weeks 4-8: thicken content (resolves Crawled - currently not indexed).
After week 4: pursue backlinks + internal-link fixes (resolves Discovered - currently not indexed).

After each batch, wait 2-4 weeks before judging.

Step 3: Use “Validate Fix” — not 50 manual requests

Once you’ve fixed every instance of a status, open that status’s detail page and click Validate Fix. Google rechecks a sample immediately; if that passes, it recrawls the rest, and the issue count falls as those URLs pass. Validation typically takes up to about two weeks (sometimes longer).

This is the scalable mechanism. URL Inspection → Request indexing is the manual one, and it’s rate-limited to roughly 10-12 URLs per day per property (the button greys out for 24 hours once you hit it). Reserve manual requests for a handful of high-value pages; use Validate Fix for everything else.

Speed tip: submit a sitemap containing only your priority URLs, then filter the Pages report by that sitemap before clicking Validate Fix — validation against a smaller set completes faster.
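A priority-only sitemap is easy to generate from a plain list of URLs. The sketch below builds one with Python's standard ElementTree; the function name is hypothetical, and it emits only the required loc element (no lastmod), which is an assumption you may want to change.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls: list[str]) -> str:
    """Return sitemap XML for `urls`, de-duplicated in their original order."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in dict.fromkeys(urls):  # dict keys keep first-seen order
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # "&" is escaped on output
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

Write the result to a separate file (for example a sitemap-priority.xml of your own naming), submit it in Search Console, and use it as the filter in the Pages report.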

Do not reach for the Indexing API here. It’s only sanctioned for JobPosting and BroadcastEvent pages; using it for ordinary content won’t help and isn’t supported.

Step 4: Track the trend with one table

| Week | Discovered | Crawled | Duplicate | noindex | Total not indexed |
|------|------------|---------|-----------|---------|-------------------|
| W1   | 180        | 320     | 12        | 4       | 516               |
| W3   | 175        | 318     | 8         | 0       | 501               |
| W5   | 165        | 290     | 5         | 0       | 460               |
| W8   | 140        | 240     | 2         | 0       | 382               |

A downward total = you’re fixing the right things. Flat = re-diagnose. (Note: as of June 2026, the Pages report still shows a historical-data gap before December 15, 2025 — a leftover from a reporting-latency incident, not a crawling or ranking problem. Compare recent weeks only.)
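If you keep the weekly numbers in a script instead of a spreadsheet, the totals and the direction of travel take a few lines of Python. The data below is the example table above; the verdict labels ("downward", "flat or rising") are this sketch's own wording.

```python
WEEKS = {
    "W1": {"Discovered": 180, "Crawled": 320, "Duplicate": 12, "noindex": 4},
    "W3": {"Discovered": 175, "Crawled": 318, "Duplicate": 8, "noindex": 0},
    "W5": {"Discovered": 165, "Crawled": 290, "Duplicate": 5, "noindex": 0},
    "W8": {"Discovered": 140, "Crawled": 240, "Duplicate": 2, "noindex": 0},
}


def totals(weeks: dict[str, dict[str, int]]) -> dict[str, int]:
    """Total not-indexed URLs per week."""
    return {week: sum(by_status.values()) for week, by_status in weeks.items()}


def deltas(weeks: dict[str, dict[str, int]]) -> list[tuple[str, int]]:
    """Change in the total versus the previous recorded week."""
    series = list(totals(weeks).items())
    return [(week, now - before) for (_, before), (week, now) in zip(series, series[1:])]


def verdict(weeks: dict[str, dict[str, int]]) -> str:
    """'downward' if every step reduced the total, else 'flat or rising'."""
    changes = [change for _, change in deltas(weeks)]
    return "downward" if changes and all(c < 0 for c in changes) else "flat or rising"
```

For the example data the totals are 516, 501, 460, and 382, so every step is a decrease.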

How to confirm it’s fixed

For any individual URL, paste it into URL Inspection at the top of Search Console and read the verdict: “URL is on Google” means indexed. If it still says not indexed, click Test Live URL to confirm the page is currently crawlable with no noindex/robots block and the canonical you expect — the live test reflects the page right now, while the index status can lag 2-3 days and is sampled, so don’t trust a stale report over a live test.

Prevention

Don’t ship thin or duplicate content (avoids Crawled - not indexed).
Pre-launch, crawl every template and check three things on each: canonical, noindex, and robots.txt.
Comment every Disallow in robots.txt with why, so a future deploy doesn’t break it by accident.
Keep intentionally noindexed pages OUT of the sitemap, or “URL marked ‘noindex’” flags them permanently.
Scan the “not indexed” statuses monthly — problems caught early are cheap to fix.
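The "comment every Disallow" habit is easy to enforce in a pre-deploy check. As a minimal sketch, the function below flags Disallow lines that have neither an inline comment nor a comment on the line directly above; that definition of "commented" is an assumption of this example, not a robots.txt convention.

```python
def uncommented_disallows(robots_txt: str) -> list[tuple[int, str]]:
    """Return (line number, line) for Disallow rules with no explanation."""
    missing = []
    previous = ""
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.strip()
        if line.lower().startswith("disallow:"):
            has_inline_comment = "#" in line
            has_comment_above = previous.startswith("#")
            if not (has_inline_comment or has_comment_above):
                missing.append((number, line))
        previous = line
    return missing
```

Failing the build when this list is non-empty keeps undocumented rules from accumulating.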

FAQ

Is “Crawled - currently not indexed” a penalty? No. It’s a quality judgment, not a manual action. Google crawled the page and decided it didn’t add enough over what’s already indexed. The fix is to make the page genuinely more useful, not to file a reconsideration request.

How long after I click “Validate Fix” should I expect indexing? Usually up to about two weeks, sometimes longer for large sites. You’ll get email updates as the state moves through “Started” toward “Passed.” Don’t re-click it repeatedly — that restarts the clock.

Why is “Alternate page with proper canonical tag” so high? Is that bad? No, it’s healthy. Those are duplicate variants (hreflang alternates, paginated or parameter URLs) that correctly point to a canonical Google is indexing. A high count here is normal and needs no action.

What’s the difference between noindex and robots.txt? robots.txt blocks crawling; noindex blocks indexing. To keep a page out of Google, use noindex and let Google crawl it — if you block it in robots.txt, Google can’t see the noindex and the URL can still show up. See noindex vs robots.txt.

My index data before December 15, 2025 disappeared. Did I get deindexed? No. As of June 2026 the Pages report still has a historical gap before that date from a late-2025 reporting-latency incident. Crawling, indexing, and ranking were unaffected; only the dashboard’s history is short. Compare recent weeks.

Should I just use the Indexing API to force pages in? No. The Indexing API is only for JobPosting and BroadcastEvent pages. For normal content, use internal links, a clean sitemap, Validate Fix, and a few manual Request indexing calls.

Tags: #SEO #Google #Search Console #Indexing
