Orphan Pages: No Internal Links, So No Indexing

A URL sits in your sitemap but no page links to it. Google treats it as unimportant and parks it at “Discovered - currently not indexed.” Fix: add 2+ contextual internal links from indexed pages.

Published: May 19, 2026 Updated: Jun 21, 2026 Author: AI Productivity Guide Team

A URL appears in sitemap.xml but no page on your site links to it: that is an “orphan.” Google knows it exists from the sitemap, but an orphan collects zero “importance votes” from internal links, so Google parks it at the bottom of the crawl queue. Orphans typically sit at Discovered - currently not indexed for weeks or months.

Fastest fix: add at least 2 contextual internal links to the orphan from pages that are already indexed and relevant, then request indexing once in Search Console. The single mental model behind everything below: the sitemap is a discovery signal; internal links are the importance signal. One does not substitute for the other.

Symptoms

URL appears in sitemap.xml but stays at Discovered - currently not indexed
An internal link audit shows 0 inbound internal links
The page may or may not have external backlinks — indexing lags either way
site:yourdomain.com/the-url returns nothing in Google

Which “not indexed” bucket are you in?

These two Search Console statuses look similar but need opposite fixes. Check URL Inspection first:

Status	What it means	Root cause	Fix direction
`Discovered - currently not indexed`	Google found the URL but has not crawled it yet	Crawl priority too low — usually weak/zero internal links	Add internal links, lower crawl depth (this article)
`Crawled - currently not indexed`	Google crawled it but chose not to index	Thin / duplicate / low-quality content	Improve the page content itself

Orphan pages almost always land in the first row. If you see Crawled - currently not indexed, internal links alone will not fix it — the page content is the problem.

Common causes

1. Page was created but never linked from any list, category, or related-articles widget

The new article was never added to the homepage “latest” section, does not appear on any category listing, and is in no other article’s “Related” module.

How to confirm (run from your repo root):

rg -l 'href="/your-orphan-url/?"' src/ | wc -l
# 0 = orphan

2. Reachable only by typing the URL

Can you reach this page in 3 clicks or fewer from the homepage? If not, treat it as an orphan. As of June 2026, pages within three clicks of the homepage get priority crawling and faster indexation; anything deeper competes for leftover crawl budget.
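Click depth can be measured rather than guessed once you have the site's internal link graph. The sketch below assumes a hypothetical `edges.tsv` file (one `source<TAB>target` pair of internal paths per line, exported from whichever crawler you use) and runs a breadth-first search from the homepage. The file name and the `click_depth` function are illustrative, not part of any tool.

```shell
# click_depth EDGES START: print "depth<TAB>url" for every URL reachable from START.
# EDGES is a hypothetical tab-separated file: source<TAB>target, one link per line.
click_depth() {
  awk -F '\t' -v start="$2" '
    { out[$1] = out[$1] "\t" $2 }            # adjacency list per source page
    END {
      depth[start] = 0; queue[1] = start; head = 1; tail = 1
      while (head <= tail) {                   # breadth-first search
        page = queue[head++]
        n = split(out[page], next_pages, "\t")
        for (i = 2; i <= n; i++) {             # element 1 is empty (leading tab)
          t = next_pages[i]
          if (!(t in depth)) { depth[t] = depth[page] + 1; queue[++tail] = t }
        }
      }
      for (u in depth) print depth[u] "\t" u
    }
  ' "$1" | sort -n
}
```

Running `click_depth edges.tsv / | awk -F '\t' '$1 > 3'` would list pages deeper than three clicks. Any sitemap URL that is missing from the output entirely is not reachable by links at all.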

3. Linked only from old, low-authority URLs

Internal links exist, but the source pages are themselves zero-traffic, low-authority orphans, so the link equity they pass is close to nothing.

How to confirm: Screaming Frog or Ahrefs shows the source pages’ crawl depth and link authority.

4. Linked only via `nofollow` or JS-rendered widgets

Links exist but are all rel="nofollow", or they only appear after a client-side React useEffect runs. Google may not count nofollow links for crawl discovery, and JS-injected links are crawled less reliably than links present in the raw HTML. Put real <a href> tags in the server-rendered markup.
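One way to check this is to look at the HTML the server sends before any JavaScript runs. The sketch below defines a hypothetical `link_status` helper that reads raw HTML on standard input and reports how a target path is linked. It assumes double-quoted `href` values that match the target exactly, so treat it as a quick probe rather than a full HTML parser.

```shell
# link_status TARGET: read raw (server-rendered) HTML on stdin and report how
# TARGET is linked: "followed", "nofollow-only", or "absent".
link_status() (
  # Join lines so multi-line tags stay intact, put each <a ...> tag on its own
  # line, then keep only the tags whose href is exactly TARGET.
  tags=$(tr '\n' ' ' | grep -oE '<a [^>]*>' | grep -F "href=\"$1\"" || true)
  if [ -z "$tags" ]; then
    echo "absent"
  elif printf '%s\n' "$tags" | grep -qivE 'rel="[^"]*nofollow'; then
    echo "followed"        # at least one matching link has no nofollow
  else
    echo "nofollow-only"
  fi
)
```

For example, `curl -s https://yourdomain.com/some-source-page/ | link_status /your-orphan-url/` inspects one source page. Because `curl` does not execute JavaScript, an "absent" result for a link you can see in the browser suggests the link is injected client-side.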

5. URL path changed but old internal links were not updated

The page was originally /blog/old-slug, later moved to /articles/new-slug. The 301 redirect is in place, but every internal link still points at the old URL, so the new URL never receives a direct internal link and stays an orphan. A link to a redirected URL is a weaker, indirect signal than a link straight to the final URL.
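A minimal sketch of finding and repointing those stale links, assuming your templates and content live under one directory and use double-quoted `href` values. The `stale_links` and `rewrite_links` names are hypothetical helpers, and the match is exact, so `/blog/old-slug` and `/blog/old-slug/` need separate passes.

```shell
# stale_links OLD DIR: list files under DIR that still link to exactly OLD.
stale_links() {
  grep -rlF "href=\"$1\"" "$2" || true
}

# rewrite_links OLD NEW DIR: point those links straight at the final URL.
# Assumes OLD and NEW contain no "|", "&", or regex metacharacters.
rewrite_links() (
  files=$(stale_links "$1" "$3")   # collect first, so sed's .bak files are never scanned
  if [ -n "$files" ]; then
    printf '%s\n' "$files" | while IFS= read -r file; do
      sed -i.bak "s|href=\"$1\"|href=\"$2\"|g" "$file" && rm -f "$file.bak"
    done
  fi
)
```

Something like `rewrite_links /blog/old-slug /articles/new-slug src/` would update the sources in place. Review the resulting diff in version control before committing.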

6. Pagination, tag, or archive structure orphans deep articles

Articles on /blog/page/15 are reachable only by paging through the archive. Within a few weeks newer posts push them past the point a crawler bothers to reach, and they become functional orphans.

Shortest path to fix

Step 1: Find every orphan

The reliable method: crawl the site to capture all internal links, then compare against the sitemap. The set difference (in sitemap, not reachable by crawl) is your orphan list.

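# Spider-crawl the site with wget (follows links, downloads nothing).
# Note: wget's default recursion depth is 5; deeper pages are not crawled.
wget --spider --recursive --no-verbose --no-directories \
  --output-file=crawl.log https://yourdomain.com/

# Extract crawled URLs (matches the URL wherever it sits on the log line)
grep -oE 'https?://[^ ]+' crawl.log | sort -u > crawled.txt

# Extract sitemap URLs
curl -s https://yourdomain.com/sitemap.xml \
  | grep -oE '<loc>[^<]+</loc>' | sed -E 's#</?loc>##g' | sort -u > sitemap.txt

# Set difference = sitemap-only URLs = orphans.
# crawled.txt is listed twice so every crawled URL appears 2+ times and uniq -u drops it.
sort sitemap.txt crawled.txt crawled.txt | uniq -u > orphans.txt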

More thorough tools: Screaming Frog (free up to 500 URLs; its Sitemaps -> Orphan URLs report compares crawl to sitemap directly) or Sitebulb for clearer reporting. In Search Console itself, the Pages (Indexing) report listing under Discovered - currently not indexed is a strong orphan shortlist.

Step 2: Decide whether each orphan should exist

Open orphans.txt and triage each URL:

Should exist (a real, valuable page): add internal links (Step 3)
Should not exist (test page, duplicate, expired): remove it from the sitemap and either return 410 Gone or add noindex. Do not leave dead URLs in the sitemap — they waste crawl budget.

Step 3: Add 2+ contextual internal links from relevant indexed pages

For each orphan worth keeping:

# Find 3-5 of the most topically relevant existing articles
rg -l "related keyword" src/ | head -5

In each of those, add a contextual in-body link (inside a sentence, not a generic footer) whose anchor text contains the target query. Aim for at least 2 distinct source pages, ideally 3-5, and prefer sources that are themselves indexed and get traffic. Contextual links from relevant, indexed pages pass far more value than footer or sidebar links.
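To check the "at least 2 distinct source pages" target across a whole list, something like the following could work. It is a sketch: `link_counts` is a hypothetical helper, it expects root-relative paths (strip the origin from `orphans.txt` first), and it treats each path as a regular expression, so paths with special characters may over-match.

```shell
# link_counts PATH_LIST DIR: for each path in PATH_LIST print "count<TAB>path",
# where count is the number of files under DIR that link to it
# (with or without a trailing slash).
link_counts() {
  while IFS= read -r target; do
    [ -n "$target" ] || continue
    n=$({ grep -rlE "href=\"${target%/}/?\"" "$2" || true; } | wc -l | tr -d ' ')
    printf '%s\t%s\n' "$n" "$target"
  done < "$1"
}
```

For example, `sed 's|^https://yourdomain.com||' orphans.txt > orphan-paths.txt` followed by `link_counts orphan-paths.txt src/ | awk -F '\t' '$1 < 2'` would list the pages that still fall short of two source files.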

Step 4: Make orphans an automatic, structural concern

One-off link fixes regress. Close the gap at the source. For an Astro content site, auto-render a related-reading block at the end of every article:

---
import { getCollection } from 'astro:content';

// Tags and slug of the article currently being rendered
const { tags = [], slug } = Astro.props;

const allPosts = await getCollection('posts');
// Up to 5 other posts that share at least one tag with this article
const related = allPosts
  .filter(p => p.slug !== slug)
  .filter(p => p.data.tags?.some(t => tags.includes(t)))
  .slice(0, 5);
---
<aside>
  <h2>Related reading</h2>
  <ul>
    {related.map(p => <li><a href={`/articles/${p.slug}/`}>{p.data.title}</a></li>)}
  </ul>
</aside>

Also make sure a hub page (homepage, /articles/ index, or per-category index) links to every article, not just the latest 5. A complete, crawlable index page is the single most effective orphan-prevention measure.
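Hub coverage is easy to verify against the built output. The sketch below assumes you have the hub page's rendered HTML on disk and a list of root-relative article paths. Both file names in the usage line are hypothetical, as is the `missing_from_hub` helper.

```shell
# missing_from_hub HUB_HTML PATH_LIST: print every path in PATH_LIST that the
# hub page's HTML does not link to (exact, double-quoted href match).
missing_from_hub() {
  while IFS= read -r target; do
    [ -n "$target" ] || continue
    grep -qF "href=\"$target\"" "$1" || printf '%s\n' "$target"
  done < "$2"
}
```

Running `missing_from_hub dist/articles/index.html article-paths.txt` in CI and failing the build on any output is one way to keep new articles from shipping unlinked.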

Step 5: Resubmit the sitemap and trigger re-discovery

The old https://www.google.com/ping?sitemap=... endpoint was deprecated in June 2023 and now returns 404 — do not use it. Resubmit through Search Console instead:

Search Console -> Sitemaps -> remove the sitemap, then re-add and submit it.
Confirm an accurate <lastmod> date in the sitemap for the updated URLs. Since the ping endpoint went away, Google leans on lastmod to decide what to recrawl, so it must reflect the real last-modified time (not “now” on every build).
Pick 1-2 fixed orphans and use URL Inspection -> Request indexing to nudge them.
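A quick sanity check for the lastmod point above: if every URL in the sitemap carries the same date, the generator is probably stamping build time rather than real modification time. The hypothetical `lastmod_summary` helper below is a rough heuristic on a locally saved sitemap, not a validator.

```shell
# lastmod_summary SITEMAP: print "<urls> <with_lastmod> <distinct_dates>".
# Dates are compared at day precision (first 10 characters, YYYY-MM-DD).
lastmod_summary() (
  urls=$(grep -oE '<loc>[^<]+</loc>' "$1" | wc -l | tr -d ' ')
  dates=$(grep -oE '<lastmod>[^<]+</lastmod>' "$1" \
    | sed -E 's#</?lastmod>##g' | cut -c1-10 || true)
  with=$(printf '%s\n' "$dates" | grep -c . || true)
  distinct=$(printf '%s\n' "$dates" | sort -u | grep -c . || true)
  echo "$urls $with $distinct"
)
```

After `curl -s https://yourdomain.com/sitemap.xml > sitemap.xml`, a result such as `240 240 1` would suggest build-time stamping, while a second number lower than the first means some URLs have no lastmod at all. On a very small or newly launched site, identical dates can be legitimate.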

If you also want Bing, Yandex, and other IndexNow-participating engines to pick the changes up immediately, ping IndexNow. Note: Google does not support IndexNow as of June 2026 and still relies on sitemaps, internal links, and Search Console — so IndexNow helps Bing, not Google indexing.
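For the IndexNow ping, a batch submission is a JSON POST. The sketch below only builds the request body from a list of URLs. `indexnow_payload` is a hypothetical helper, `YOUR_KEY` is a placeholder for the key you generate and host at `https://yourdomain.com/YOUR_KEY.txt`, and no JSON escaping is attempted.

```shell
# indexnow_payload HOST KEY URL_LIST: print the JSON body for an IndexNow batch.
# Assumes URLs contain no double quotes or backslashes (no JSON escaping is done).
indexnow_payload() (
  urls=$(awk 'NF { printf "%s\"%s\"", sep, $0; sep = "," }' "$3")
  printf '{"host":"%s","key":"%s","urlList":[%s]}\n' "$1" "$2" "$urls"
)

# Usage (not run here; sends the batch to the shared IndexNow endpoint):
# indexnow_payload yourdomain.com YOUR_KEY fixed-urls.txt |
#   curl -s -X POST https://api.indexnow.org/indexnow \
#     -H 'Content-Type: application/json; charset=utf-8' --data-binary @-
```

Keeping payload construction separate from the network call makes it easy to inspect exactly what would be submitted before sending anything.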

Step 6: Wait 2-4 weeks and watch the right signals

~2 weeks: Search Console Crawl Stats (Settings -> Crawl stats) starts showing fresh crawl hits for these URLs.
~4 weeks: URL Inspection status flips from Discovered - currently not indexed to URL is on Google.

If a URL is still Discovered after 4 weeks, the internal link signal is still too weak — add links from more authoritative, higher-traffic pages and reduce its click depth from the homepage.

How to confirm it is fixed

Re-run the orphan diff from Step 1: the URL no longer appears in orphans.txt.
URL Inspection shows the page with at least one referring page under “Discovery,” and status URL is on Google.
site:yourdomain.com/the-url finally returns the page in Google.

Heads-up: a few orphans are normal

On large sites, a few orphans always appear during a redesign or category restructure. That is expected — catch them in the next monthly audit rather than treating every one as an emergency.

Easy to misdiagnose

Re-stuffing orphans into the sitemap: Google already knows they exist; the sitemap was never the missing piece.
Treating “Request indexing” as a cure: it is rate-limited (roughly 10-12 URLs per day per property as of June 2026, after which the button greys out for 24 hours) and does not fix the underlying link signal.
Assuming one internal link is enough: a single link rarely lifts an orphan out of the queue; use 2+ from distinct, indexed pages.
Assuming tag pages solve it: they only help if the tag pages are themselves indexed, well-linked, and not thin.
Confusing the two statuses: Crawled - currently not indexed is a content-quality problem, not a linking problem (see the table above).

Prevention

Before publishing, add internal links from at least 2 existing related articles and add the page to the homepage or a category index.
Keep the “Related articles” widget reaching deep into the archive, not just the latest 5.
Run a lightweight orphan audit monthly (Screaming Frog or the wget diff above) — it is a 10-minute job.
When you change URL structure, sync-update every internal link; do not rely on 301s alone.
Keep lastmod honest in the sitemap so recrawls are scheduled on real changes.

FAQ

Q: How many internal links does an orphan actually need? A: At least 2 contextual links from distinct, already-indexed pages, ideally 3-5. The quality of the source pages (indexed, relevant, getting traffic) matters more than the raw count.

Q: Are tag or category pages a fix for orphan articles? A: Only if the tag/category pages are themselves indexable, well-linked, and not thin. A thin, noindexed tag page passes almost nothing.

Q: Does adding <priority> in the sitemap help? A: No. Google ignores <priority> and <changefreq>. Only <loc> and <lastmod> carry weight, and lastmod only if it is accurate.

Q: Will deleting orphans with a 404 hurt my site’s authority? A: Removing genuinely thin pages is net-positive — it is one less low-value URL dragging on crawl budget. Use 410 Gone rather than 404 when you intend a page to stay gone; it is a clearer “do not come back” signal.

Q: I requested indexing — why is it still not indexed? A: “Request indexing” only re-queues a crawl; it does not override the importance signal. If internal links are still weak, Google will recrawl and re-park it. Fix the links first, then request indexing.

Tags: #SEO #Google #Search Console #Indexing #Troubleshooting #Orphan page
