Your internal-link audit shows 80 pages with zero inbound links. They exist — in the sitemap, on disk, technically reachable — but no other page on your site references them. Users only get there by typing the URL. Google sees these as low-importance: the site itself doesn’t bother to link to them, so why should anyone else?
Orphans aren’t just “indexing debt.” They also point to weak topical structure: a site that doesn’t link to its own content gives search engines little reason to treat that content as important. The fix is decided per orphan: either reintegrate it (verify it deserves inbound links, then add them) or remove it (410, noindex, or a 301 to a successor). Below: how to find orphans, decide per page, and prevent new ones from being created.
Common causes
Ordered by hit rate, highest first.
1. “Just publish more” mentality without a cluster strategy
You scheduled 5 articles a month for two years. No editorial step asked “what existing articles should link to this one?” Each article shipped orphaned and the backlog grew.
How to spot it: Orphan articles cluster around batch publish dates. If many orphans share a publish week, the publishing pipeline created them.
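One way to check for that clustering is to bucket orphans by the week they were published. The sketch below is only illustrative: it assumes each orphan record carries a `publishedAt` field in `YYYY-MM-DD` form, which is a hypothetical shape and not something the audit tooling above guarantees.

```javascript
// Assumed input shape (hypothetical): [{ slug, publishedAt: "YYYY-MM-DD" }]

// Return the Monday (UTC) of the week containing the given date.
function weekStart(isoDate) {
  const d = new Date(isoDate.slice(0, 10) + "T00:00:00Z");
  const daysSinceMonday = (d.getUTCDay() + 6) % 7; // Monday = 0, Sunday = 6
  d.setUTCDate(d.getUTCDate() - daysSinceMonday);
  return d.toISOString().slice(0, 10);
}

// Count orphans per publish week, busiest weeks first.
function orphansByPublishWeek(orphans) {
  const counts = new Map();
  for (const o of orphans) {
    const key = weekStart(o.publishedAt);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}
```

If a handful of weeks account for most of the list, the publishing pipeline is the likely cause rather than any individual article.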
2. Old re-prints or guest posts that never linked in
Articles from an old syndication deal or guest author imports. They were “added” but never integrated — no one wrote internal links from your existing content to them.
How to spot it: Orphan articles have a different author or come from a date range that doesn’t match your current publishing. Imports without integration.
3. Pages from a previous owner or direction
The site changed focus (from “general tech” to “AI productivity”). Old articles on unrelated topics still exist but aren’t relevant enough for current content to link to. They orphaned themselves through a topical pivot.
How to spot it: Orphan articles are on topics your site no longer covers. They’re “remnant” content.
4. Articles published in a category that has since shrunk
You launched a category with 5 articles, expected to grow it, never did. The category page is itself thin and gets little linking — articles inside are orphaned by their dying parent.
How to spot it: Orphan articles cluster in a specific category. The category page itself has low inbound. Withered branch.
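A rough way to see the withered-branch pattern is to compute the orphan rate per category. This sketch assumes every article record has `slug` and `category` fields and that the orphan slugs are available as a `Set`; both are assumptions made for illustration.

```javascript
// Assumed inputs (hypothetical shapes):
//   articles:    [{ slug, category }]
//   orphanSlugs: Set of slugs reported as orphans

function orphanRateByCategory(articles, orphanSlugs) {
  const stats = new Map(); // category -> { total, orphaned }
  for (const a of articles) {
    const s = stats.get(a.category) ?? { total: 0, orphaned: 0 };
    s.total += 1;
    if (orphanSlugs.has(a.slug)) s.orphaned += 1;
    stats.set(a.category, s);
  }
  // Highest orphan rate first
  return [...stats.entries()]
    .map(([category, s]) => ({ category, ...s, rate: s.orphaned / s.total }))
    .sort((a, b) => b.rate - a.rate);
}
```

A small category with a rate near 1 is a candidate for merging into a healthier category or for removal as a whole.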
5. URLs got renamed without updating internal links
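You changed getting-started-with-x to x-quickstart. The new URL exists, but nothing links to it because all internal links still point at the old one. The 301 redirect still gets users and crawlers to the new page, but the page has no direct internal links, so a link-graph audit reports it as an orphan and every internal click goes through an extra redirect hop.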
How to spot it: 301-redirected URLs receive substantial inbound links, but the final destination URL has 0. Internal links need to point at the canonical destination directly.
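The check above can be sketched as a pass over your internal links with a redirect map in hand. The `redirects` map and the `{ from, to }` link records are hypothetical inputs; how you export them depends on your crawler or hosting config.

```javascript
// Assumed inputs (hypothetical shapes):
//   redirects: Map of old path -> new path
//   links:     [{ from, to }] internal links found in article bodies

// Follow a redirect chain to its final destination, guarding against loops.
function resolveFinal(path, redirects, maxHops = 10) {
  let current = path;
  const seen = new Set();
  while (redirects.has(current) && !seen.has(current) && seen.size < maxHops) {
    seen.add(current);
    current = redirects.get(current);
  }
  return current;
}

// List links that still point at a redirected URL, with the target they should use.
function staleLinks(links, redirects) {
  return links
    .filter(l => redirects.has(l.to))
    .map(l => ({ ...l, shouldBe: resolveFinal(l.to, redirects) }));
}
```

Each entry in the result is a link to rewrite in the source article so that it points at `shouldBe` directly.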
6. Tagged pages, archive pages, or “category-only” pages
Pages that only show up via tag/archive navigation, never linked from body text. Tag pages can be orphans of body-text linking even though they’re auto-generated.
How to spot it: Tag/archive pages with 0 body-text references from any article. They exist via auto-nav but nothing intentionally points to them.
Shortest path to fix
Ordered by ROI. Step 1 detects; Step 2 decides per page.
Step 1: Crawl and identify true orphans
Use Screaming Frog / Sitebulb, or roll your own:
// scripts/find-orphans.mjs
import fs from "node:fs";
import path from "node:path";
// Content directory (path elided in this article; set it to your own layout)
const ARTICLES_DIR = "src/content/articles/en/...";
// 1. Collect all articles (assumes one .md/.mdx file per article, slug = file name)
const articles = fs.readdirSync(ARTICLES_DIR)
  .filter(f => /\.mdx?$/.test(f))
  .map(f => ({ slug: f.replace(/\.mdx?$/, ""), file: path.join(ARTICLES_DIR, f) }));
// 2. Collect all internal links from article bodies
const linkedTo = new Set();
for (const { slug, file } of articles) {
  const content = fs.readFileSync(file, "utf8");
  // Match /en/articles/SLUG/ patterns
  const matches = content.matchAll(/\/en\/articles\/([^/\s)]+)\//g);
  // Skip self-links so an article cannot count as its own inbound link
  for (const m of matches) if (m[1] !== slug) linkedTo.add(m[1]);
}
// 3. Articles with no inbound = orphan
const orphans = articles.filter(a => !linkedTo.has(a.slug));
console.log("Orphans:", orphans.length);
console.log(orphans.map(o => o.slug).join("\n"));
Output: a clean list of orphans. Filter out intentional landing pages (paid ad targets, etc.) — they’re meant to be orphans.
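Filtering out the intentional orphans can be as simple as an allowlist. The pattern list below is made up for illustration (exact slugs, or prefixes ending in `*`); substitute whatever naming your landing pages actually use.

```javascript
// Hypothetical allowlist: exact slugs, or prefixes ending in "*"
const INTENTIONAL = ["lp-*", "thank-you"];

function isIntentional(slug, patterns = INTENTIONAL) {
  return patterns.some(p =>
    p.endsWith("*") ? slug.startsWith(p.slice(0, -1)) : slug === p
  );
}

// Keep only the orphans that need a decision.
function actionableOrphans(orphans, patterns = INTENTIONAL) {
  return orphans.filter(o => !isIntentional(o.slug, patterns));
}
```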
Step 2: For each orphan, decide: integrate or remove
| Orphan type | Action |
|---|---|
| Recent (last 6 mo), on-topic, decent quality | Integrate: add 2-5 internal links |
| On-topic but old / thin | Upgrade content first, then integrate |
| Off-topic / remnant from past direction | 410 or noindex |
| Pivot remnant | If topic is fully dropped, 410 |
| Tag/archive page with no body refs | Acceptable if it's nav-only; otherwise noindex |
Don’t try to integrate everything — some orphans should just be removed.
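The table can be encoded as a small triage function so the decision is applied consistently across a long orphan list. The field names (`kind`, `navOnly`, `onTopic`, `topicDropped`, `thin`, `ageMonths`) are assumptions for this sketch; they would have to be filled in by hand or from your CMS metadata.

```javascript
// Hypothetical orphan record:
//   { kind: "article" | "tag" | "archive", navOnly, onTopic, topicDropped, thin, ageMonths }

function decideAction(orphan) {
  // Tag/archive pages: fine if they exist purely for navigation
  if (orphan.kind === "tag" || orphan.kind === "archive") {
    return orphan.navOnly ? "keep" : "noindex";
  }
  // Off-topic remnants: 410 when the topic is fully dropped
  if (!orphan.onTopic) return orphan.topicDropped ? "410" : "410-or-noindex";
  // On-topic but old or thin: improve before linking to it
  if (orphan.thin || orphan.ageMonths > 6) return "upgrade-then-integrate";
  // Recent, on-topic, decent quality
  return "integrate";
}
```

The output is a starting label per page, not a verdict; borderline pages still deserve a human look.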
Step 3: For integration, add 2-5 internal links per orphan
For each orphan you keep:
1. Find 5-10 existing articles on related topics
2. In 2-5 of those, add a body-text link to the orphan
3. Use descriptive anchor text (orphan's title or main topic)
4. Verify the link is in body content, not just a generated widget
Per orphan, that means 2-5 inbound links, each from a different source article. After this round, every page you kept has at least one body-text inbound link.
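Finding the related articles in the first step can be roughed out by ranking existing articles on shared tags. This is a sketch under the assumption that every article record has a `tags` array; tag overlap is only a proxy for topical fit, so treat the result as a shortlist to review.

```javascript
// Assumed input shape (hypothetical): { slug, tags: string[] }

function candidateSources(orphan, articles, limit = 10) {
  const orphanTags = new Set(orphan.tags);
  return articles
    .filter(a => a.slug !== orphan.slug)
    .map(a => ({
      slug: a.slug,
      shared: a.tags.filter(t => orphanTags.has(t)).length,
    }))
    .filter(c => c.shared > 0)
    .sort((a, b) => b.shared - a.shared) // most overlap first
    .slice(0, limit);
}
```

From the shortlist, pick the 2-5 articles where a link to the orphan reads naturally in the body text.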
Step 4: For removal, choose 410 or noindex deliberately
- 410 Gone: if the page should not exist (legacy, off-topic, duplicate)
  - Google tends to drop a 410 somewhat sooner than a 404, though the difference is usually small
  - In your hosting: configure a 410 response for those URLs
  - Or in Astro: remove the file and add a 410 rule to your host's config (firebase.json or similar, if the host supports custom status codes)
- noindex: if you want the URL accessible internally but invisible to search
  - Add `<meta name="robots" content="noindex">` to the page
  - Keep the file; just don't let Google index it
- 301 redirect: if there's a clear successor article
  - Redirect to the closest topical replacement
  - Don't redirect everything to the homepage (Google may treat that as a soft 404)
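If removals are handled in your own server or edge function rather than in host config, the three options reduce to a lookup table. The sketch below is host-agnostic and hypothetical: the `REMOVALS` entries are example paths, and `responseFor` is a helper invented here that you would call from whatever request handler your platform provides. The `X-Robots-Tag: noindex` header is the HTTP-header equivalent of the robots meta tag.

```javascript
// Hypothetical removal table: path -> decision
const REMOVALS = new Map([
  ["/en/articles/old-guest-post/", { action: "gone" }],
  ["/en/articles/internal-notes/", { action: "noindex" }],
  ["/en/articles/getting-started-with-x/", { action: "redirect", to: "/en/articles/x-quickstart/" }],
]);

// Returns the status and headers to send, or null to serve the page normally.
function responseFor(path, removals = REMOVALS) {
  const rule = removals.get(path);
  if (!rule) return null;
  switch (rule.action) {
    case "gone":
      return { status: 410, headers: {} };
    case "noindex":
      // Page is still served; the header asks search engines not to index it
      return { status: 200, headers: { "X-Robots-Tag": "noindex" } };
    case "redirect":
      return { status: 301, headers: { Location: rule.to } };
    default:
      return null;
  }
}
```

Keeping the decisions in one table also gives you a record of why each URL disappeared.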
Step 5: Update the “related articles” widget to surface orphans
Your widget probably surfaces high-inbound articles (network effect). Adjust the algorithm to boost low-inbound orphans:
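function relatedScore(current, candidate) {
  const relevance = sharedTags(current, candidate) * 3
    + sameCategory(current, candidate) * 5;
  // Boost low-inbound pages (orphans included), but only if already relevant
  const boost = relevance > 0 && candidate.inboundCount < 3 ? 4 : 0;
  return relevance + boost;
}
Low-inbound pages (fewer than 3 inbound links, orphans included) get a score boost when they are already relevant, so they win a slot in the “related” widget over equally relevant, better-linked pages. Distribution rebalances over time.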
Step 6: Re-crawl monthly to catch new orphans early
Set a calendar reminder or CI job:
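# Monthly: re-run orphan detection, output to dashboard
# (assumes an "audit:orphans" script in package.json that runs the detection script)
mkdir -p reports
pnpm run audit:orphans > reports/orphans-$(date +%Y-%m).txt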
If the count is creeping up, your publishing workflow isn’t integrating new articles. Add an integration step.
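To tell whether the count is creeping up, compare this month's report with the previous one. The sketch assumes the report format produced by the detection script above: an optional `Orphans: N` line followed by one slug per line. The function name and return shape are invented for this example.

```javascript
// Compare two orphan reports given as plain text.
function diffOrphanReports(previousText, currentText) {
  const parse = text =>
    new Set(
      text
        .split("\n")
        .map(line => line.trim())
        .filter(line => line && !line.startsWith("Orphans:")) // drop blanks and the count line
    );
  const prev = parse(previousText);
  const curr = parse(currentText);
  return {
    added: [...curr].filter(slug => !prev.has(slug)),    // new orphans this month
    resolved: [...prev].filter(slug => !curr.has(slug)), // integrated or removed
    growing: curr.size > prev.size,
  };
}
```

In a CI job, a non-empty `added` list is a reasonable trigger for a notification, and `growing` for a failed check.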
Prevention
- Editorial pipeline: every new article requires “find 3 existing articles to link to it” before publish
- Monthly orphan audit; any orphan older than 60 days is either integrated or removed
- Related-articles widget boosts low-inbound pages; helps distribute over time
- For topical pivots, deliberately deindex the off-topic remnants instead of leaving them as orphans
- When renaming URLs, search and update all internal references; don’t rely on 301s to mask
- A site with 280 well-linked articles can outperform one with 500 articles where 80 are orphaned
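The editorial rule in the first bullet can be enforced with a small pre-publish check. This sketch assumes article bodies are available as an object keyed by slug and that internal links use the `/en/articles/SLUG/` pattern from the detection script; the function name and the default threshold of 3 are illustrative.

```javascript
// Assumed input (hypothetical): bodiesBySlug = { [slug]: "article body text" }

function prePublishCheck(newSlug, bodiesBySlug, minInbound = 3) {
  const needle = `/en/articles/${newSlug}/`;
  // Existing articles (other than the new one) that already link to it
  const sources = Object.entries(bodiesBySlug)
    .filter(([slug, body]) => slug !== newSlug && body.includes(needle))
    .map(([slug]) => slug);
  return {
    ok: sources.length >= minInbound,
    sources,
    missing: Math.max(0, minInbound - sources.length),
  };
}
```

Wired into the publish step, a failing check tells the editor how many more existing articles need a link before the new one ships.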