Internal Link Rot: Articles Point to Renamed or Deleted Slugs

Half your internal links 404 because you renamed slugs without redirects. Add host-level 301s, run lychee in CI, and fail prebuild on any link that points at a slug that does not resolve.

Published: May 24, 2026 Updated: Jun 18, 2026 Author: AI Productivity Guide Team

A reader clicks “Related: GPT Tips” at the bottom of your article and lands on a 404. You renamed that slug six months ago to chatgpt-tips but never went back to update the 47 articles that linked to the old gpt-tips URL. Search Console quietly logs the 404s under Pages -> Not indexed -> Not found (404). Google starts trusting the linking pages less because they keep handing readers (and crawlers) into dead ends. Your internal link graph rots from the inside.

Fastest fix: add a host-level 301 from the old slug to the new one (in firebase.json, public/_redirects, or your CDN), then add a prebuild script that builds a slug index from frontmatter and fails the build if any in-body internal link points at a slug that does not exist. The redirect stops the bleeding today; the prebuild check stops it from happening again. Run lychee in CI for a second, fragment-aware layer.

Internal link rot is silent. The build does not fail, the page renders fine, and the broken link sits at the bottom of the article where most readers never scroll. But every dangling link is a leaked authority signal and a worse user experience.

Which bucket are you in?

Symptom	Likely cause	The fix
Link worked, now 404s, target file was renamed	Slug rename, no redirect	Host-level 301 (Step 2)
Link 404s, target file is gone	Article deleted	Restore, or 301 to closest match (Step 4)
Link 404s, target never existed	Typo in the path	Fix the link; prebuild check (Step 3)
Page loads but jumps to top, not the section	Anchor renamed	`lychee --include-fragments` (Step 1)
Off-site link 302s to a homepage or 404s	External rot	Drop `--offline`, run weekly (Prevention)

Common causes

1. You renamed a slug and never updated linkers

You changed gpt-tips.mdx to chatgpt-tips.mdx because it ranks better. Every article that linked to /en/articles/gpt-tips/ now 404s. No build error, no redirect, no warning.

How to spot it: grep your content for the old slug.

grep -rc "/articles/gpt-tips/" src/content/articles/ | grep -v ':0$'

2. You deleted an article without checking inbound links

You deprecated a thin article. Forty-three other articles still link to it. Those links now 404.

How to spot it: before deletion, run an inbound-link scan.

grep -rl "/articles/SLUG-TO-DELETE/" src/content/articles/

If anything prints, you must update or redirect before merging.

3. Typo in the link target

You wrote /en/articles/chatgpt-tipss/ (typo: double s). The page renders the link, the build does not validate it, the reader clicks and 404s.

How to spot it: only a real link checker or a prebuild validator catches this. Building a set of valid slugs from every frontmatter urlSlug gives a definitive allow-list to check links against (Step 3).
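Once a validator has flagged a link, it can also guess what you meant. The sketch below is one possible approach, not part of the validator in Step 3: it compares the broken slug against the allow-list using edit distance and proposes the nearest match. The names `editDistance`, `suggestSlug`, and `maxDistance` are hypothetical, and the slug list is sample data.

```javascript
// Sketch: suggest the closest valid slug for a link that failed validation.
// All names here are illustrative; nothing below comes from a library.

// Levenshtein edit distance, computed row by row with dynamic programming.
function editDistance(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Return the valid slug nearest to `broken`, or null if none is within maxDistance edits.
function suggestSlug(broken, validSlugs, maxDistance = 2) {
  let best = null;
  let bestDistance = maxDistance + 1;
  for (const slug of validSlugs) {
    const d = editDistance(broken, slug);
    if (d < bestDistance) {
      best = slug;
      bestDistance = d;
    }
  }
  return best;
}

// Sample allow-list (in practice, the set built from frontmatter urlSlug values).
const validSlugs = new Set(["chatgpt-tips", "prompt-basics"]);
console.log(suggestSlug("chatgpt-tipss", validSlugs)); // the typo is one deletion from "chatgpt-tips"
```

A small `maxDistance` keeps the suggestion conservative: a slug that is far from everything in the index is more likely a deleted article than a typo.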

4. External link rotted (third-party site moved the page)

Less critical for SEO than an internal 404, but still bad UX. You linked to https://example.com/great-article/ and they restructured. Now it 302s to a homepage or 404s.

How to spot it: run lychee without --offline so it actually hits the network. There is no --check-external flag in lychee; the network check is the default, and --offline is what disables it. Scope the scan to outbound hosts and run it weekly, not per-PR (external checks are too flaky to gate a merge).

5. Anchor-only links to renamed sections

You link to /en/articles/foo/#step-3. The article got rewritten and ## Step 3 became ## Step 3: Verify, so the generated anchor is now step-3-verify. The fragment 404s silently: the page loads but jumps to the top instead of the section.

How to spot it: lychee --include-fragments anchor-only checks #section fragments against the rendered headings. (--include-fragments accepts none, anchor-only, text-only, or full; a bare flag defaults to anchor-only, as of June 2026.)
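To see why a small heading edit breaks the fragment, it helps to look at how a heading becomes an anchor. The function below is a simplified approximation of GitHub-style slug rules (lowercase, drop punctuation, turn whitespace into hyphens). It is an assumption for illustration, not the exact algorithm your Markdown pipeline uses, and `headingToAnchor` is a hypothetical name.

```javascript
// Simplified approximation of GitHub-style heading anchors (illustrative only).
function headingToAnchor(heading) {
  return heading
    .trim()
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s_-]/gu, "") // drop punctuation such as ":"
    .replace(/\s/g, "-"); // each whitespace character becomes a hyphen
}

const before = headingToAnchor("Step 3");
const after = headingToAnchor("Step 3: Verify");
console.log(before, after);
console.log(before === after ? "anchor unchanged" : "anchor changed: inbound #fragments now miss");
```

Any link still pointing at the old anchor keeps loading the page, which is why only a fragment-aware checker notices the difference.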

Shortest path to fix

Step 1: Run lychee across the built site

# https://github.com/lycheeverse/lychee
brew install lychee
NODE_OPTIONS=--max-old-space-size=8192 npm run build
lychee --offline --include-fragments anchor-only \
  --root-dir "$(pwd)/dist" \
  --output link-report.txt \
  "dist/**/*.html"

--offline keeps it to local files (fast, deterministic, no rate limits). --root-dir is required when your HTML contains absolute links like /en/articles/..., otherwise lychee flags every absolute path as an error. --include-fragments anchor-only catches the renamed-anchor case from cause #5.

For a JS-native alternative, linkinator crawls the built directory:

npx linkinator dist --recurse --format json \
  --skip "^https://(facebook|twitter|x)\.com" > link-report.json

Sort the output by how many pages reference each dead URL, and fix the highest-fan-in targets first.
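The ranking itself is a few lines of code. This sketch assumes each broken-link record has a `url` (the dead target) and a `parent` (the page containing the link); check those field names against your actual report before relying on them. `rankByFanIn` and the sample records are hypothetical.

```javascript
// Sketch: rank dead URLs by how many distinct pages link to them (fan-in).
// Assumed record shape: { url: "<dead target>", parent: "<page containing the link>" }.
function rankByFanIn(brokenLinks) {
  const pagesByUrl = new Map();
  for (const { url, parent } of brokenLinks) {
    if (!pagesByUrl.has(url)) pagesByUrl.set(url, new Set());
    pagesByUrl.get(url).add(parent); // a Set counts each linking page once
  }
  return [...pagesByUrl]
    .map(([url, pages]) => ({ url, fanIn: pages.size }))
    .sort((a, b) => b.fanIn - a.fanIn || a.url.localeCompare(b.url));
}

// Sample data for illustration.
const sample = [
  { url: "/en/articles/gpt-tips/", parent: "a.html" },
  { url: "/en/articles/gpt-tips/", parent: "b.html" },
  { url: "/en/articles/gpt-tips/", parent: "b.html" }, // same page linking twice
  { url: "/en/articles/chatgpt-tipss/", parent: "c.html" },
];
console.log(rankByFanIn(sample));
```

Counting distinct pages rather than raw link occurrences keeps one article with five copies of the same bad link from outranking a target that is broken across dozens of articles.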

Step 2: Add a host-level 301 for renamed slugs

This is the step most people get wrong: a redirect in astro.config.mjs does not emit a real 301 in a static build. As of June 2026, when you run astro build in static (SSG) output, Astro writes an HTML file with a <meta http-equiv="refresh"> tag for each redirect. That is a client-side meta refresh, not an HTTP 301, so search engines treat it as a weaker signal than a true permanent redirect and link equity may not carry over as cleanly.
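If you are not sure which old URLs in your build still rely on this behavior, you can look for the tag in the built HTML. The sketch below pulls the target out of a meta refresh tag; `metaRefreshTarget` is a hypothetical helper, and the sample markup is an assumed shape for such a page rather than Astro's exact output.

```javascript
// Sketch: detect a meta refresh page and extract where it points.
// Returns the redirect target, or null if the HTML has no meta refresh tag.
function metaRefreshTarget(html) {
  const tag = html.match(/<meta[^>]*http-equiv=["']?refresh["']?[^>]*>/i);
  if (!tag) return null;
  const content = tag[0].match(/content=["'][^"']*url=([^"']+)["']/i);
  return content ? content[1].trim() : null;
}

// Assumed example of a generated redirect page.
const stub =
  '<!doctype html><title>Redirecting</title>' +
  '<meta http-equiv="refresh" content="0;url=/en/articles/chatgpt-tips/">';
const normalPage = "<!doctype html><title>ChatGPT Tips</title><h1>ChatGPT Tips</h1>";

console.log(metaRefreshTarget(stub));
console.log(metaRefreshTarget(normalPage));
```

Running this over every HTML file in dist gives you the list of redirects that still need a host-level rule.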

To get a real 301 that preserves authority, configure the redirect at the host. This site deploys on Firebase Hosting, so add it to firebase.json:

{
  "hosting": {
    "redirects": [
      { "source": "/en/articles/gpt-tips", "destination": "/en/articles/chatgpt-tips", "type": 301 },
      { "source": "/zh/articles/gpt-tips", "destination": "/zh/articles/chatgpt-tips", "type": 301 }
    ]
  }
}

Firebase applies the first matching rule, so order specific rules before wildcards. On Netlify or Cloudflare Pages, the equivalent lives in public/_redirects:

/en/articles/gpt-tips/  /en/articles/chatgpt-tips/  301
/zh/articles/gpt-tips/  /zh/articles/chatgpt-tips/  301

Once the 301 is live, the old links still work and Google forwards link equity, so you can update the linking articles at your own pace (or leave them). Deploy with firebase deploy --only hosting and confirm with curl (Step 6).
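With two languages and two possible hosts, hand-written rules drift apart quickly. One option, sketched here under the assumption that you keep a single old-slug to new-slug map, is to generate both formats from it. `renames`, `toFirebaseRedirects`, and `toRedirectsFile` are hypothetical names; the output shapes mirror the two examples above.

```javascript
// Sketch: generate host redirect rules from one rename map (illustrative names).
const renames = { "gpt-tips": "chatgpt-tips" };
const langs = ["en", "zh"];

// Entries shaped like the firebase.json "redirects" array shown above.
function toFirebaseRedirects(renames, langs) {
  return langs.flatMap((lang) =>
    Object.entries(renames).map(([from, to]) => ({
      source: `/${lang}/articles/${from}`,
      destination: `/${lang}/articles/${to}`,
      type: 301,
    }))
  );
}

// Lines shaped like the public/_redirects example shown above.
function toRedirectsFile(renames, langs) {
  return langs
    .flatMap((lang) =>
      Object.entries(renames).map(
        ([from, to]) => `/${lang}/articles/${from}/  /${lang}/articles/${to}/  301`
      )
    )
    .join("\n");
}

console.log(JSON.stringify(toFirebaseRedirects(renames, langs), null, 2));
console.log(toRedirectsFile(renames, langs));
```

Keeping the map in the repo also gives the rename PR a single file to touch, which makes the "redirect in the same PR" rule easier to review.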

Step 3: Fail prebuild on internal dangling links

A redirect catches renames; a prebuild check catches typos and deletions before they ship. Build a slug index from frontmatter, then validate every internal link in the body.

// scripts/check-internal-links.mjs
import fs from "node:fs";
import path from "node:path";
import matter from "gray-matter";

const root = "src/content/articles";
const langs = ["en", "zh"];
const validSlugs = new Set();

// 1. Walk every article and collect valid /lang/articles/<slug>/ URLs
function walk(dir) {
  return fs.readdirSync(dir, { withFileTypes: true }).flatMap((e) => {
    const p = path.join(dir, e.name);
    return e.isDirectory() ? walk(p) : p.endsWith(".mdx") ? [p] : [];
  });
}

for (const lang of langs) {
  for (const file of walk(path.join(root, lang))) {
    const { data } = matter(fs.readFileSync(file, "utf8"));
    if (data.urlSlug && data.lang) {
      validSlugs.add(`/${data.lang}/articles/${data.urlSlug}/`);
    }
  }
}

// 2. Check every in-body internal link against the index
let broken = 0;
for (const lang of langs) {
  for (const file of walk(path.join(root, lang))) {
    const txt = fs.readFileSync(file, "utf8");
    const links = [...txt.matchAll(/\]\((\/(?:en|zh)\/articles\/[^)#\s]+?\/?)(#[^)]*)?\)/g)];
    for (const m of links) {
      let url = m[1];
      if (!url.endsWith("/")) url += "/";
      if (!validSlugs.has(url)) {
        console.error(`BROKEN in ${path.basename(file)}: ${url}`);
        broken++;
      }
    }
  }
}
process.exit(broken > 0 ? 1 : 0);

Wire it into prebuild (or run it from npm run audit:content) so a PR cannot land a dangling internal link. If you intend to keep an old URL alive only through a host redirect, with no source file behind it, keep a small allow-list of those redirect-only slugs so the check does not flag links to them.
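Rather than maintaining that list by hand, you could derive it from the redirect config. This is a sketch under the assumption that your redirects live in a parsed firebase.json object with plain (non-glob) `source` paths; `redirectOnlyUrls` and `resolves` are hypothetical helpers, not part of the script above.

```javascript
// Sketch: treat host-redirected URLs as resolvable in the link check.

// Collect literal redirect sources from a parsed firebase.json,
// normalized to a trailing slash so they compare equal to the slug index.
function redirectOnlyUrls(firebaseConfig) {
  const rules = firebaseConfig?.hosting?.redirects ?? [];
  return new Set(
    rules
      .filter((r) => typeof r.source === "string" && !/[*:{]/.test(r.source)) // skip globs and captures
      .map((r) => (r.source.endsWith("/") ? r.source : r.source + "/"))
  );
}

// A link resolves if it is a real article or an old URL covered by a redirect.
function resolves(url, validSlugs, redirectOnly) {
  return validSlugs.has(url) || redirectOnly.has(url);
}

// Sample data for illustration.
const firebaseConfig = {
  hosting: {
    redirects: [
      { source: "/en/articles/gpt-tips", destination: "/en/articles/chatgpt-tips", type: 301 },
    ],
  },
};
const validSlugs = new Set(["/en/articles/chatgpt-tips/"]);
const redirectOnly = redirectOnlyUrls(firebaseConfig);

console.log(resolves("/en/articles/gpt-tips/", validSlugs, redirectOnly)); // true: covered by the redirect
console.log(resolves("/en/articles/chatgpt-tipss/", validSlugs, redirectOnly)); // false: a typo, still broken
```

You may prefer to log redirect-covered links as warnings rather than pass them silently, so old URLs eventually get cleaned out of the content.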

Step 4: Fix the existing rot in batches

For each broken target:

Renamed: add the host-level 301 (Step 2). No content edit needed.
Deleted but still valuable: restore from git (git log --diff-filter=D -- <path> to find the deletion, then git checkout <commit>^ -- <path>).
Deleted on purpose: 301 it to the closest surviving article, or bulk-update linkers to point elsewhere or strip the link entirely with grep-and-sed.

Do not leave a 404 sitting. Either redirect it or rewrite the link.
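For the bulk-update case, a scripted rewrite is less error-prone than a hand-rolled sed expression, because it can refuse to touch slugs that merely start with the old one. The following is a minimal sketch; `repointLinks` and `escapeRegExp` are hypothetical helpers, and it only handles Markdown links of the form ](/en|zh/articles/<slug>...).

```javascript
// Sketch: repoint Markdown links from an old slug to a new one.

// Escape regex metacharacters so a slug is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function repointLinks(markdown, oldSlug, newSlug) {
  // The lookahead requires "/", "#", or ")" right after the slug,
  // so "gpt-tips" never matches "gpt-tips-2".
  const pattern = new RegExp(
    `(\\]\\(/(?:en|zh)/articles/)${escapeRegExp(oldSlug)}(?=[/#)])`,
    "g"
  );
  return markdown.replace(pattern, (_match, prefix) => prefix + newSlug);
}

// Sample text for illustration.
const text =
  "See [tips](/en/articles/gpt-tips/), [a section](/zh/articles/gpt-tips/#step-3), " +
  "and [part two](/en/articles/gpt-tips-2/).";
console.log(repointLinks(text, "gpt-tips", "chatgpt-tips"));
```

Run it on a branch and review the diff before merging; fragments are carried over unchanged, so re-check them with the fragment-aware lychee pass afterwards.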

Step 5: Resubmit the sitemap and request a recrawl

After fixing, resubmit your sitemap in Search Console (Indexing -> Sitemaps) and use URL Inspection -> Request Indexing on the worst-affected linking pages so Google re-crawls and sees clean neighborhoods. Recovery from the 404 reports is gradual; do not expect same-day clearing.

Step 6: Confirm it is actually fixed

Verify the redirect returns a true 301, not a 200 with a meta refresh:

curl -sI https://yoursite.com/en/articles/gpt-tips/ | grep -iE "^(HTTP|location)"
# Expect: HTTP/2 301  and  location: /en/articles/chatgpt-tips/

Then re-run the prebuild check and lychee. A clean exit code from scripts/check-internal-links.mjs (exit 0) plus zero internal errors from lychee means the rot is cleared. Spot-check one previously broken “Related” link in a browser and confirm it lands on real content.
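If you have more than a handful of redirects, the curl check can be scripted. This sketch assumes Node 18 or later (for the built-in fetch) and uses `redirect: "manual"` so the 301 itself is inspected rather than followed. `classifyRedirect` and `checkRedirect` are hypothetical names, and yoursite.com is the same placeholder used above.

```javascript
// Sketch: confirm a URL answers with a real 301 to the expected path.

// Pure classifier so the logic can be tested without a network call.
function classifyRedirect(status, location, expectedPath) {
  if (status === 200) return "served-200"; // a page was served, possibly a meta refresh stub
  if (status !== 301) return `unexpected-${status}`;
  if (!location) return "missing-location";
  // Location may be absolute or relative; compare only the path.
  const path = new URL(location, "https://placeholder.invalid").pathname;
  return path === expectedPath ? "ok" : "wrong-target";
}

// Network wrapper: HEAD request that does not follow the redirect.
async function checkRedirect(url, expectedPath) {
  const res = await fetch(url, { method: "HEAD", redirect: "manual" });
  return classifyRedirect(res.status, res.headers.get("location"), expectedPath);
}

// Usage (not run here because it needs the live site):
// checkRedirect("https://yoursite.com/en/articles/gpt-tips/", "/en/articles/chatgpt-tips/")
//   .then(console.log);

console.log(classifyRedirect(301, "/en/articles/chatgpt-tips/", "/en/articles/chatgpt-tips/"));
console.log(classifyRedirect(200, null, "/en/articles/chatgpt-tips/"));
```

The "served-200" result is the one to watch for: it is what you get when the old URL is still answered by a generated HTML page instead of the host rule.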

Prevention

lychee runs in CI on every PR with --offline; the job fails on any dangling internal link or anchor.
The prebuild internal-link validator (Step 3) checks every in-body link against the frontmatter slug index.
Renaming a slug requires adding a host redirect in the same PR; enforce it with a lint rule or a checklist item in the PR template.
Deletion checklist: scan inbound links first (cause #2), then redirect or rewrite before merging.
External-link check runs weekly (drop --offline), not per-PR, with a tolerance for flaky timeouts.
Quarterly review of the redirect map: collapse chains so A -> B -> C becomes A -> C (chained redirects bleed link equity and slow the crawler).
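The chain-collapsing part of that quarterly review can be automated. A minimal sketch, assuming the redirect map is a plain object of old path to new path; `collapseChains` is a hypothetical helper and the paths are sample data.

```javascript
// Sketch: collapse redirect chains so every source points at its final destination.
function collapseChains(redirects) {
  const collapsed = {};
  for (const start of Object.keys(redirects)) {
    const seen = new Set([start]);
    let target = redirects[start];
    // Follow the chain until the target is no longer itself a redirect source.
    while (Object.hasOwn(redirects, target)) {
      if (seen.has(target)) throw new Error(`Redirect loop involving ${target}`);
      seen.add(target);
      target = redirects[target];
    }
    collapsed[start] = target;
  }
  return collapsed;
}

// Sample map: the slug was renamed twice.
const redirectMap = {
  "/en/articles/gpt-tips/": "/en/articles/chatgpt-tips/",
  "/en/articles/chatgpt-tips/": "/en/articles/chatgpt-prompt-tips/",
};
console.log(collapseChains(redirectMap));
```

The loop check matters as much as the collapsing: a rename that is later reverted can leave A -> B and B -> A in the map, and it is better for a script to throw than for a crawler to discover it.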

FAQ

Why is a host-level 301 better than an `astro.config.mjs` redirect?

In a static build, the astro.config.mjs redirects map is emitted as an HTML page with a <meta http-equiv="refresh"> tag, not a real HTTP 301 (as of June 2026). Meta refresh works for humans but is a weaker signal for search engines than a true permanent redirect. A firebase.json / _redirects entry returns an actual 301 status, which forwards link equity cleanly.

Will broken internal links actually hurt my SEO, or just UX?

Both. A single 404 will not tank your site, but a pattern of internal links into dead pages tells Google your linking pages are poorly maintained, wastes crawl budget, and stops authority from flowing through your link graph. Internal links are how PageRank distributes across your own pages, so dangling ones are wasted signal.

Should I 301 a deleted article or return a 404?

If a similar, still-useful article exists, 301 to it so you keep the inbound equity and the reader gets a real page. If nothing relevant remains, a clean 404 (or 410 Gone) is correct, but first strip or repoint the internal links so you are not handing readers into the dead end on purpose.

Why does lychee flag all my absolute links as errors?

Because it cannot resolve /en/articles/... without knowing the site root. Pass --root-dir "$(pwd)/dist" so lychee maps absolute paths to files on disk. Without it, every absolute internal link is reported as broken.

How do I catch broken anchor links, not just broken pages?

Run lychee --include-fragments anchor-only. It checks #section fragments against the rendered headings in the target page, catching the case where a heading was renamed and its generated anchor slug changed.

Tags: #Content ops #Site quality #Site audit #Troubleshooting #broken-link