Use AI to Check Your Hreflang Setup (Bilingual / Multi-Locale)

Use AI to audit hreflang implementation — slug mismatches, missing pairs, wrong codes.

Bilingual content sites silently leak ranking signals when hreflang is wrong: translationKey typos, one-way links, missing self-references, and the classic “zh” vs “zh-CN” bug. This tutorial gives you a 30-minute AI-assisted audit that catches the most common errors on a 1000-page site without depending on Search Console reporting. The result is a clean TSV of bad pairs you can fix in source, plus a working spot-check on rendered HTML.

What this covers

A scriptable audit pipeline: export per-article metadata, ask an LLM to find structural defects, fix in source, re-render, and resubmit. AI is doing the pattern-matching across thousands of rows that you would otherwise eyeball. We focus on hreflang as it appears in the head of HTML pages and in your sitemap — not on HTTP header hreflang (rare in content sites).

Who this is for

Owners and maintainers of bilingual or multi-locale content sites — anyone running EN/ZH, EN/JA, EN+ES+FR, etc. on Astro, Next.js, Hugo, or any static-ish stack. If you have a translationKey or i18n-key field linking translations, this workflow uses it directly. If you don’t, step 1 still works because the slug pair itself is the key.

When to reach for it

After every content batch (10+ new articles or translations), before any major SEO push (link campaign, sitemap resubmit), and immediately after renaming any locale code or slug. Hreflang errors are cheap to introduce and expensive to find with manual eyeballing — automate it.

When this is NOT the right tool

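Skip AI auditing if you have fewer than 50 article pairs. At that size, reading the rendered tags by hand finds the issues faster than building a pipeline. (Search Console’s International Targeting report, which once flagged these for free, was retired in 2022.) Also skip if your site is single-language with no plans to expand; you do not need hreflang at all.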

Before you start

  • Confirm what your hreflang strategy is. Most bilingual sites use <link rel="alternate" hreflang="en" href="..."/> plus an x-default. Write that down so you can verify against the rendered output.
  • Pick the locale codes you intend to support. Decide between zh, zh-CN, zh-Hans, or zh-Hant and use one consistently. Mixed codes are a common reason hreflang clusters get ignored.
  • Choose a model with a long context window. A 1000-row CSV easily exceeds 100k tokens once you include the rendered tag column.

Step by step

  1. Generate a CSV with one row per article. Columns: slug, lang, translationKey, canonical_url, and the full rendered <link rel="alternate"> block for that page. A small Node script reading your content collection emits this in under a minute.
  2. Prompt the model with a precise audit checklist:
     Audit this hreflang setup. For each row, flag:
     1. Missing pair: article exists in one lang but not the other (group by translationKey)
     2. translationKey mismatch: same slug but different keys, or same key but slugs do not align
     3. Wrong locale code: anything not in [en, zh-CN] (or your allowed set)
     4. Missing self-reference: row does not include hreflang for its own lang
     5. Missing x-default
     Return a TSV with columns: issue, slug, lang, suggested_fix
  3. Fix in source. Two real fixes: ship the missing translation, or correct the translationKey. Do not “fix” by deleting a hreflang link. That hides the issue from the audit but not from Google.
  4. Re-render the site and spot-check 5 representative pages in HTML view. Pick: a popular EN article, its ZH pair, an EN-only article (should still emit x-default and self-reference), the homepage, and one tag/index page.
  5. Submit the updated sitemap to Search Console. If you use a sitemap index, submit the index so its child sitemaps are re-read. The International Targeting report was retired in 2022, so re-run this audit and watch indexing and per-country performance for 7-14 days; flagged errors should trend toward zero.
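
The CSV export in the first step could look roughly like the following. This is a minimal Python sketch, not the Node script mentioned above, and it uses only the standard library. The names `extract_alternates` and `build_audit_csv`, the shape of the `pages` input, and the compact `code=url` column format are assumptions made for illustration; adapt them to however your content collection exposes rendered HTML.

```python
import csv
import io
from html.parser import HTMLParser


class AlternateLinkParser(HTMLParser):
    """Collects (hreflang, href) pairs from <link rel="alternate" hreflang=...> tags."""

    def __init__(self):
        super().__init__()
        self.alternates = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rel = (a.get("rel") or "").lower().split()
        if "alternate" in rel and a.get("hreflang") and a.get("href"):
            self.alternates.append((a["hreflang"], a["href"]))


def extract_alternates(html):
    """Return hreflang alternates in document order."""
    parser = AlternateLinkParser()
    parser.feed(html)
    return parser.alternates


def build_audit_csv(pages):
    """pages: iterable of dicts with slug, lang, translationKey, canonical_url, html."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["slug", "lang", "translationKey", "canonical_url", "alternates"])
    for page in pages:
        alternates = extract_alternates(page["html"])
        # Assumption: compact "code=url" pairs instead of raw tags, to save tokens.
        cell = " | ".join(f"{code}={href}" for code, href in alternates)
        writer.writerow([page["slug"], page["lang"], page["translationKey"],
                         page["canonical_url"], cell])
    return out.getvalue()
```

Collapsing the raw tags into `code=url` pairs is a design choice to keep the file small for the model. If you prefer the model to see the exact markup, write the raw tag block into the last column instead.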

First-run exercise

  1. Limit the first audit to one subdirectory or one tag — maybe 30-50 articles. The model is more accurate on small batches, and you learn the prompt’s failure modes cheaply.
  2. Compare AI’s findings to a known-good page and a known-bad page (deliberately break one before running). If it misses the planted bug, your prompt or CSV is too thin.
  3. Save the AI’s TSV alongside your fix commit. When you re-audit next month, diffing the two TSVs shows you exactly which class of bug is creeping back.
  4. For the second run, scale to the full site only after the prompt finds 100% of your planted bugs.
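
The planted-bug comparison can be scored mechanically. The sketch below assumes the model returns the TSV columns requested in the prompt and that you asked it to use fixed issue labels; `parse_findings` and `planted_bug_recall` are hypothetical helper names, not part of any library.

```python
import csv
import io


def parse_findings(tsv_text):
    """Return the set of (issue, slug, lang) triples from the model's TSV output."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {
        (row["issue"].strip().lower(), row["slug"].strip(), row["lang"].strip())
        for row in reader
    }


def planted_bug_recall(tsv_text, planted):
    """planted: set of (issue, slug, lang) triples you broke on purpose.

    Returns (recall, missed), where missed holds the planted bugs the model
    did not flag. Issue labels are compared case-insensitively.
    """
    found = parse_findings(tsv_text)
    wanted = {(issue.lower(), slug, lang) for issue, slug, lang in planted}
    missed = wanted - found
    recall = 1.0 if not wanted else (len(wanted) - len(missed)) / len(wanted)
    return recall, missed
```

Scale to the full site only once `recall` is 1.0 on your planted set. If the model paraphrases issue names, the exact-match comparison will undercount, so pin the labels in the prompt.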

Quality check

  • After re-rendering, open three pages and inspect the head. Each page should have a self-referencing hreflang, all sibling locale alternates, and one x-default.
  • Validate that the URLs in hreflang return 200, not 301 or 404. Models do not check this; a separate broken-link script does.
  • Spot-check the canonical URL on each page. Hreflang pointing to a non-canonical URL silently fails.
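
The head inspection above can also be scripted, so the spot-check does not depend on reading HTML by eye. This is a minimal sketch using Python's standard `html.parser`; `check_head` and its parameters are hypothetical names, it scans the whole document rather than only the head, and it compares locale codes and URLs as exact strings, which is an assumption you may need to relax (for example, for trailing slashes). It does not check HTTP status codes.

```python
from html.parser import HTMLParser


class HeadLinkParser(HTMLParser):
    """Records the canonical URL and hreflang alternates found in <link> tags."""

    def __init__(self):
        super().__init__()
        self.alternates = {}  # hreflang code -> href
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rel = (a.get("rel") or "").lower().split()
        href = a.get("href")
        if "canonical" in rel and href:
            self.canonical = href
        elif "alternate" in rel and a.get("hreflang") and href:
            self.alternates[a["hreflang"]] = href


def check_head(html, page_url, page_lang, expected_locales):
    """Return a list of problems for one rendered page (empty list means clean)."""
    parser = HeadLinkParser()
    parser.feed(html)
    alts = parser.alternates
    problems = []
    if alts.get(page_lang) != page_url:
        problems.append("missing or wrong self-reference")
    for code in expected_locales:
        if code != page_lang and code not in alts:
            problems.append(f"missing alternate: {code}")
    if "x-default" not in alts:
        problems.append("missing x-default")
    if parser.canonical != page_url:
        problems.append("canonical does not match page URL")
    return problems
```

For an EN-only page, pass `expected_locales=["en"]` so the check still requires the self-reference and x-default without demanding a translation that does not exist.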

How to reuse this workflow

  • Save the audit prompt as part of your content pipeline. Run it in CI on every PR that touches more than 10 articles.
  • Maintain an “allowed locale codes” list as data, not in the prompt. Easier to update when you add a third language.
  • Keep a regression log of past errors. When the same translationKey breaks twice, automate the constraint in your build (assert each translationKey appears in exactly N locales).
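
The build-time constraint mentioned above might be sketched like this. `ALLOWED_LOCALES`, `check_translation_keys`, and `exempt_keys` are hypothetical names; the allowed set is hard-coded here only to keep the example self-contained, and in practice you would load it from a data file. The `exempt_keys` parameter is an assumption added so that intentionally single-language pages do not fail the build.

```python
from collections import defaultdict

# Assumption: in a real pipeline this set is loaded from a data file.
ALLOWED_LOCALES = frozenset({"en", "zh-CN"})


def check_translation_keys(rows, allowed=ALLOWED_LOCALES, exempt_keys=frozenset()):
    """rows: iterable of dicts with 'translationKey', 'lang', and 'slug'.

    Returns a list of error strings; an empty list means the constraint holds.
    """
    by_key = defaultdict(list)
    errors = []
    for row in rows:
        if row["lang"] not in allowed:
            errors.append(f"{row['slug']}: locale {row['lang']!r} is not allowed")
        by_key[row["translationKey"]].append(row["lang"])
    for key, langs in sorted(by_key.items()):
        if key in exempt_keys:
            continue
        missing = set(allowed) - set(langs)
        if missing:
            errors.append(f"{key}: missing locales {sorted(missing)}")
        if len(langs) != len(set(langs)):
            errors.append(f"{key}: duplicate locale entries")
    return errors
```

In CI, print the returned errors and exit non-zero when the list is not empty. Because this check is deterministic, it complements the model audit rather than replacing it.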

CSV export → AI audit with explicit checklist → fix in source → re-render → spot-check HTML on 5 pages → resubmit sitemap → monitor Search Console for 14 days.

Common mistakes

  • Mixing zh with zh-CN or zh-Hans. Google treats them as different signals. Pick one and stick with it sitewide.
  • Forgetting x-default — without it, Google may pick a worst-case fallback for users in unexpected locales.
  • Hreflang in <head> but pointing to a non-canonical URL — the cluster is silently ignored.
  • Asymmetric pairs — EN article links to ZH but ZH does not link back. Hreflang clusters must be reciprocal or they are dropped.
  • Auditing only the live site and forgetting the sitemap. Both must agree; conflicting signals from the two sources can cause Google to ignore both.
  • Trusting AI’s “fix” without spot-checking the HTML — models occasionally suggest correct-looking but wrong locale codes.
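
Asymmetric pairs in particular are easy to detect without a model. The sketch below assumes you have already collected, for each page URL, the hreflang links found on that page; `find_one_way_links` and the `pages` mapping are illustrative names, and URLs are compared as exact strings.

```python
def find_one_way_links(pages):
    """pages: dict mapping page URL -> {hreflang code: href} found on that page.

    Returns a sorted list of (source_url, target_url) pairs where the source
    links to the target but the target does not link back to the source.
    """
    one_way = []
    for source, alternates in pages.items():
        for code, target in alternates.items():
            # x-default and self-references do not need a return link.
            if code == "x-default" or target == source:
                continue
            # A target missing from the export counts as not linking back.
            back_links = pages.get(target, {})
            if source not in back_links.values():
                one_way.append((source, target))
    return sorted(one_way)
```

An empty result means every alternate link in the export is reciprocated. A target that is absent from the export is reported too, since it may be a redirect or a page that was never rendered.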

FAQ

  • Will Google penalize wrong hreflang? Not directly, but it misroutes traffic to the wrong locale, which kills CTR and conversions in that market.
  • Do I need hreflang on every page or just articles? Every indexable page that has a translation. Homepage, category pages, and individual articles all count.
  • What about pages that are only in EN? Self-reference the EN page and emit x-default pointing to it. No need to invent a ZH alternate.
  • How often does the model hallucinate a hreflang issue? Maybe 5% of flags are false positives on a clean prompt. Verify every flag before applying its suggested fix.
  • Can I do this without translationKey? Yes, group rows by your slug-pair convention. The model handles “en/foo” and “zh/foo” pairing if you tell it the rule.
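
The slug-pair rule from the last answer can be made explicit in a few lines. This sketch assumes the locale is the first path segment and that translated pages keep the same slug after it; `slug_pair_key` and `group_by_slug_pair` are hypothetical names. Sites that translate the slug itself still need a translationKey.

```python
from collections import defaultdict


def slug_pair_key(slug, locales=("en", "zh")):
    """Strip a leading locale segment: 'en/foo' and 'zh/foo' both become 'foo'."""
    trimmed = slug.strip("/")
    head, _, rest = trimmed.partition("/")
    if head in locales and rest:
        return rest
    return trimmed


def group_by_slug_pair(slugs, locales=("en", "zh")):
    """Group slugs that share the same path after the locale prefix."""
    groups = defaultdict(list)
    for slug in slugs:
        groups[slug_pair_key(slug, locales)].append(slug)
    return dict(groups)
```

Groups with a single member are candidates for the "missing pair" check, and you can add the derived key to the CSV as a stand-in translationKey column.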
