Canonical URL Mismatch on Bilingual Site

Bilingual / multi-locale pages have canonicals pointing the wrong direction.

You have an EN and a ZH version of every article. Search Console flags /en/your-article/ as “Duplicate without user-selected canonical” or, worse, “Alternate page with proper canonical tag”, and the ZH version is the one indexed instead. The cause is almost always that the canonical link in your EN template ends up pointing to the ZH URL (or vice versa), or that both versions point to a single language as the canonical.

The rule for multilingual sites is simple: each language version is self-canonical, and hreflang declares the versions as alternates of each other. Most template bugs in this area come from confusing these two mechanisms.

Common causes

Ordered by hit rate, highest first.

1. Template hardcoded one canonical URL

<!-- Bad: the same URL is emitted on every locale -->
<link rel="canonical" href="https://site.com/en/article" />

The same canonical is emitted on both the EN and ZH versions. Google sees ZH pointing to EN and de-duplicates by indexing only EN.

How to spot it:

curl -s https://site.com/zh/article/ | grep -i canonical
curl -s https://site.com/en/article/ | grep -i canonical

If both return the same URL, it’s hardcoded wrong.
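The same comparison can be scripted once you have the two HTML bodies (from curl, fetch, or the build output). The sketch below is only a spot check: extractCanonical and sharesCanonical are hypothetical helper names, and the regex assumes rel appears before href, as in the tags shown in this article.

```javascript
// Pull the canonical href out of raw HTML with a regex.
// Assumption: rel="canonical" comes before href inside the <link> tag.
function extractCanonical(html) {
  const m = html.match(/<link[^>]*rel=["']canonical["'][^>]*href=["']([^"']+)["']/i);
  return m ? m[1] : null;
}

// True when both locale versions declare the same canonical URL (the bug).
function sharesCanonical(enHtml, zhHtml) {
  const en = extractCanonical(enHtml);
  const zh = extractCanonical(zhHtml);
  return en !== null && en === zh;
}
```

For anything beyond a spot check, a real HTML parser is the safer choice, since attribute order and quoting can vary.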

2. translationKey used as canonical by mistake

Some Astro/Next templates use a shared translationKey to link the language versions and accidentally also derive the canonical from it. Result: both versions declare the same canonical URL, built from the key.

How to spot it: Look at your layout’s canonical derivation. If it uses translationKey instead of the actual page’s full URL (with locale prefix), this is it.
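To make the difference concrete, here is a minimal sketch of the two derivations side by side. Both function names and their parameters are hypothetical, and the buggy variant is only one plausible shape of the mistake, not a quote from any particular template.

```javascript
// Correct: resolve the page's own path (locale prefix included) against the site.
function canonicalFor(pathname, site) {
  return new URL(pathname, site).toString();
}

// Buggy (illustrative): the shared key carries no locale, so every
// language version ends up with the same canonical URL.
function canonicalFromKey(translationKey, site) {
  return new URL(`/en/${translationKey}/`, site).toString();
}
```

With the first helper, /zh/article/ and /en/article/ each resolve to their own URL; with the second, both pages get the EN URL regardless of which one is rendering.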

3. Default canonical falls back to default-language URL

<!-- Bad: /en/ is hardcoded -->
<link rel="canonical" href={`https://site.com/en/${slug}/`} />

The /en/ is hardcoded, so ZH pages get a canonical to the EN URL.

How to spot it: Search your template for hardcoded /en/ or /zh/ in canonical generation.

4. Canonical includes the locale param but with wrong locale

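Stale-locale bug: the ZH page reads locale = 'en' from a global variable (for example, one set while rendering a different page and never updated), so the canonical is built with the wrong locale.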

How to spot it: Add logging to your canonical helper. Compare locale value at render time to the page’s actual lang.
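One possible shape for that check, assuming the locale is the first path segment as in the URLs above. SUPPORTED_LOCALES, localeFromPath, and checkLocale are hypothetical names for this sketch.

```javascript
const SUPPORTED_LOCALES = ['en', 'zh']; // assumption: the two locales in this article

// Read the locale from the first path segment instead of a shared global.
function localeFromPath(pathname, fallback = 'en') {
  const first = pathname.split('/').filter(Boolean)[0];
  return SUPPORTED_LOCALES.includes(first) ? first : fallback;
}

// Returns a warning string when the helper's locale disagrees with the path, else null.
function checkLocale(pathname, localeSeenByHelper) {
  const actual = localeFromPath(pathname);
  return actual === localeSeenByHelper
    ? null
    : `locale mismatch on ${pathname}: helper saw "${localeSeenByHelper}", path says "${actual}"`;
}
```

Logging the result of checkLocale from inside the canonical helper during a build should surface every page that renders with a stale locale.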

5. Conflicting canonical + hreflang

The canonical says /en/article/ and hreflang="zh" says /zh/article/, but the ZH page itself isn’t self-canonical. Google sees the conflict and picks one (often wrong).

How to spot it: ZH page’s <head> should have:

<link rel="canonical" href="https://site.com/zh/article/" />
<link rel="alternate" hreflang="en" href="https://site.com/en/article/" />
<link rel="alternate" hreflang="zh" href="https://site.com/zh/article/" />
<link rel="alternate" hreflang="x-default" href="https://site.com/en/article/" />

The EN page mirrors this, with the EN URL as its canonical.
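The two conditions above (self-canonical, and a self-referencing hreflang entry) can be checked mechanically. This is a sketch under the assumption that you have already parsed the head into a plain object; the { canonical, alternates } shape and the function name findHeadConflicts are hypothetical.

```javascript
// `head` is a hypothetical plain object:
//   { canonical: string, alternates: [{ hreflang: string, href: string }] }
function findHeadConflicts(pageUrl, head) {
  const problems = [];
  // The canonical must be the page's own URL.
  if (head.canonical !== pageUrl) {
    problems.push(`canonical ${head.canonical} is not the page's own URL`);
  }
  // The page must list itself among its hreflang alternates.
  if (!head.alternates.some((a) => a.href === pageUrl)) {
    problems.push('no self-referencing hreflang entry');
  }
  return problems;
}
```

An empty array means the page passes both checks; anything else names the conflict to fix in the template.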

6. Astro site config has wrong base URL

If site in astro.config.mjs is wrong, all canonicals derived from it are wrong.

How to spot it:

// astro.config.mjs
export default defineConfig({
  site: 'https://site.com', // must match production
});

curl -s https://site.com/ | grep -i canonical

The scheme and host of the canonical should match site exactly.
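"Match exactly" here means the same scheme, host, and port, which is what the URL origin captures. A small sketch of that comparison follows; canonicalMatchesSite is a hypothetical helper name.

```javascript
// Compare the origin (scheme + host + port) of a canonical URL with the configured site.
function canonicalMatchesSite(canonical, site) {
  return new URL(canonical).origin === new URL(site).origin;
}
```

This flags the usual suspects: http instead of https, a stray www, or a staging hostname left in the config.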

Shortest path to fix

Step 1: Confirm each version is self-canonical

For each pair of EN/ZH articles:

for lang in en zh; do
  echo "=== /$lang/article/ ==="
  curl -s "https://site.com/$lang/article/" | grep -i canonical
done

Expected:

=== /en/article/ ===
<link rel="canonical" href="https://site.com/en/article/" />

=== /zh/article/ ===
<link rel="canonical" href="https://site.com/zh/article/" />

Both must be self-canonical. If one points to the other, you have the bug.

Step 2: Fix the template

In your layout (Astro example):

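---
// Canonical: this page's own absolute URL, built from the request path and `site`.
const url = new URL(Astro.url.pathname, Astro.site).toString();
// Assumes a [lang] route param; falls back to the default locale.
const lang = Astro.params.lang || 'en';
// Assumes the translated page's slug equals translationKey (hreflang only, never canonical).
const translationKey = Astro.props.frontmatter?.translationKey;
const altLang = lang === 'en' ? 'zh' : 'en';
const altUrl = (l) => new URL(`${l}/${translationKey}/`, Astro.site).toString();
---
<link rel="canonical" href={url} />
<link rel="alternate" hreflang={lang} href={url} />
{translationKey && <link rel="alternate" hreflang={altLang} href={altUrl(altLang)} />}
{translationKey && <link rel="alternate" hreflang="x-default" href={altUrl('en')} />}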

Canonical derives from the actual page URL, not from any hardcoded prefix.

Step 3: Validate at build time

Add a postbuild script that scans the built output in dist/ (a prebuild hook would run before dist/ exists):

import fs from 'node:fs';
import path from 'node:path';
import { parse } from 'node-html-parser';

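const dist = 'dist';
const site = 'https://site.com/'; // must match `site` in astro.config.mjs, with trailing slash
let errors = 0;

// Recursively check every built HTML file under dist/.
function walk(dir) {
  for (const f of fs.readdirSync(dir)) {
    const p = path.join(dir, f);
    if (fs.statSync(p).isDirectory()) walk(p);
    else if (p.endsWith('.html')) {
      const html = fs.readFileSync(p, 'utf8');
      const root = parse(html);
      const canonical = root.querySelector('link[rel="canonical"]')?.getAttribute('href');
      // dist/en/article/index.html -> https://site.com/en/article/ (path.sep handles Windows)
      const rel = path.relative(dist, p).split(path.sep).join('/');
      const expected = site + rel.replace(/index\.html$/, '');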
      if (canonical !== expected) {
        console.error(`MISMATCH: ${p}\n  canonical: ${canonical}\n  expected:  ${expected}`);
        errors++;
      }
    }
  }
}

walk(dist);
process.exit(errors ? 1 : 0);

Step 4: Verify in Search Console

After deploy, open Search Console → URL Inspection, inspect a few of the ZH URLs that were previously misconfigured, and request indexing for each. Allow 1-2 weeks for recrawling. The page status should change from “Duplicate without user-selected canonical” to “Indexed.”

Step 5: Resubmit sitemap

If your sitemap was built with the broken canonicals, regenerate and submit. Sitemap should have one entry per URL (both EN and ZH versions appear as separate entries).
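If you generate the sitemap yourself, the "one entry per URL" rule can be sketched as below. The function names, the slug list, and the /locale/slug/ URL pattern are assumptions based on the URLs used in this article; if you use a sitemap plugin, check its output instead.

```javascript
// Build one absolute URL per (slug, locale) pair.
function sitemapEntries(slugs, site, locales = ['en', 'zh']) {
  const urls = [];
  for (const slug of slugs) {
    for (const locale of locales) {
      urls.push(new URL(`/${locale}/${slug}/`, site).toString());
    }
  }
  return urls;
}

// Wrap the URLs in a minimal sitemap document.
// Assumption: URLs contain no characters that need XML escaping (such as &).
function toSitemapXml(urls) {
  const body = urls.map((u) => `  <url><loc>${u}</loc></url>`).join('\n');
  return `<?xml version="1.0" encoding="UTF-8"?>\n<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n${body}\n</urlset>\n`;
}
```

Each article then contributes two entries, one per language, and each entry is the same URL that page declares as its canonical.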

Step 6: Watch hreflang errors

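Search Console used to report these under Legacy tools → International Targeting, but that report has since been retired, so check return tags yourself: every URL a page lists in its hreflang set must list that page back. Spot-check a few EN/ZH pairs with curl or a site crawler right after the fix, and again once Google has had 1-2 weeks to recrawl.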
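A "no return tags" condition means page A lists page B as an alternate but B does not list A back. A minimal sketch of that check follows, assuming you have already collected each page's hreflang hrefs into a map; the map shape and the name findMissingReturnTags are hypothetical.

```javascript
// `pages` is a hypothetical map: page URL -> array of hreflang hrefs found on that page.
function findMissingReturnTags(pages) {
  const missing = [];
  for (const [url, alternates] of Object.entries(pages)) {
    for (const target of alternates) {
      if (target === url) continue; // a self-reference needs no return tag
      const back = pages[target];
      // Missing when the target was not crawled or does not link back.
      if (!back || !back.includes(url)) {
        missing.push({ from: url, to: target });
      }
    }
  }
  return missing;
}
```

An empty result means every alternate links back; each remaining { from, to } pair points at a page whose hreflang set needs fixing.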

Prevention

  • Template rule: canonical = the page’s own full URL, always. No exceptions.
  • Keep canonical and hreflang as two independent mechanisms — canonical for “this version of this URL is the one”; hreflang for “this URL has translations elsewhere.”
  • Add a CI check that every rendered HTML file’s canonical equals the URL it is served at.
  • Don’t pre-render canonical based on translationKey or any shared identifier — derive it from the URL.
  • For new locales, test the canonical/hreflang output on a sample article before rolling out.

Tags: #Troubleshooting #SEO #Debug #Canonical #Bilingual