You have an EN and ZH version of every article. Search Console flags /en/your-article/ as “Duplicate without user-selected canonical” or worse, “Alternate page with proper canonical tag” — and the ZH version is the one indexed instead. The cause is almost always that the canonical link in your EN template ends up pointing to the ZH URL (or vice versa), or both versions point to a single language as the canonical. The rule for multilingual sites is simple: each language version is self-canonical, and hreflang declares them as alternates. Most template bugs in this area come from confusing these two mechanisms.
Common causes
Ordered by hit rate, highest first.
1. Template hardcoded one canonical URL
// Bad
<link rel="canonical" href="https://site.com/en/article" />
The same canonical is emitted on both the EN and ZH versions. Google sees ZH pointing to EN and de-duplicates by indexing only EN.
How to spot it:
curl -s https://site.com/zh/article/ | grep -i canonical
curl -s https://site.com/en/article/ | grep -i canonical
If both return the same URL, it’s hardcoded wrong.
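Once you have more than a handful of pages, the same comparison can be scripted. The sketch below assumes you have already collected each page's canonical (for example from the curl output above) into a plain object; `findSharedCanonicals` and the input shape are hypothetical, not part of any tool.

```javascript
// Hypothetical helper: group pages by the canonical they declare.
// `pages` maps each page URL to the canonical href found in its <head>.
function findSharedCanonicals(pages) {
  const byCanonical = new Map();
  for (const [pageUrl, canonical] of Object.entries(pages)) {
    const claimants = byCanonical.get(canonical) ?? [];
    claimants.push(pageUrl);
    byCanonical.set(canonical, claimants);
  }
  // A canonical claimed by more than one page means some page is not self-canonical.
  return [...byCanonical].filter(([, claimants]) => claimants.length > 1);
}

const shared = findSharedCanonicals({
  'https://site.com/en/article/': 'https://site.com/en/article/',
  'https://site.com/zh/article/': 'https://site.com/en/article/', // the bug
});
console.log(shared);
```

An empty result means no two pages share a canonical; any entry lists the canonical followed by the pages that claim it.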
2. translationKey used as canonical by mistake
Some Astro/Next templates use a shared translationKey to link versions and accidentally also use it to derive the canonical. Result: both versions canonical to one URL based on the key.
How to spot it: Look at your layout’s canonical derivation. If it uses translationKey instead of the actual page’s full URL (with locale prefix), this is it.
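To make the difference concrete, here is a minimal sketch of the two derivations side by side. The page objects, the slug, and both function names are made up for illustration; your layout's real helper will look different.

```javascript
// Illustrative data: two translations sharing one translationKey.
const site = 'https://site.com';
const pages = [
  { pathname: '/en/article/', translationKey: 'article' },
  { pathname: '/zh/article/', translationKey: 'article' },
];

// Buggy pattern: the key carries no locale, so a default one gets baked in.
const canonicalFromKey = (page) => new URL(`/en/${page.translationKey}/`, site).href;

// Intended pattern: resolve the page's own path against the site origin.
const canonicalFromUrl = (page) => new URL(page.pathname, site).href;

const keyBased = pages.map(canonicalFromKey);
const urlBased = pages.map(canonicalFromUrl);
console.log(new Set(keyBased).size); // 1: both locales collapse to one canonical
console.log(new Set(urlBased).size); // 2: each page is self-canonical
```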
3. Default canonical falls back to default-language URL
// Bad
<link rel="canonical" href={`https://site.com/en/${slug}/`} />
The /en/ is hardcoded, so ZH pages get a canonical to the EN URL.
How to spot it: Search your template for hardcoded /en/ or /zh/ in canonical generation.
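If the templates are spread over many files, a rough scan can do the searching. This is a line-based heuristic sketch (the function name and the locale list are assumptions), so it will miss a canonical tag that is split across several lines.

```javascript
// Hypothetical scanner: flag lines that emit a canonical and contain a literal locale prefix.
function findHardcodedLocales(source, locales = ['en', 'zh']) {
  const prefix = new RegExp(`/(${locales.join('|')})/`);
  return source
    .split('\n')
    .map((text, i) => ({ line: i + 1, text: text.trim() }))
    .filter(({ text }) => /rel=["']canonical["']/.test(text) && prefix.test(text));
}

const template = [
  '<link rel="canonical" href={`https://site.com/en/${slug}/`} />', // hardcoded /en/
  '<link rel="canonical" href={url} />', // derived from the page URL
].join('\n');

const hits = findHardcodedLocales(template);
console.log(hits);
```

Only the first line is reported here, because the second builds the href from a variable rather than a literal prefix.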
4. Canonical includes the locale param but with wrong locale
Stale-state bug: the ZH page reads locale = 'en' from a global variable, typically one that was set while a different page was rendering. The canonical is then built for the wrong locale.
How to spot it: Add logging to your canonical helper. Compare locale value at render time to the page’s actual lang.
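One way to add that logging is to have the helper read the locale from the path and complain when the value it was handed disagrees. A minimal sketch, assuming locale-prefixed paths like /zh/article/; `LOCALES`, `localeFromPath`, and `canonicalFor` are hypothetical names.

```javascript
// Hypothetical guard for a canonical helper.
const LOCALES = ['en', 'zh'];

// Read the locale from the first path segment, or null if there is no known prefix.
function localeFromPath(pathname) {
  const first = pathname.split('/').filter(Boolean)[0];
  return LOCALES.includes(first) ? first : null;
}

function canonicalFor(pathname, locale, site = 'https://site.com') {
  const actual = localeFromPath(pathname);
  if (actual !== null && actual !== locale) {
    console.warn(`locale mismatch on ${pathname}: helper got "${locale}", path says "${actual}"`);
  }
  // The canonical depends only on the path, so a stale locale cannot corrupt it.
  return new URL(pathname, site).href;
}

const result = canonicalFor('/zh/article/', 'en'); // logs a mismatch warning
console.log(result);
```

Building the canonical from the path alone means the warning is purely diagnostic: even when the global is stale, the emitted URL stays correct.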
5. Conflicting canonical + hreflang
The canonical says /en/article/ and hreflang="zh" says /zh/article/, but the ZH page itself isn’t self-canonical. Google sees the conflict and picks one (often wrong).
How to spot it: ZH page’s <head> should have:
<link rel="canonical" href="https://site.com/zh/article/" />
<link rel="alternate" hreflang="en" href="https://site.com/en/article/" />
<link rel="alternate" hreflang="zh" href="https://site.com/zh/article/" />
<link rel="alternate" hreflang="x-default" href="https://site.com/en/article/" />
EN page mirrors this with EN as canonical.
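The same expectation can be checked mechanically. The sketch below pulls link tags out of an HTML string with regular expressions, which is enough for an illustration but less robust than a real HTML parser; `readLinks` and `checkHead` are hypothetical helpers.

```javascript
// Minimal regex-based extraction of <link> attributes (double-quoted values only).
function readLinks(html) {
  const links = [];
  for (const [tag] of html.matchAll(/<link\b[^>]*>/gi)) {
    const attrs = {};
    for (const [, name, value] of tag.matchAll(/([\w-]+)="([^"]*)"/g)) {
      attrs[name.toLowerCase()] = value;
    }
    links.push(attrs);
  }
  return links;
}

// Report a canonical that is not the page itself, and a missing self-referencing hreflang.
function checkHead(pageUrl, html) {
  const links = readLinks(html);
  const canonical = links.find((l) => l.rel === 'canonical')?.href;
  const alternates = links.filter((l) => l.rel === 'alternate' && l.hreflang);
  const problems = [];
  if (canonical !== pageUrl) problems.push(`canonical is ${canonical}, expected ${pageUrl}`);
  if (!alternates.some((l) => l.href === pageUrl)) problems.push('no hreflang entry references this page');
  return problems;
}

const brokenZhHead = `
<link rel="canonical" href="https://site.com/en/article/" />
<link rel="alternate" hreflang="en" href="https://site.com/en/article/" />
<link rel="alternate" hreflang="zh" href="https://site.com/zh/article/" />
`;
const problems = checkHead('https://site.com/zh/article/', brokenZhHead);
console.log(problems);
```

Here the hreflang set is fine but the canonical points at the EN URL, so exactly one problem is reported.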
6. Astro site config has wrong base URL
If site in astro.config.mjs is wrong, all canonicals derived from it are wrong.
How to spot it:
// astro.config.mjs
export default defineConfig({
site: 'https://site.com', // must match production
});
curl -s https://site.com/ | grep canonical
The origin of the canonical it prints should match site exactly.
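"Match exactly" here means scheme, host, and port. A small sketch of that comparison using the standard URL class; the function name and the sample values are illustrative.

```javascript
// Compare only the origin (scheme + host + port) of a canonical against the configured site.
function originMatchesSite(canonical, site) {
  return new URL(canonical).origin === new URL(site).origin;
}

console.log(originMatchesSite('https://site.com/en/article/', 'https://site.com')); // true
console.log(originMatchesSite('http://localhost:4321/en/article/', 'https://site.com')); // false
console.log(originMatchesSite('https://www.site.com/en/article/', 'https://site.com')); // false
```

The last case is the easy one to miss: a www/non-www difference is a different origin, so every canonical derived from it points at a different host than the one you serve.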
Shortest path to fix
Step 1: Confirm each version is self-canonical
For each pair of EN/ZH articles:
for lang in en zh; do
echo "=== /$lang/article/ ==="
curl -s "https://site.com/$lang/article/" | grep -i canonical
done
Expected:
=== /en/article/ ===
<link rel="canonical" href="https://site.com/en/article/" />
=== /zh/article/ ===
<link rel="canonical" href="https://site.com/zh/article/" />
Both must be self-canonical. If one points to the other, you have the bug.
Step 2: Fix the template
In your layout (Astro example):
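---
// Canonical: this page's own path resolved against the configured site origin.
const url = new URL(Astro.url.pathname, Astro.site).toString();
// Assumes a [lang] route param; falls back to the default locale.
const lang = Astro.params.lang || 'en';
// Assumes translationKey doubles as the slug shared by both language versions.
const translationKey = Astro.props.frontmatter?.translationKey;
const altLang = lang === 'en' ? 'zh' : 'en';
// Astro.site stringifies with a trailing slash, so the paths below omit the leading one.
---
<link rel="canonical" href={url} />
<link rel="alternate" hreflang={lang} href={url} />
<link rel="alternate" hreflang={altLang} href={`${Astro.site}${altLang}/${translationKey}/`} />
<link rel="alternate" hreflang="x-default" href={`${Astro.site}en/${translationKey}/`} />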
Canonical derives from the actual page URL, not from any hardcoded prefix.
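The same logic can be pulled into a plain function so it is testable outside a template. This sketch derives the alternates by swapping the locale prefix of the current path, which assumes every translation lives at the same slug with a trailing slash; `headLinks` and its options are hypothetical.

```javascript
// Hypothetical helper: build canonical and hreflang data from the page's own path.
function headLinks(pathname, { site = 'https://site.com', locales = ['en', 'zh'], fallback = 'en' } = {}) {
  const [current, ...rest] = pathname.split('/').filter(Boolean);
  if (!locales.includes(current)) throw new Error(`no locale prefix in ${pathname}`);

  // Same path with a different locale prefix, resolved against the site origin.
  const urlFor = (locale) => new URL(`/${[locale, ...rest].join('/')}/`, site).href;

  return {
    canonical: urlFor(current), // always the page itself
    alternates: [
      ...locales.map((locale) => ({ hreflang: locale, href: urlFor(locale) })),
      { hreflang: 'x-default', href: urlFor(fallback) },
    ],
  };
}

const zh = headLinks('/zh/article/');
console.log(zh.canonical); // https://site.com/zh/article/
console.log(zh.alternates);
```

Adding a third locale then only means extending the `locales` list; the canonical line never changes.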
Step 3: Validate at build time
Add a postbuild script (it reads the dist output, so it has to run after the build, not before):
import fs from 'node:fs';
import path from 'node:path';
import { parse } from 'node-html-parser';

const dist = 'dist';
const site = 'https://site.com'; // must match `site` in astro.config.mjs
let errors = 0;

function walk(dir) {
  for (const f of fs.readdirSync(dir)) {
    const p = path.join(dir, f);
    if (fs.statSync(p).isDirectory()) walk(p);
    else if (p.endsWith('.html')) {
      const html = fs.readFileSync(p, 'utf8');
      const root = parse(html);
      const canonical = root.querySelector('link[rel="canonical"]')?.getAttribute('href');
      // Map dist/en/article/index.html to /en/article/ (separator-safe on Windows).
      const rel = path.relative(dist, p).split(path.sep).join('/');
      const expected = `${site}/${rel.replace(/index\.html$/, '')}`;
      if (canonical !== expected) {
        console.error(`MISMATCH: ${p}\n canonical: ${canonical}\n expected: ${expected}`);
        errors++;
      }
    }
  }
}

walk(dist);
process.exit(errors ? 1 : 0);
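The script only strips index.html, which fits a build that emits a directory per page. If some pages are emitted as bare files such as en/about.html, the expected URL needs a second rule. A hedged sketch of that mapping follows; `expectedUrl` is a hypothetical helper, and whether bare files are served without the .html extension depends on your host.

```javascript
// Hypothetical mapping from a path relative to dist/ to the URL it is served at.
function expectedUrl(relPath, site = 'https://site.com') {
  const posix = relPath.split('\\').join('/'); // normalise Windows separators
  const pathname = /(^|\/)index\.html$/.test(posix)
    ? posix.replace(/index\.html$/, '') // en/article/index.html -> en/article/
    : posix.replace(/\.html$/, ''); // en/about.html -> en/about (assumed extensionless serving)
  return new URL(`/${pathname}`, site).href;
}

console.log(expectedUrl('en/article/index.html')); // https://site.com/en/article/
console.log(expectedUrl('index.html')); // https://site.com/
console.log(expectedUrl('en/about.html')); // https://site.com/en/about
```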
Step 4: Verify in Search Console
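After deploy, open Search Console → URL Inspection, inspect a few ZH URLs that were previously misconfigured, and request indexing for each. Allow 1-2 weeks for recrawling. The page status should change from “Duplicate without user-selected canonical” to “Indexed,” and the Google-selected canonical should match the URL you inspected.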
Step 5: Resubmit sitemap
If your sitemap was built with the broken canonicals, regenerate and submit. Sitemap should have one entry per URL (both EN and ZH versions appear as separate entries).
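If you generate the sitemap yourself rather than through an integration, "one entry per URL" looks roughly like the sketch below. It assumes simple slugs that need no XML escaping and the same slug in every locale; `sitemapXml` and its options are hypothetical.

```javascript
// Hypothetical generator: one <url> entry per language version of each slug.
function sitemapXml(slugs, { site = 'https://site.com', locales = ['en', 'zh'] } = {}) {
  const locs = slugs.flatMap((slug) =>
    locales.map((locale) => new URL(`/${locale}/${slug}/`, site).href),
  );
  const entries = locs.map((loc) => `  <url><loc>${loc}</loc></url>`).join('\n');
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries,
    '</urlset>',
  ].join('\n');
}

const xml = sitemapXml(['article']);
console.log(xml);
```

Each `<loc>` should be identical to the canonical that page declares; a sitemap that lists a URL whose canonical points elsewhere sends Google the same mixed signal you just fixed.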
Step 6: Watch hreflang errors
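Google retired the International Targeting report (Search Console → Legacy tools → International Targeting → Hreflang) in 2022, so “no return tags” warnings no longer surface there. Check reciprocity yourself instead, with a crawler or a script: every URL a page lists as an hreflang alternate must list that page back. After the fix, every EN/ZH pair should pass.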
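A “no return tags” warning means page A lists page B as an hreflang alternate but B does not list A back. A minimal sketch of checking that reciprocity offline, assuming you have already collected each page's alternates into a plain object (the `pages` shape and the function name are hypothetical):

```javascript
// `pages` maps each page URL to the hreflang alternates it declares ({ lang: href }).
function missingReturnTags(pages) {
  const missing = [];
  for (const [from, alternates] of Object.entries(pages)) {
    for (const to of new Set(Object.values(alternates))) {
      if (to === from) continue; // self-reference needs no return tag
      const back = pages[to];
      // Missing if the target is unknown or does not link back to `from`.
      if (!back || !Object.values(back).includes(from)) missing.push({ from, to });
    }
  }
  return missing;
}

const report = missingReturnTags({
  'https://site.com/en/article/': {
    en: 'https://site.com/en/article/',
    zh: 'https://site.com/zh/article/',
  },
  'https://site.com/zh/article/': {
    zh: 'https://site.com/zh/article/', // forgot the EN alternate
  },
});
console.log(report);
```

In this sample the EN page links to ZH but ZH never links back, so one missing pair is reported.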
Prevention
- Template rule: canonical = the page’s own full URL, always. No exceptions.
- Keep canonical and hreflang as two independent mechanisms — canonical for “this version of this URL is the one”; hreflang for “this URL has translations elsewhere.”
- CI check that every rendered HTML’s canonical equals the file path it serves at.
- Don’t pre-render canonical based on translationKey or any shared identifier; derive it from the URL.
- For new locales, test the canonical/hreflang output on a sample article before rolling out.