Canonical URL Points to the Wrong Page: Translations Canonicalize Back to English

The ZH pages carry a canonical link pointing at the EN version, so Google drops the ZH variant from the index. The fix: compute each page’s canonical from its own URL, then verify with curl and view-source.

You check Search Console and your ZH pages show “Alternate page with proper canonical tag” — meaning Google decided not to index them because their canonical points elsewhere. You view-source on a ZH article and find a <link rel="canonical" href="https://site.com/en/articles/foo/"> — the ZH page is canonicalizing itself to the EN version. To Google, ZH and EN are now the same URL; the ZH version disappears from the index. Half your bilingual investment goes invisible.

This usually comes from an SEO plugin set to “use the primary language version” or a layout that always emits the EN URL. The fix is conceptually trivial: each page’s canonical must be its own URL. The execution requires careful template work, a build-time check, and a one-time view-source verification across a sample of pages.

Common causes

1. SEO plugin configured to canonicalize all translations to primary

Some plugins (and some custom layouts) have a “consolidate authority on the primary language” option. Sounds reasonable; is actually wrong. Hreflang already handles the locale relationship; canonical should be self-referential.

How to spot it: view-source on a translated page; check the canonical link target.

curl -s https://site.com/zh/articles/foo/ | grep 'rel="canonical"'

If the href points at /en/, the plugin is mis-configured.
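Once you have the canonical href in hand, the same check can be scripted across many pages. A minimal sketch, assuming the locale is always the first path segment (as in /en/... and /zh/...); localeOf and canonicalLocaleMismatch are hypothetical helper names, not part of any plugin:

```javascript
// Hypothetical helper: assumes the locale is the first path segment.
function localeOf(url) {
  // "/zh/articles/foo/" -> ["zh", "articles", "foo"] -> "zh"
  return new URL(url).pathname.split("/").filter(Boolean)[0] ?? null;
}

// True when a page's canonical points at a different locale than the page itself.
function canonicalLocaleMismatch(pageUrl, canonicalHref) {
  return localeOf(pageUrl) !== localeOf(canonicalHref);
}

console.log(
  canonicalLocaleMismatch(
    "https://site.com/zh/articles/foo/",
    "https://site.com/en/articles/foo/"
  )
); // true: the ZH page canonicalizes to EN
```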

2. Hard-coded canonical in the layout

Someone wrote <link rel="canonical" href={`https://site.com/en/articles/${slug}/`} /> in the article layout. That worked for EN pages and silently broke ZH pages.

How to spot it: grep the layout for canonical.

grep -rn 'rel="canonical"' src/layouts/ src/components/

If the URL doesn’t include Astro.url or the current page’s locale, it’s hard-coded.

3. Canonical points at the URL without trailing slash (or vice versa)

The site serves /en/articles/foo/ but the canonical declares /en/articles/foo (no trailing slash). Google treats these as two distinct URLs, tends to index the one named in the canonical, and the URLs you actually serve are deprioritized.

How to spot it: compare canonical href to actual page URL. Trailing slash must match.
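The comparison is easy to automate. A rough sketch, using the standard URL class; trailingSlashMismatch is a hypothetical helper name, and it only flags the case where the two URLs are the same page apart from the trailing slash:

```javascript
// Hypothetical helper: reports a canonical that differs from the served URL
// only by its trailing slash.
function trailingSlashMismatch(pageUrl, canonicalHref) {
  const page = new URL(pageUrl);
  const canon = new URL(canonicalHref);
  const strip = (p) => p.replace(/\/+$/, "");
  const samePage =
    page.origin === canon.origin &&
    strip(page.pathname) === strip(canon.pathname);
  // Same page once slashes are ignored, but the raw paths differ.
  return samePage && page.pathname !== canon.pathname;
}

console.log(
  trailingSlashMismatch(
    "https://site.com/en/articles/foo/",
    "https://site.com/en/articles/foo"
  )
); // true
```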

4. Canonical includes query strings or fragments

Author shared a URL with ?utm_source=twitter and that pattern got cached or hard-coded into a template. Now canonical includes query strings that fragment indexing.

How to spot it: canonical href contains ? or #.
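If canonicals are built from a URL that may carry tracking parameters, one option is to strip them before emitting the tag. A small sketch under that assumption; hasQueryOrFragment and stripQueryAndFragment are hypothetical helper names:

```javascript
// Detect the problem: any "?" or "#" in the canonical href.
function hasQueryOrFragment(href) {
  return /[?#]/.test(href);
}

// Hypothetical cleanup: drop the query string and fragment, keep origin and path.
function stripQueryAndFragment(href) {
  const u = new URL(href);
  u.search = "";
  u.hash = "";
  return u.href;
}

const dirty = "https://site.com/en/articles/foo/?utm_source=twitter#top";
console.log(hasQueryOrFragment(dirty)); // true
console.log(stripQueryAndFragment(dirty)); // https://site.com/en/articles/foo/
```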

5. Cross-domain canonical pointing to a republished version

You syndicated an article to Medium/Substack. Someone set the Medium canonical to your site (correct) but also set YOUR site’s canonical to the Medium URL (wrong). Now your own page tells Google “I’m a copy of Medium.”

How to spot it: any canonical hostname that doesn’t match the page hostname.
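A hostname comparison covers this case. A minimal sketch; findCrossDomainCanonicals is a hypothetical helper, and the input shape (a list of page URL and canonical href pairs) is an assumption about how you collect the data:

```javascript
// pages: [{ url, canonical }] collected from a crawl or from the build output (assumed shape).
function findCrossDomainCanonicals(pages) {
  return pages.filter(
    ({ url, canonical }) =>
      new URL(url).hostname !== new URL(canonical).hostname
  );
}

const flagged = findCrossDomainCanonicals([
  { url: "https://site.com/en/articles/foo/", canonical: "https://site.com/en/articles/foo/" },
  { url: "https://site.com/en/articles/bar/", canonical: "https://medium.com/@someone/bar" },
]);
console.log(flagged.map((p) => p.url)); // only the /bar/ page is flagged
```

Note that this also flags www versus non-www differences, which is usually what you want for this audit.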

6. Canonical missing entirely

No canonical tag at all. Google chooses on its own — usually fine, but with URL variants (with/without trailing slash, with utm params) you lose deterministic indexing.

How to spot it: view-source for rel="canonical". Empty result — missing.
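For a scripted version of this check, a regex scan over the HTML is usually enough. A rough sketch, not a full HTML parser; findCanonical is a hypothetical helper that tolerates either attribute order and returns null when no canonical is present:

```javascript
// Hypothetical helper: returns the canonical href, or null if the page has none.
function findCanonical(html) {
  // Look at each <link ...> tag, then inspect its attributes in any order.
  for (const [tag] of html.matchAll(/<link\b[^>]*>/gi)) {
    if (!/\brel=["']canonical["']/i.test(tag)) continue;
    const href = tag.match(/\bhref=["']([^"']+)["']/i);
    return href ? href[1] : null;
  }
  return null;
}

console.log(findCanonical('<link href="https://site.com/zh/" rel="canonical">'));
// https://site.com/zh/
console.log(findCanonical('<link rel="stylesheet" href="/a.css">'));
// null
```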

Shortest path to fix

Step 1: Make canonical self-referential, per page

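In your article layout, build the canonical from the current page’s own language and slug:

---
// Each page passes in its own entry, so lang and urlSlug are per page.
const { article } = Astro.props;
const SITE = "https://site.com";
// Self-referential: the ZH page yields /zh/..., the EN page yields /en/...
const canonical = `${SITE}/${article.data.lang}/articles/${article.data.urlSlug}/`;
---
<link rel="canonical" href={canonical} />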

This guarantees ZH canonicalizes to ZH, EN to EN. Hreflang separately tells Google the pages are alternates.
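To make the split between the two tags concrete, here is a sketch of the data a layout could compute for both. It assumes both locales share the same slug, as in the /en/articles/foo/ and /zh/articles/foo/ examples; LOCALES and headLinks are hypothetical names:

```javascript
// Assumed values; replace with your own site and locale list.
const SITE = "https://site.com";
const LOCALES = ["en", "zh"];

// Hypothetical helper: canonical is the page's own URL,
// alternates list every locale version (including the page itself).
function headLinks(lang, slug) {
  const urlFor = (l) => `${SITE}/${l}/articles/${slug}/`;
  return {
    canonical: urlFor(lang),
    alternates: LOCALES.map((l) => ({ hreflang: l, href: urlFor(l) })),
  };
}

console.log(headLinks("zh", "foo").canonical);
// https://site.com/zh/articles/foo/
```

The canonical changes with the page's language, while the alternates list is identical on every translation of the same article.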

Step 2: Lock down trailing slash and casing

In astro.config.mjs:

import { defineConfig } from "astro/config";

export default defineConfig({
  trailingSlash: "always",
  build: { format: "directory" },
  site: "https://site.com",
});

Canonical URLs in the layout must match what the server actually serves. When possible, build them from Astro.site and Astro.url.pathname rather than a hard-coded string, so the origin comes from config and the path is the one the page is served at.
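The same construction can be sketched in plain JavaScript. In an Astro layout the two inputs would be Astro.site and Astro.url.pathname; canonicalFor is a hypothetical helper, and the trailing-slash normalization assumes the "always" policy and is meant for page routes only, not file URLs such as /rss.xml:

```javascript
// Hypothetical helper: resolve a page path against the site origin,
// enforcing a trailing slash to match trailingSlash: "always".
function canonicalFor(site, pathname) {
  const withSlash = pathname.endsWith("/") ? pathname : `${pathname}/`;
  return new URL(withSlash, site).href;
}

console.log(canonicalFor("https://site.com", "/zh/articles/foo"));
// https://site.com/zh/articles/foo/
```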

Step 3: Verify with curl + view-source on samples

Sample one article in each language and one in each major category:

for url in \
  https://site.com/en/articles/foo/ \
  https://site.com/zh/articles/foo/ \
  https://site.com/en/articles/bar/ \
  https://site.com/zh/articles/bar/
do
  echo "=== $url ==="
  curl -s "$url" | grep -E 'rel="(canonical|alternate)"'
done

Each page’s canonical should match its own URL. Each pair’s alternates should reciprocate.
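Reciprocity is tedious to eyeball across many pages, so it may be worth scripting. A minimal sketch, assuming you have already collected each sampled page's alternate hrefs into an object keyed by page URL; findNonReciprocal is a hypothetical helper:

```javascript
// pages: { [pageUrl]: array of alternate hrefs found on that page } (assumed shape)
// Returns every alternate link that the target page does not link back.
function findNonReciprocal(pages) {
  const missing = [];
  for (const [page, alternates] of Object.entries(pages)) {
    for (const alt of alternates) {
      if (alt === page) continue; // self-reference is expected
      const back = pages[alt];
      if (!back || !back.includes(page)) missing.push({ from: page, to: alt });
    }
  }
  return missing;
}

const en = "https://site.com/en/articles/foo/";
const zh = "https://site.com/zh/articles/foo/";
console.log(findNonReciprocal({ [en]: [en, zh], [zh]: [zh] }));
// one entry: EN links to ZH, but ZH does not link back
```

Alternates that point at pages outside the sample are reported as well, since their return links cannot be confirmed.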

Step 4: Add a postbuild assertion

Catch regressions:

// scripts/audit-canonical.mjs
// Walks the build output and checks that every page's canonical is its own URL.
import fs from "node:fs";
import path from "node:path";

const distRoot = "dist";
const SITE = "https://site.com"; // must match `site` in astro.config.mjs
let problems = 0;

function walk(dir) {
  for (const e of fs.readdirSync(dir, { withFileTypes: true })) {
    const p = path.join(dir, e.name);
    if (e.isDirectory()) { walk(p); continue; }
    if (e.name !== "index.html") continue;
    const html = fs.readFileSync(p, "utf8");
    // Assumes rel comes before href, as emitted by the layout in Step 1.
    const m = html.match(/<link\s+rel="canonical"\s+href="([^"]+)"/);
    if (!m) { console.error(`MISSING canonical: ${p}`); problems++; continue; }
    // Reconstruct the expected URL from the file path (normalizing Windows separators).
    const rel = path.relative(distRoot, path.dirname(p)).split(path.sep).join("/");
    const expected = rel ? `${SITE}/${rel}/` : `${SITE}/`;
    if (m[1] !== expected) {
      console.error(`WRONG canonical: ${p} -> ${m[1]} (expected ${expected})`);
      problems++;
    }
  }
}
walk(distRoot);
// A non-zero exit code fails the build step that runs this script.
process.exit(problems > 0 ? 1 : 0);

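Wire it to a postbuild step (for example, a "postbuild" script in package.json that runs node scripts/audit-canonical.mjs) so a missing or wrong canonical fails the build.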

Step 5: Request reindexing on previously deindexed pages

In Search Console, for pages stuck on “Alternate page with proper canonical tag,” request indexing once the canonical is fixed. The fix only takes effect on Google’s next crawl.

Prevention

  • Canonical computed from current page URL, never hard-coded
  • Trailing-slash policy locked in Astro config; canonical matches served URL exactly
  • Postbuild audit: every page has a canonical matching its own URL
  • SEO plugin (if any) configured to NOT consolidate translations
  • No query strings or fragments in canonical
  • Cross-domain syndication uses canonical-to-self on the original site; only the syndicated copy canonicalizes back
  • Sample view-source check after major template changes
