You audit your tag pages and find 23 of them showing “No articles found.” The tags were used by articles you later deleted, or by drafts that never published. The empty pages still render, still appear in the sitemap, and still get crawled by Google. Search Console flags them as “Crawled - currently not indexed” or, worse, “Soft 404.” They add nothing for users and can drag down how search engines assess the site’s overall quality.
Empty tag pages are the natural side effect of any content cleanup. The fix has two layers: backfill (audit existing empty tags, deindex or 410 them) and prevention (prebuild rule: a tag must have at least N published, non-draft articles to be rendered).
Common causes
1. Articles using a tag were deleted
You bulk-deleted thin articles. Some of them were the only users of a particular tag. The tag’s archive page now has zero content.
How to spot it: list all tags used in frontmatter and check article counts.
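// scripts/count-tags.mjs
import fs from "node:fs";
import path from "node:path";
import matter from "gray-matter";

const ROOT = "src/content/articles";
// Recursive readdir needs Node 18.17+; on older versions, walk the tree manually.
const files = fs.readdirSync(ROOT, { recursive: true })
  .filter((f) => f.endsWith(".mdx"))
  .map((f) => path.join(ROOT, f));

const counts = new Map(); // tag -> number of published articles
for (const f of files) {
  const { data } = matter(fs.readFileSync(f, "utf8"));
  if (data.draft) continue; // drafts do not count as published
  for (const t of (data.tags || [])) {
    counts.set(t, (counts.get(t) || 0) + 1);
  }
}
// Then compare against the tags declared in your tag config / generator.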
Any tag in your tag config with a count of 0 is an orphan.
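The comparison that the script's final comment leaves open could look like the sketch below. It assumes your tag config can be loaded as a plain array of tag strings (`declaredTags` is a hypothetical name, as is `findOrphanTags`) and that `counts` is the Map built by the script above.

```javascript
// Sketch: find tags that are declared in the tag config but have no
// published articles. `declaredTags` is a hypothetical array of tag strings.
function findOrphanTags(declaredTags, counts) {
  return declaredTags.filter((tag) => (counts.get(tag) || 0) === 0).sort();
}

// Example with made-up data:
const exampleCounts = new Map([["ai-tools", 4], ["seo", 1]]);
const exampleDeclared = ["seo", "ai-tools", "web3", "ai-tool"];
console.log(findOrphanTags(exampleDeclared, exampleCounts));
// → [ 'ai-tool', 'web3' ]
```

A tag that is missing from `counts` is treated the same as a tag with a count of 0, which is what you want here: the counting script never creates an entry for a tag that no published article uses.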
2. Articles using a tag were all set to draft
Bulk draft-set during a quality push left some tags with zero published articles. The tag pages render the “no articles” empty state.
How to spot it: run the same script. It already skips draft: true articles, so a tag used only by drafts ends up with no count at all.
3. Tag taxonomy has typos creating dupes
You have ai-tools and ai-tool as separate tags. The plural has articles, the singular doesn’t (or vice versa). One archive page is empty; the other has the content.
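How to spot it: list every distinct tag in alphabetical order and scan for near-duplicates. The one-liner below assumes tags are written as an inline array (tags: [a, b]) on a single line:
grep -rh '^tags:' src/content/articles/ | sed 's/^tags:[[:space:]]*//' | tr -d '[]"' | tr ',' '\n' | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' | sort -u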
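Scanning a sorted list by eye gets unreliable once there are a few hundred tags. As a rough alternative, the sketch below flags tag pairs that are within one character edit of each other. The function names and the distance threshold are illustrative choices, not part of any library.

```javascript
// Sketch: flag tag pairs that differ by at most `maxDistance` single-character
// edits (insert, delete, or substitute).
function editDistance(a, b) {
  // Levenshtein distance, keeping only one row of the table in memory.
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let prev = row[0]; // diagonal cell from the previous row
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const tmp = row[j];
      row[j] = Math.min(
        row[j] + 1,                            // deletion
        row[j - 1] + 1,                        // insertion
        prev + (a[i - 1] === b[j - 1] ? 0 : 1) // substitution or match
      );
      prev = tmp;
    }
  }
  return row[b.length];
}

function findNearDuplicates(tags, maxDistance = 1) {
  const sorted = [...new Set(tags)].sort();
  const pairs = [];
  for (let i = 0; i < sorted.length; i++) {
    for (let j = i + 1; j < sorted.length; j++) {
      if (editDistance(sorted[i], sorted[j]) <= maxDistance) {
        pairs.push([sorted[i], sorted[j]]);
      }
    }
  }
  return pairs;
}

console.log(findNearDuplicates(["ai-tools", "seo", "ai-tool", "chatgpt", "chat-gpt"]));
// → [ [ 'ai-tool', 'ai-tools' ], [ 'chat-gpt', 'chatgpt' ] ]
```

Very short tags produce false positives at distance 1 (two unrelated three-letter tags can differ by one character), so treat the output as a review list rather than an automatic merge list.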
4. Auto-generated tag pages for every frontmatter value
Your tag generator walks frontmatter and creates a page for every distinct tag value, with no minimum count. One-off tags (tag: "expirimental-feature" in a single article) get a tag page that will probably stay near-empty.
How to spot it: count articles per tag; any tag with 1 article is a “thin tag,” any with 0 is orphan.
5. Tag rename without redirect
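You renamed chatgpt to chat-gpt (or back). No article references the old tag any more, but its URL is still listed in the sitemap and may still be linked or indexed. Depending on how your tag pages are generated, that URL now returns a 404 or renders an empty page (a soft 404).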
How to spot it: tags in URL paths that don’t appear in any current article frontmatter.
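That check can be automated once you have the sitemap URLs and the current tag list in hand. The sketch below assumes tag URLs end in /tags/<tag>/ and that `liveTags` holds every tag found in current frontmatter. Both names, and the example.com URLs, are placeholders.

```javascript
// Sketch: list sitemap URLs whose tag slug is no longer used by any article.
// Assumes tag URLs end in /tags/<tag>/ (with or without the trailing slash).
function staleTagUrls(sitemapUrls, liveTags) {
  const live = new Set(liveTags);
  const pattern = /\/tags\/([^/]+)\/?$/;
  return sitemapUrls.filter((url) => {
    const m = new URL(url).pathname.match(pattern);
    // Non-tag URLs do not match the pattern and are ignored.
    return m !== null && !live.has(decodeURIComponent(m[1]));
  });
}

const exampleSitemap = [
  "https://example.com/en/tags/chatgpt/",
  "https://example.com/en/tags/chat-gpt/",
  "https://example.com/en/about/",
];
console.log(staleTagUrls(exampleSitemap, ["chat-gpt"]));
// → [ 'https://example.com/en/tags/chatgpt/' ]
```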
Shortest path to fix
Step 1: Inventory orphan tags
Build a complete tag usage report:
// scripts/audit-tag-usage.mjs
import fs from "node:fs";
import path from "node:path";
import matter from "gray-matter";

const counts = new Map(); // tag -> number of published articles
const seen = new Set();   // every tag found in any frontmatter, drafts included

function walk(dir) {
  for (const e of fs.readdirSync(dir, { withFileTypes: true })) {
    const p = path.join(dir, e.name);
    if (e.isDirectory()) { walk(p); continue; }
    if (!p.endsWith(".mdx")) continue;
    const { data } = matter(fs.readFileSync(p, "utf8"));
    for (const t of (data.tags || [])) {
      seen.add(t);
      // Only published articles count toward a tag's total.
      if (!data.draft) counts.set(t, (counts.get(t) || 0) + 1);
    }
  }
}

walk("src/content/articles");

// Tags that appear only in drafts never get a count: report them as orphans.
for (const tag of seen) {
  if (!counts.has(tag)) console.log(`ORPHAN tag "${tag}": 0 articles`);
}
for (const [tag, n] of [...counts.entries()].sort((a, b) => a[1] - b[1])) {
  if (n <= 1) console.log(`THIN tag "${tag}": ${n} articles`);
}
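Tags with 0 published articles are orphans; a tag with 1 is “thin” and probably should not have a tag page either. Tags that exist only in your tag config or in old URLs never appear in frontmatter, so compare against those lists separately.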
Step 2: Decide per tag — deindex, merge, or backfill
For each orphan/thin tag:
- Merge into a similar tag (rename in frontmatter across articles, redirect tag page)
- Deindex the tag page (noindex meta + drop from sitemap)
- Return 410 Gone from the server for the tag URL
- If valuable, hand-write 2+ articles to populate it
Merging is usually best: it collapses near-duplicates and concentrates authority on a single archive page.
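A merge has two mechanical parts: rewriting the tag lists in frontmatter and redirecting the retired tag URLs. The sketch below covers both as pure functions. The merge map, the function names, and the URL layout (/<lang>/tags/<tag>/) are assumptions for illustration, and the redirect lines use the "from to status" form of Netlify-style _redirects files, which may differ on your host.

```javascript
// Sketch: apply a merge map to one article's tag list.
// `merges` is a Map from each retired tag to the tag that replaces it.
function mergeTags(tags, merges) {
  const out = [];
  for (const t of tags) {
    const target = merges.get(t) ?? t;
    if (!out.includes(target)) out.push(target); // avoid duplicates after merging
  }
  return out;
}

// Sketch: one 301 redirect line per retired tag and language.
function redirectLines(merges, langs) {
  const lines = [];
  for (const [from, to] of merges) {
    for (const lang of langs) {
      lines.push(`/${lang}/tags/${from}/ /${lang}/tags/${to}/ 301`);
    }
  }
  return lines;
}

const exampleMerges = new Map([["ai-tool", "ai-tools"]]);
console.log(mergeTags(["ai-tool", "seo", "ai-tools"], exampleMerges));
// → [ 'ai-tools', 'seo' ]
console.log(redirectLines(exampleMerges, ["en", "zh"]).join("\n"));
```

To apply `mergeTags` to real files, you would parse each article's frontmatter, replace `data.tags` with the merged list, and write the file back, ideally in one commit together with the redirect lines.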
Step 3: Add a prebuild rule: tag must have N+ articles
Set a minimum (e.g. 3 published articles) for a tag to render its archive page:
// src/pages/[lang]/tags/[tag].astro
import { getCollection } from "astro:content";

export async function getStaticPaths() {
  const all = await getCollection("articles");
  const MIN = 3;
  const counts = new Map(); // "lang:tag" -> number of published articles
  for (const a of all) {
    if (a.data.draft) continue;
    for (const t of (a.data.tags || [])) {
      const key = `${a.data.lang}:${t}`;
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  return [...counts.entries()]
    .filter(([, n]) => n >= MIN)
    .map(([key]) => {
      // Split on the first ":" only, so a tag that contains ":" stays intact.
      const i = key.indexOf(":");
      return { params: { lang: key.slice(0, i), tag: key.slice(i + 1) } };
    });
}
Now only tags with 3+ articles get a page. Empty tag pages cease to exist.
Step 4: Drop deindexed/removed tags from the sitemap
Whatever generates the sitemap must also respect the MIN threshold:
// src/pages/sitemap.xml.ts (sketch)
const tagsToInclude = /* same filter as above */;
// Only emit <url> entries for those.
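One way to keep the tag route and the sitemap in agreement is to move the filter into a single shared helper that both call. The sketch below assumes each article looks like `{ data: { lang, tags, draft } }`. The helper names and the `site` argument are hypothetical.

```javascript
// Sketch: one function that both the tag route and the sitemap can call,
// so the two never disagree about which tag pages exist.
function eligibleTagPages(articles, min = 3) {
  const counts = new Map();
  for (const a of articles) {
    if (a.data.draft) continue; // drafts never count toward the threshold
    for (const t of a.data.tags || []) {
      const key = JSON.stringify([a.data.lang, t]); // unambiguous composite key
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  return [...counts]
    .filter(([, n]) => n >= min)
    .map(([key]) => {
      const [lang, tag] = JSON.parse(key);
      return { lang, tag };
    });
}

// Sitemap side: turn the eligible pages into absolute URLs.
function tagSitemapUrls(articles, site, min = 3) {
  return eligibleTagPages(articles, min).map(
    ({ lang, tag }) => new URL(`/${lang}/tags/${encodeURIComponent(tag)}/`, site).href
  );
}
```

The route would map each `{ lang, tag }` to `{ params: { lang, tag } }`, while the sitemap calls `tagSitemapUrls`. Changing the threshold then takes effect in both places at once.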
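If a tag URL was previously indexed, also return a 410 for it so crawlers get an explicit signal that the page is gone. With Astro’s static output, that has to happen in your hosting layer, for example in a redirects or rules file (Netlify _redirects, Cloudflare Pages _redirects, etc.). The status codes and syntax those files accept vary by host, so treat the lines below as illustrative and check your host’s documentation: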
/en/tags/deprecated-tag/ 410!
/zh/tags/deprecated-tag/ 410!
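Because support for 410 differs between hosts, it is worth confirming what the removed URLs actually return after a deploy. The sketch below is one way to do that with the global `fetch` in Node 18+. The function names and result labels are made up for this example.

```javascript
// Sketch: classify the HTTP status a removed tag URL returns.
function classifyRemovedUrl(status) {
  if (status === 410) return "gone";                      // the intended result
  if (status === 404) return "not-found";                 // missing, but not an explicit "gone"
  if (status >= 300 && status < 400) return "redirect";   // e.g. a merged tag
  if (status >= 200 && status < 300) return "still-live"; // page is still being served
  return "other";
}

// Request each URL without following redirects and report what came back.
// Assumes Node 18+ (global fetch); some hosts answer HEAD differently from GET.
async function checkRemovedUrls(urls) {
  const results = [];
  for (const url of urls) {
    const res = await fetch(url, { method: "HEAD", redirect: "manual" });
    results.push({ url, status: res.status, result: classifyRemovedUrl(res.status) });
  }
  return results;
}

// Usage (placeholder URL):
// checkRemovedUrls(["https://example.com/en/tags/deprecated-tag/"]).then(console.table);
```

Any URL that comes back as "still-live" is a candidate soft 404 and means the rule did not take effect.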
Step 5: Request reindex of the tag index (if you have one)
If you have a /tags/ overview page that lists all tags, request reindexing so Google sees the updated list. The individual removed tag URLs need no request: Google should drop them once it recrawls them and sees the 410.
Prevention
- Tag generator enforces min N articles per tag (3 is reasonable)
- Sitemap excludes any tag below threshold
- Removed tags return 410 from the hosting layer
- Tag rename requires updating every article that uses the tag, and redirecting the old tag URL, in the same PR
- Quarterly tag audit; merge near-duplicate tags, prune thin ones
- Frontmatter tag values validated against a controlled vocabulary (optional but cleanest)
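The controlled-vocabulary check in the last item can be a small prebuild step. The sketch below assumes the vocabulary is available as an array of allowed tag slugs. `validateTags`, the file path, and the sample vocabulary are all illustrative.

```javascript
// Sketch: validate one article's tags against a controlled vocabulary.
// `vocabulary` is a hypothetical list of allowed tag slugs.
function validateTags(file, tags, vocabulary) {
  const allowed = new Set(vocabulary);
  const errors = [];
  for (const t of tags) {
    if (!allowed.has(t)) errors.push(`${file}: unknown tag "${t}"`);
  }
  return errors;
}

const exampleVocabulary = ["ai-tools", "seo", "chat-gpt"];
const exampleErrors = validateTags(
  "src/content/articles/example.mdx",
  ["seo", "expirimental-feature"],
  exampleVocabulary
);
console.log(exampleErrors);
// → [ 'src/content/articles/example.mdx: unknown tag "expirimental-feature"' ]
```

In a prebuild script you would collect the errors from every article and exit with a non-zero status if there are any, so a misspelled or one-off tag fails the build instead of producing a new tag page.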
Related
- Too Many Tags Thin Archives
- Too Many Thin Pages
- Category Page Too Weak
- Search Console Low Value URLs
- Content Site Broken Internal Link Rot
- Orphan Content Pages