You search site:vercel.app your-project and find dozens of preview URLs (your-site-git-feature-x.vercel.app, your-site-abc123.vercel.app, deployment hashes you forgot existed) indexed by Google. Worse, some of them rank for your brand keyword instead of your real domain. This happens because preview hosts on Vercel, Netlify, and Cloudflare Pages are public by default. Without an explicit noindex header or robots block keyed to the host, Googlebot can crawl any preview URL it finds linked from a public page (a shared Slack archive, a PR comment, a published Notion doc, a stale link in a tweet) and treat that preview as a real page. Duplicate content and brand cannibalization follow: the previews compete with your production pages for the same queries.
Common causes
Ordered by how each leaked into the index.
1. No X-Robots-Tag: noindex on preview hostnames
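Do not assume Vercel or Netlify adds noindex to every preview hostname for you. Behavior can differ by provider and by URL type (generated deployment URLs, branch aliases, custom domains assigned to preview branches), so check the response yourself. If the header is missing and your codebase does not detect the preview hostname and emit it, these are real, crawlable pages and Google can index them.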
How to spot it: curl -I https://your-site-abc123.vercel.app/ does not contain x-robots-tag: noindex.
2. Preview URL shared in a public Slack/Discord with link unfurling
Slack and Discord both fetch a URL preview when a link is posted. That fetch does not by itself expose the URL to Google, but the message can: if the channel is later mirrored to any public archive (a Discourse forum, a public Slack export, a GitHub issue), the link becomes crawlable.
How to spot it: Indexed preview URLs trace back to a Slack/Discord share log. The link appears in a Google site: search for the channel’s web archive.
3. PR comment with preview link visible in a public GitHub repo
The Vercel/Netlify GitHub integration auto-posts a preview URL as a comment. Public repos make these comments world-readable. Google crawls public GitHub, follows the link, indexes the preview.
How to spot it: Indexed previews correspond 1-to-1 with PRs in a public repo. site:vercel.app + your project name shows them in roughly PR-number order.
4. Production code accidentally outputs preview URLs in sitemap
A bug in the sitemap generator uses process.env.VERCEL_URL instead of a hardcoded canonical domain, so preview deploys emit a sitemap pointing at themselves.
How to spot it: curl https://your-site-abc123.vercel.app/sitemap.xml contains URLs starting with https://your-site-abc123.vercel.app/, not the canonical domain.
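The sitemap check above can also be scripted. The following is a minimal sketch, assuming the sitemap uses plain `<loc>` elements; the function name `leakedSitemapUrls` is hypothetical and not part of any library.

```typescript
// Hypothetical helper: list sitemap <loc> entries whose host is not the canonical domain.
function leakedSitemapUrls(sitemapXml: string, canonicalHost: string): string[] {
  const leaked: string[] = [];
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match: RegExpExecArray | null;
  while ((match = locPattern.exec(sitemapXml)) !== null) {
    const loc = match[1];
    let host = "";
    try {
      host = new URL(loc).hostname;
    } catch {
      // Relative or malformed entries are flagged too: a sitemap should hold absolute URLs.
    }
    if (host !== canonicalHost) leaked.push(loc);
  }
  return leaked;
}
```

An empty result means every entry points at the canonical domain. Anything else is a URL the sitemap generator built from a per-deploy host.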
5. Preview deploys have password protection off
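Vercel “Deployment Protection” can gate previews behind Vercel Authentication or a password (“Password Protection”). Which options are available, and whether any of them is on by default, depends on your plan and on when the project was created, so check the setting explicitly instead of assuming it is enabled.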
How to spot it: Open an incognito browser, paste a preview URL — it loads with no auth prompt.
6. Canonical tags point at the preview host, not production
A tag such as <link rel="canonical" href="${import.meta.env.SITE}/path">, where SITE is set per deploy, ends up containing the preview domain. Google reads that canonical and treats the preview URL as the preferred version.
How to spot it: View source on a preview deploy. The <link rel="canonical"> href starts with your-site-abc123.vercel.app, not your real domain.
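The same inspection can be automated against fetched HTML. This is a rough sketch that uses regular expressions rather than a full HTML parser, so it assumes a conventional quoted `rel` and `href`; both function names are hypothetical.

```typescript
// Hypothetical helper: pull the href out of the first <link rel="canonical"> tag.
function canonicalHref(html: string): string | null {
  const tag = html.match(/<link\b[^>]*\brel\s*=\s*["']canonical["'][^>]*>/i);
  if (!tag) return null;
  const href = tag[0].match(/\bhref\s*=\s*["']([^"']+)["']/i);
  return href ? href[1] : null;
}

// True only when a canonical tag exists and its host is the production host.
function canonicalPointsAtProduction(html: string, productionHost: string): boolean {
  const href = canonicalHref(html);
  if (!href) return false;
  try {
    return new URL(href).hostname === productionHost;
  } catch {
    return false; // relative or malformed canonical
  }
}
```

Running it against the HTML of a preview deploy should return true once the canonical is pinned to production.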
7. Old preview URLs still resolve and have inbound links
Vercel keeps deployment URLs forever by default (a “feature” for permalinks). If old preview URLs were posted publicly months ago, Google’s crawl queue still has them and will recheck periodically.
How to spot it: Indexed preview URLs include ones tied to deployments from many months ago. Even after fixing config today, old ones persist.
Before you start
- Run site:vercel.app your-project and site:netlify.app your-project (or your provider’s preview suffix). Note the count.
- Check Google Search Console for any “Duplicate without user-selected canonical” or “Crawled - currently not indexed” warnings.
- Determine if you have access to the deploy provider’s “Deployment Protection” setting.
- Verify what process.env.VERCEL_URL or process.env.URL resolves to on a preview build (it changes per deploy).
- Note whether your repo is public: that drastically changes the leak surface area.
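To see what the deploy-time variables resolve to, one option is to log them during a preview build. The sketch below assumes the provider-set variables VERCEL_URL and VERCEL_ENV on Vercel, and CONTEXT, URL, and DEPLOY_PRIME_URL on Netlify. DEPLOY_PRIME_URL is not mentioned above and is included on the assumption that it is the per-deploy value you want to compare against URL. The function name `deployHostReport` is hypothetical.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: summarize which host the current build believes it is running on.
function deployHostReport(env: Env): { provider: string; context: string; host: string } {
  if (env.VERCEL_URL) {
    // VERCEL_URL is a bare hostname without a protocol.
    return { provider: "vercel", context: env.VERCEL_ENV ?? "unknown", host: env.VERCEL_URL };
  }
  if (env.DEPLOY_PRIME_URL || env.URL) {
    const raw = env.DEPLOY_PRIME_URL ?? env.URL ?? "";
    return {
      provider: "netlify",
      context: env.CONTEXT ?? "unknown",
      host: raw.replace(/^https?:\/\//, ""),
    };
  }
  return { provider: "unknown", context: "unknown", host: "" };
}

// Usage in a build script (Node): console.log(deployHostReport(process.env));
```

If the reported host on a preview build is anything other than your production domain, any canonical or sitemap URL derived from these variables will leak the preview host.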
Information to collect
- Sample of 5 indexed preview URLs and their corresponding deployment IDs.
- Output of curl -I against one preview URL: check for x-robots-tag, x-vercel-deployment-url.
- Your canonical-URL generation code (search for canonical, og:url, import.meta.env.SITE, VERCEL_URL).
- Your sitemap generation code (often scripts/build-sitemap.mjs or auto-generated).
- The provider’s deployment protection settings.
- Search Console “Pages” report filtered by vercel.app or netlify.app to count affected URLs.
Step-by-step fix
Ordered by what stops the bleeding first.
Step 1: Block preview hosts with a noindex header
For Vercel, add in vercel.json:
{
  "headers": [
    {
      "source": "/(.*)",
      "has": [
        { "type": "host", "value": ".*\\.vercel\\.app" }
      ],
      "headers": [
        { "key": "X-Robots-Tag", "value": "noindex, nofollow" }
      ]
    }
  ]
}
For Netlify, in netlify.toml:
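# [[headers]] blocks in netlify.toml apply to every deploy context, production included,
# so a blanket noindex there would deindex the live site.
# Instead, append the rule to the published _headers file on non-production builds only.
# "npm run build" and "dist" are assumptions; use your own build command and publish directory.
[context.deploy-preview]
  command = "npm run build && printf '/*\\n  X-Robots-Tag: noindex, nofollow\\n' >> dist/_headers"
[context.branch-deploy]
  command = "npm run build && printf '/*\\n  X-Robots-Tag: noindex, nofollow\\n' >> dist/_headers"
# Optional flag that your own build code can read.
[context.deploy-preview.environment]
  ROBOTS_NOINDEX = "true"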
For Astro / framework-level approach:
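---
// Astro.url carries the real request host only on on-demand (server-rendered) pages.
// In a static build it is derived from the configured site, so rely on the response header there.
const isPreview =
  Astro.url.hostname.endsWith(".vercel.app") ||
  Astro.url.hostname.endsWith(".netlify.app");
---
{isPreview && <meta name="robots" content="noindex, nofollow" />}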
This stops new crawls. Existing indexed URLs need step 4 to evict.
Step 2: Fix canonical generation to always use the production hostname
Hardcode or env-pin the canonical:
// src/lib/site.ts
export const SITE = "https://your-site.com"; // never derived from VERCEL_URL

Then, in the layout that renders <head>, after importing SITE from src/lib/site.ts:
<link rel="canonical" href={new URL(Astro.url.pathname, SITE).toString()} />
Even on a preview deploy, the canonical now points at production. Google consolidates ranking signals there.
Step 3: Lock down preview deployments
Vercel: Project → Settings → Deployment Protection → enable “Vercel Authentication” or “Password Protection” for preview and development.
Netlify: Site → Site Configuration → Visitor Access → “Password protection” for branch deploys and deploy previews.
Now preview URLs require a login or password; Googlebot cannot crawl them.
Step 4: Submit the indexed preview URLs for removal
For URLs already in Google’s index:
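- Google Search Console → Removals → New Request → choose “Remove all URLs with this prefix” and enter https://your-site-abc123.vercel.app/. The request has to be made from a verified property that covers that host; add the property only for the removal and delete it once the URLs are gone.
- Submit. The removal is temporary (about 6 months) and usually takes effect within about a day.
- For permanence, also serve 404 or 410 from those hosts (combined with step 1’s noindex).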
This is the only fast way to clear indexed previews — relying on natural recrawl can take weeks. See old deployment url in search for the long-tail removal flow.
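When many preview hosts are indexed, it helps to reduce the list of indexed URLs to one prefix per host before filing requests. A minimal sketch, assuming you have exported the indexed URLs as strings; the function name `removalPrefixes` is hypothetical.

```typescript
// Hypothetical helper: collapse indexed URLs into one removal prefix per non-production host.
function removalPrefixes(indexedUrls: string[], productionHost: string): string[] {
  const seen: Record<string, true> = {};
  for (const raw of indexedUrls) {
    let url: URL;
    try {
      url = new URL(raw);
    } catch {
      continue; // skip anything that is not an absolute URL
    }
    if (url.hostname === productionHost) continue; // never request removal of production
    seen[`${url.protocol}//${url.host}/`] = true;
  }
  return Object.keys(seen).sort();
}
```

Each returned string is a candidate for a single prefix-level removal request, which is fewer submissions than one request per page.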
Step 5: Audit your sitemap for leaked URLs
Run on a preview build:
npm run build
grep -E "vercel\.app|netlify\.app" dist/sitemap*.xml
If matches appear, your sitemap generator is using a per-deploy host. Replace it with the hardcoded SITE:
import { SITE } from "./site";
const urls = posts.map((p) => `${SITE}${p.url}`);
Step 6: Disable Vercel preview comments on the GitHub PR
If the repo is public:
Vercel → Project → Settings → Git → uncheck “Comments on Pull Requests” or set to “Off”.
This stops Google from harvesting new preview URLs via GitHub crawl. Old PR comments are still in Google’s history, so step 4 remains necessary for those.
Step 7: Add a robots.txt at the preview level
A belt-and-suspenders measure. Detect host and serve different robots.txt:
// src/pages/robots.txt.ts
// Needs on-demand (server) rendering: a prerendered robots.txt cannot vary by request host.
export const GET = ({ request }: { request: Request }) => {
  const url = new URL(request.url);
  const isPreview =
    url.hostname.endsWith(".vercel.app") ||
    url.hostname.endsWith(".netlify.app");
  const body = isPreview
    ? "User-agent: *\nDisallow: /\n"
    : "User-agent: *\nAllow: /\nSitemap: https://your-site.com/sitemap.xml\n";
  return new Response(body, { headers: { "content-type": "text/plain" } });
};
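Important: also keep X-Robots-Tag from step 1. robots.txt controls crawling, not indexing, and a Disallow rule stops Googlebot from fetching the page, which means it never sees the noindex header. URLs that are already indexed can therefore linger, so add the Disallow only after the indexed previews have dropped out. See robots txt not working for related debug paths.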
Verify
- curl -I https://your-site-abc123.vercel.app/ returns x-robots-tag: noindex, nofollow.
- curl https://your-site-abc123.vercel.app/sitemap.xml returns 404 or contains only canonical-domain URLs.
- View-source on a preview deploy shows <meta name="robots" content="noindex, nofollow">.
- Opening a preview URL in incognito prompts for auth (if deployment protection is enabled).
- Search Console “Removals” shows submitted preview URLs as “Approved”.
- After ~7 days, site:vercel.app your-project count drops sharply.
Long-term prevention
- Make noindex on preview hosts part of the committed config in vercel.json / netlify.toml, never relying on a runtime flag.
- Always set canonical from a single hardcoded SITE constant, never derived from a runtime env var.
- Keep deployment protection enabled by default for ALL non-production deployments.
- Add a CI check that runs curl -I against the deployed preview URL and fails if x-robots-tag is missing.
- In Search Console, keep the production domain as your only long-lived property; never keep a *.vercel.app property around (that would create more index surface).
- Audit site:vercel.app your-project quarterly; even one indexed preview can cannibalize brand search.
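The CI check from the list above could look roughly like this. It is a sketch, not a drop-in job: the names `hasNoindex`, `assertPreviewIsNoindexed`, and `HeadFetcher` are hypothetical, and how your pipeline learns the preview URL is left open. The fetcher is injected so that the header parsing can be tested without a network call.

```typescript
// Minimal shape needed from a fetch-like function.
type HeaderReader = { get(name: string): string | null };
type HeadFetcher = (url: string) => Promise<{ headers: HeaderReader }>;

// True when an X-Robots-Tag value contains noindex (or "none", which implies it).
function hasNoindex(headerValue: string | null): boolean {
  if (!headerValue) return false;
  return headerValue
    .toLowerCase()
    .split(",")
    .map((directive) => directive.trim().replace(/^[a-z0-9_-]+:\s*/, "")) // drop "googlebot:" style prefixes
    .some((directive) => directive === "noindex" || directive === "none");
}

// Rejects (and so fails the CI step) when the preview response lacks a noindex directive.
async function assertPreviewIsNoindexed(previewUrl: string, head: HeadFetcher): Promise<void> {
  const response = await head(previewUrl);
  if (!hasNoindex(response.headers.get("x-robots-tag"))) {
    throw new Error(`x-robots-tag: noindex missing on ${previewUrl}`);
  }
}

// Usage with the global fetch (Node 18+ or a browser-like runtime):
// assertPreviewIsNoindexed(previewUrl, (u) => fetch(u, { method: "HEAD" }));
```

If deployment protection is enabled, the response the check sees may be the auth gate rather than your page, so interpret a failure with that in mind.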
Common pitfalls
- Adding noindex to a <meta> tag but not to the response header: Google honors both, but the header is much harder to accidentally drop.
- Relying on robots.txt alone: it blocks crawling, not indexing, so already-known URLs can stay in the index. The noindex header is what evicts them, and Googlebot must be able to crawl the page to see it.
- Setting noindex globally (on production too) by accident when refactoring: always test on prod after the change.
- Forgetting that PR previews remain indexable for months after the PR is merged because the deployment URL stays alive. See duplicate domain versions indexed.
- Adding a Search Console property for *.vercel.app “to monitor”: that may signal Google to keep crawling them.
FAQ
Q: How long until indexed preview URLs drop from Google after I fix this?
A Search Console “Removals” request hides the URLs quickly, usually within about a day, but only for about 6 months. For permanent removal, return noindex headers AND wait for Google’s next crawl, usually 1-4 weeks per URL. Old, low-traffic URLs may persist longer.
Q: Should I 301 redirect preview URLs to production?
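For retired previews you no longer need to open, generally yes: a 301 lets Google pass any link equity to production and then drop the preview URL. Add to vercel.json a redirect from those preview hosts to the canonical equivalent. Do not redirect previews that are still under review, since that makes them unusable; noindex plus deployment protection is enough there.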
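The “canonical equivalent” of a preview URL is the same path and query on the production origin. A small sketch of that mapping follows; `PRODUCTION_ORIGIN` and `productionEquivalent` are hypothetical names, and the origin is the placeholder domain used throughout this page.

```typescript
const PRODUCTION_ORIGIN = "https://your-site.com"; // placeholder domain

// Hypothetical helper: map a preview URL to the same path and query on production.
function productionEquivalent(previewUrl: string, origin: string = PRODUCTION_ORIGIN): string {
  const source = new URL(previewUrl);
  const target = new URL(origin);
  // Assign the parts individually so an odd path (for example one starting with "//")
  // can never change the target host.
  target.pathname = source.pathname;
  target.search = source.search;
  return target.toString();
}
```

The fragment is dropped on purpose, since it is never sent to the server and plays no part in a redirect rule.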
Q: Will noindex on preview break the Vercel-bot’s deployment check?
No — Vercel’s deployment check uses HTTP 200 status, not robots directives. The bot ignores X-Robots-Tag.
Q: My custom branch alias (e.g., staging.your-site.com) is leaking too. Same fix?
Yes — extend the host regex to include all non-production hostnames. Treat staging like a preview: noindex + deployment protection.
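One way to cover staging and any future alias without listing each of them is to invert the test: allowlist the production hosts and noindex everything else. A minimal sketch, assuming `your-site.com` and `www.your-site.com` are the only production hostnames; both the list and the name `shouldNoindex` are hypothetical.

```typescript
// Assumed production hostnames; adjust to the domains that should be indexed.
const PRODUCTION_HOSTS = ["your-site.com", "www.your-site.com"];

// Hypothetical helper: any host not explicitly marked as production gets noindex.
function shouldNoindex(hostname: string): boolean {
  const host = hostname
    .toLowerCase()
    .replace(/:\d+$/, "") // drop a port if a Host header value was passed in
    .replace(/\.$/, ""); // drop a trailing dot from a fully qualified name
  return PRODUCTION_HOSTS.indexOf(host) === -1;
}
```

The trade-off is that a new production domain stays noindexed until it is added to the list, which is usually the safer failure mode.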