Deploy Preview URLs Got Indexed by Google

Preview URLs from Vercel or Netlify can appear in Google's index and sometimes outrank your canonical domain. The usual cause is a missing noindex header or robots block on preview hosts.

You search site:vercel.app your-project and find dozens of preview URLs indexed by Google: your-site-git-feature-x.vercel.app, your-site-abc123.vercel.app, deployment hashes you forgot existed. Worse, some of them rank for your brand keyword instead of your real domain. This happens because preview hosts on Vercel, Netlify, and Cloudflare Pages are publicly reachable unless you protect them. Without an explicit noindex header or robots block keyed to the host, Googlebot can crawl any preview URL it finds linked on a public page (a Slack share that ends up in a public archive, a PR comment, a published Notion doc, a stale link in a tweet) and treat that preview as a real page. Google has no duplicate-content penalty as such, but ranking signals get split between the preview and your real domain, and brand cannibalization follows.

Common causes

Ordered by how each leaked into the index.

1. No X-Robots-Tag: noindex on preview hostnames

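Do not assume Vercel or Netlify adds noindex to every non-production hostname. Provider defaults differ by deployment type and have changed over time; a custom domain assigned to a non-production branch, for example, can be served without the header. These are real, crawlable pages, so if nothing emits the header for a given hostname, Google can index it.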

How to spot it: curl -I https://your-site-abc123.vercel.app/ does not contain x-robots-tag: noindex.
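
The same check can be scripted once the header value has been read from a response. The following is a minimal sketch, not a provider API: hasNoindex is a hypothetical helper name, and it assumes the value follows the usual X-Robots-Tag shape of comma-separated directives with an optional user-agent prefix (for example googlebot: noindex).

```typescript
// Hypothetical helper: returns true when an X-Robots-Tag value forbids indexing.
// Accepts values such as "noindex, nofollow", "none", or "googlebot: noindex".
function hasNoindex(headerValue: string | null): boolean {
  if (!headerValue) return false;
  return headerValue
    .toLowerCase()
    .split(",")
    .map((part) => part.trim())
    // Drop an optional "user-agent:" prefix in front of the directive.
    .map((part) => (part.includes(":") ? part.slice(part.indexOf(":") + 1).trim() : part))
    .some((directive) => directive === "noindex" || directive === "none");
}
```

A typical call would be hasNoindex(response.headers.get("x-robots-tag")) on the result of a HEAD request against the preview URL.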

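2. Preview links shared in Slack or Discord end up in public archives

Slack and Discord both fetch a URL preview when a link is posted. That fetch does not expose the URL to Google by itself, but if the message later lands in a public archive (a Discourse forum, a public Slack export, a GitHub issue), Googlebot can discover the link there and crawl the preview.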

How to spot it: Indexed preview URLs trace back to a Slack/Discord share log. The link appears in a Google site: search for the channel’s web archive.

3. PR comments in public repos expose preview URLs

The Vercel/Netlify GitHub integration auto-posts a preview URL as a comment. Public repos make these comments world-readable. Google crawls public GitHub, follows the link, and indexes the preview.

How to spot it: Indexed previews correspond 1-to-1 with PRs in a public repo. site:vercel.app + your project name shows them in roughly PR-number order.

4. Production code accidentally outputs preview URLs in sitemap

A bug in the sitemap generator uses process.env.VERCEL_URL instead of a hardcoded canonical domain, so preview deploys emit a sitemap pointing at themselves.

How to spot it: curl https://your-site-abc123.vercel.app/sitemap.xml contains URLs starting with https://your-site-abc123.vercel.app/, not the canonical domain.
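
To run this check over a whole sitemap instead of eyeballing it, something like the sketch below could work. foreignSitemapUrls is a hypothetical helper, and the regex assumes a simple sitemap where every URL sits in a plain <loc> element; it is not a full XML parser.

```typescript
// Hypothetical helper: list <loc> entries whose origin differs from the canonical one.
function foreignSitemapUrls(sitemapXml: string, canonicalOrigin: string): string[] {
  const expected = new URL(canonicalOrigin).origin;
  const pattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  const foreign: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(sitemapXml)) !== null) {
    const loc = match[1] ?? "";
    try {
      if (new URL(loc).origin !== expected) foreign.push(loc);
    } catch {
      foreign.push(loc); // Unparseable entries are worth flagging too.
    }
  }
  return foreign;
}
```

An empty result means every entry already points at the canonical origin; anything else is a URL leaking a per-deploy host.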

5. Preview deploys have password protection off

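Vercel “Deployment Protection” (the umbrella setting that includes Vercel Authentication and Password Protection) can gate previews behind a login or password. Defaults depend on your plan and on when the project was created, and Password Protection is not available on every plan, so check the setting explicitly rather than assuming it is on.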

How to spot it: Open an incognito browser, paste a preview URL — it loads with no auth prompt.

6. Canonical tags point at the preview host, not production

<link rel="canonical" href="${import.meta.env.SITE}/path"> where SITE is set per-deploy includes the preview domain. Google reads the canonical and treats the preview as canonical.

How to spot it: View source on a preview deploy. The <link rel="canonical"> href starts with your-site-abc123.vercel.app, not your real domain.
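
The view-source check can be automated against fetched HTML. This is a rough sketch under the assumption that the page contains a single, well-formed canonical link with an absolute href; canonicalHost is a hypothetical helper, and a regex is only adequate for a spot check, not for general HTML parsing.

```typescript
// Hypothetical helper: return the hostname of the canonical link, or null if
// the tag is missing or its href is not an absolute URL.
function canonicalHost(html: string): string | null {
  const tag = /<link\b[^>]*\brel=["']canonical["'][^>]*>/i.exec(html);
  if (!tag) return null;
  const href = /\bhref=["']([^"']+)["']/i.exec(tag[0]);
  if (!href) return null;
  try {
    return new URL(href[1] ?? "").hostname;
  } catch {
    return null;
  }
}
```

If the returned hostname ends in your provider's preview suffix rather than matching your real domain, the canonical is being derived from a per-deploy value.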

7. Old deployment URLs never expire

Vercel keeps deployment URLs forever by default (a “feature” for permalinks). If old preview URLs were posted publicly months ago, Google’s crawl queue still has them and will recheck periodically.

How to spot it: Indexed preview URLs include ones tied to deployments from many months ago. Even after fixing config today, old ones persist.

Before you start

  • Run site:vercel.app your-project and site:netlify.app your-project (or your provider’s preview suffix). Note the count.
  • Check Google Search Console for any “Duplicate without user-selected canonical” or “Crawled - currently not indexed” warnings.
  • Determine if you have access to the deploy provider’s “Deployment Protection” setting.
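  • Verify what process.env.VERCEL_URL (Vercel) or process.env.DEPLOY_URL / process.env.DEPLOY_PRIME_URL (Netlify) resolves to on a preview build. These change per deploy, whereas Netlify’s process.env.URL always holds the main site address.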
  • Note whether your repo is public — that drastically changes the leak surface area.

Information to collect

  • Sample of 5 indexed preview URLs and their corresponding deployment IDs.
  • Output of curl -I against one preview URL: check for x-robots-tag and x-vercel-id.
  • Your canonical-URL generation code (search for canonical, og:url, import.meta.env.SITE, VERCEL_URL).
  • Your sitemap generation code (often scripts/build-sitemap.mjs or auto-generated).
  • The provider’s deployment protection settings.
  • Search Console “Pages” report filtered by vercel.app or netlify.app to count affected URLs.

Step-by-step fix

Ordered by what stops the bleeding first.

Step 1: Block preview hosts with a noindex header

For Vercel, add in vercel.json:

{
  "headers": [
    {
      "source": "/(.*)",
      "has": [
        { "type": "host", "value": "(?!your-site\\.com$).*\\.vercel\\.app" }
      ],
      "headers": [
        { "key": "X-Robots-Tag", "value": "noindex, nofollow" }
      ]
    }
  ]
}

For Netlify, in netlify.toml:

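# [[headers]] rules apply to every deploy context, production included,
# so append the header to the published _headers file only for previews.
# Assumes the build command is "npm run build" and the publish directory is dist.
[context.deploy-preview]
  command = "npm run build && printf '\\n/*\\n  X-Robots-Tag: noindex, nofollow\\n' >> dist/_headers"
[context.branch-deploy]
  command = "npm run build && printf '\\n/*\\n  X-Robots-Tag: noindex, nofollow\\n' >> dist/_headers"
[context.deploy-preview.environment]
  ROBOTS_NOINDEX = "true"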

For Astro / framework-level approach:

---
const isPreview =
  Astro.url.hostname.endsWith(".vercel.app") ||
  Astro.url.hostname.endsWith(".netlify.app");
---
{isPreview && <meta name="robots" content="noindex, nofollow" />}

This stops new preview URLs from being indexed. Already-indexed URLs need step 4 to evict.
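
The suffix checks in this step only catch hosts ending in .vercel.app or .netlify.app. One possible variant, sketched here, inverts the logic into an allowlist so that staging aliases and any other non-production hostname are covered as well. PRODUCTION_HOSTS, isPreviewHost, and robotsHeaderFor are hypothetical names, and the two hostnames in the list are assumptions to replace with your own.

```typescript
// Assumption: these are the only hostnames that should ever be indexed.
const PRODUCTION_HOSTS = new Set(["your-site.com", "www.your-site.com"]);

// Anything not explicitly listed as production is treated as a preview.
function isPreviewHost(hostname: string): boolean {
  return !PRODUCTION_HOSTS.has(hostname.toLowerCase());
}

// Value to send as X-Robots-Tag, or null when the header should be omitted.
function robotsHeaderFor(hostname: string): string | null {
  return isPreviewHost(hostname) ? "noindex, nofollow" : null;
}
```

The trade-off is that a new production domain must be added to the list before launch, otherwise it is served with noindex; the failure mode is loud, which is usually preferable to a silent leak.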

Step 2: Fix canonical generation to always use the production hostname

Hardcode or env-pin the canonical:

// src/lib/site.ts
export const SITE = "https://your-site.com"; // never derived from VERCEL_URL

Then, in the layout’s <head>, after importing SITE from src/lib/site.ts:

<link rel="canonical" href={new URL(Astro.url.pathname, SITE).toString()} />

Even on a preview deploy, the canonical now points at production. Google treats rel="canonical" as a strong hint rather than a directive, and usually consolidates ranking signals there.

Step 3: Lock down preview deployments

Vercel: Project → Settings → Deployment Protection → enable “Vercel Authentication” or “Password Protection” for preview and development.

Netlify: Site → Site Configuration → Visitor Access → “Password protection” for branch deploys and deploy previews.

Now preview URLs require a login or password; Googlebot cannot crawl them.

Step 4: Submit the indexed preview URLs for removal

For URLs already in Google’s index:

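  1. Google Search Console → Removals → New Request → URL prefix https://your-site-abc123.vercel.app/. The Removals tool only works inside a verified property, so verify that preview host as a URL-prefix property first.
  2. Submit. Removal is temporary (about 6 months) and usually takes effect within a day.
  3. For permanence, also serve 404 or 410 from those hosts (combined with step 1’s noindex).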

This is the only fast way to clear indexed previews — relying on natural recrawl can take weeks. See old deployment url in search for the long-tail removal flow.
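
When many preview hosts are indexed, it helps to collapse the list of indexed URLs into one prefix per host before filing requests. A minimal sketch, assuming you have already exported the indexed URLs as absolute URLs; removalPrefixes is a hypothetical helper and the suffix list is an assumption to adjust for your provider.

```typescript
// Hypothetical helper: collapse indexed preview URLs into one URL prefix per host.
function removalPrefixes(indexedUrls: string[], previewSuffixes: string[]): string[] {
  const prefixes = new Set<string>();
  for (const raw of indexedUrls) {
    const { protocol, hostname } = new URL(raw);
    // Keep only hosts that end with one of the preview suffixes.
    if (previewSuffixes.some((suffix) => hostname.endsWith(suffix))) {
      prefixes.add(`${protocol}//${hostname}/`);
    }
  }
  return Array.from(prefixes).sort();
}
```

Each returned prefix corresponds to one removal request, which keeps the count manageable when a single deployment has dozens of indexed pages.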

Step 5: Audit your sitemap for leaked URLs

Run on a preview build:

npm run build
grep -E "vercel\.app|netlify\.app" dist/sitemap*.xml

If matches appear, your sitemap generator is using a per-deploy host. Replace it with the hardcoded SITE:

import { SITE } from "./site";
const urls = posts.map((p) => `${SITE}${p.url}`);

Step 6: Disable Vercel preview comments on the GitHub PR

If the repo is public:

Vercel → Project → Settings → Git → uncheck “Comments on Pull Requests” or set to “Off”.

This stops Google from harvesting new preview URLs via GitHub crawl. Old PR comments are still in Google’s history, so step 4 remains necessary for those.

Step 7: Add a robots.txt at the preview level

A belt-and-suspenders measure. Detect host and serve different robots.txt:

// src/pages/robots.txt.ts
export const GET = ({ request }: { request: Request }) => {
  const url = new URL(request.url);
  const isPreview =
    url.hostname.endsWith(".vercel.app") ||
    url.hostname.endsWith(".netlify.app");
  const body = isPreview
    ? "User-agent: *\nDisallow: /\n"
    : "User-agent: *\nAllow: /\nSitemap: https://your-site.com/sitemap.xml\n";
  return new Response(body, { headers: { "content-type": "text/plain" } });
};

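Important: also keep X-Robots-Tag from step 1. robots.txt controls crawling, not indexing: a Disallow stops Googlebot from fetching the page, so it never sees the noindex header and already-indexed URLs can linger. Add the Disallow only once the indexed previews have dropped out. See robots txt not working for related debug paths.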

Verify

  • curl -I https://your-site-abc123.vercel.app/ returns x-robots-tag: noindex, nofollow.
  • curl https://your-site-abc123.vercel.app/sitemap.xml returns 404 or contains only canonical-domain URLs.
  • View-source on a preview deploy shows <meta name="robots" content="noindex, nofollow" /> (if you used the framework-level approach from step 1).
  • Opening a preview URL in incognito prompts for auth (if deployment protection is enabled).
  • Search Console “Removals” shows submitted preview URLs as “Temporarily removed”.
  • After ~7 days, site:vercel.app your-project count drops sharply.

Long-term prevention

  • Make noindex on preview hosts a hardcoded header in vercel.json / netlify.toml, committed to the repo — never relying on a runtime flag.
  • Always set canonical from a single hardcoded SITE constant, never derived from a runtime env var.
  • Keep deployment protection enabled by default for ALL non-production deployments.
  • Add a CI check that runs curl -I against the deployed preview URL and fails if x-robots-tag is missing.
  • In Search Console, keep the production domain as your main property. Verifying a preview host does not change how Google crawls or indexes it, so add one only when you need it for a removal request.
  • Audit site:vercel.app your-project quarterly; even one indexed preview can cannibalize brand search.
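
The CI check from the list above could look roughly like this. It is a sketch, not a drop-in job: checkPreviewNoindex and the HeadFetcher type are hypothetical names, and the fetcher is passed in so the logic can be exercised without a network. In a real pipeline you would pass the global fetch and a preview URL supplied by your CI system.

```typescript
// Minimal fetch-like signature so the check can run against a stub in tests.
type HeadFetcher = (
  url: string,
  init: { method: "HEAD" },
) => Promise<{ headers: { get(name: string): string | null } }>;

// Hypothetical CI helper: resolves to an error message, or to null when the
// preview response carries an X-Robots-Tag containing noindex.
async function checkPreviewNoindex(
  previewUrl: string,
  fetcher: HeadFetcher,
): Promise<string | null> {
  const response = await fetcher(previewUrl, { method: "HEAD" });
  const value = response.headers.get("x-robots-tag") ?? "";
  return value.toLowerCase().includes("noindex")
    ? null
    : `${previewUrl} is missing x-robots-tag: noindex (got "${value}")`;
}
```

A CI step would await the result and exit non-zero when a message comes back, which turns a silently dropped header into a failed build.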

Common pitfalls

  • Adding noindex to a <meta> tag but not to the response header — Google honors both, but the header is much harder to accidentally drop.
  • Relying on robots.txt alone: a Disallow blocks crawling, not indexing, so already-known URLs can stay listed and Googlebot never sees the noindex. The header is what evicts.
  • Setting noindex globally (on production too) by accident when refactoring — always test on prod after the change.
  • Forgetting that PR previews remain indexable for months after the PR is merged because the deployment URL stays alive. See duplicate domain versions indexed.
  • Adding a Search Console property for a *.vercel.app host and expecting it to change crawling: verification neither encourages nor stops indexing, so it is not a fix on its own.

FAQ

Q: How long until indexed preview URLs drop from Google after I fix this?

Search Console “Removals” hides the URLs quickly, typically within a day, but only for about 6 months. For permanent removal, return noindex headers AND wait for Google’s next crawl, usually 1-4 weeks per URL. Old, low-traffic URLs may persist longer.

Q: Should I 301 redirect preview URLs to production?

Generally yes for hosts nobody needs to open anymore, since a redirect makes the preview unusable for review. The 301 lets Google pass any link equity to production and then drop the preview. Add to vercel.json a redirect from preview hosts to the canonical equivalent.

Q: Will noindex on preview break the Vercel-bot’s deployment check?

No — Vercel’s deployment check uses HTTP 200 status, not robots directives. The bot ignores X-Robots-Tag.

Q: My custom branch alias (e.g., staging.your-site.com) is leaking too. Same fix?

Yes — extend the host regex to include all non-production hostnames. Treat staging like a preview: noindex + deployment protection.
