Noindex vs Nofollow vs Disallow: When to Use Each

Three controls, three different jobs. Pick the wrong one and you either leak pages into the index, waste crawl budget, or hide content from yourself by accident.

Noindex, nofollow, and Disallow look interchangeable in the docs and behave wildly differently in production. One keeps a page out of search results, one is a hint about link signals, and one tells crawlers not to fetch the page at all. The nastiest failure comes from mixing them up: block a page from crawling and Google can no longer see the noindex that would let it drop the URL.

Background

Three separate mechanisms grew out of three separate problems. Disallow (in robots.txt) was the original 1994 standard: do not fetch this path. noindex (meta tag or HTTP header) came later for “fetch it, but do not show it in search results.” nofollow (originally an anti-spam signal on links) tells Google not to pass PageRank through a link. They live in different files, fire at different stages of crawling, and answer different questions.

How to tell

  • A staging or thank-you page is showing up in site:yoursite.com results.
  • Your Search Console Pages report flags “Indexed, though blocked by robots.txt” — the worst of both worlds.
  • You added Disallow to robots.txt to hide a page, and the URL still appears in search results without a snippet.
  • Internal links to login, cart, or admin are bleeding PageRank to non-indexable pages.

Quick verdict

Use noindex for pages you do not want in search results (cart, admin, internal tools, thin tag pages). Use Disallow only for paths you want crawlers to skip entirely, usually large dynamic surfaces (search, faceted filters). Use nofollow on links you do not vouch for (user-generated content, paid links). Never add noindex and Disallow to the same URL at the same time: the Disallow stops Google from fetching the page, so the noindex is never seen.

Disallow vs noindex — the trap

The most common bug: someone wants to hide a page, so they add it to Disallow in robots.txt. Google obeys and stops crawling, but the URL was already indexed, and now Google cannot fetch the page to see the noindex you also added. The URL can stay in the index indefinitely, usually listed without a description. The fix is to remove the Disallow, let Google re-crawl the page, see the noindex, and drop the URL cleanly. Only after the URL has left the index should you re-add Disallow, if you still want to block crawling.

<!-- Page-level noindex (the safe default for "do not show this") -->
<meta name="robots" content="noindex, follow">

# robots.txt — block crawling, not indexing
User-agent: *
Disallow: /search?
Disallow: /admin/
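
To check what a set of Disallow rules actually blocks before shipping them, a small script can help. The following is a minimal sketch using Python's standard urllib.robotparser; the rules are a simplified version of the ones above, and example.com, the paths, and the is_crawl_blocked helper are illustrative assumptions. The standard-library parser does plain prefix matching and may not reproduce every detail of Google's own matching, so treat the result as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Simplified rules for illustration; the domain and paths are placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /admin/
"""


def is_crawl_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the robots.txt text disallows fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/admin/users",
        "https://example.com/blog/post",
    ):
        state = "blocked" if is_crawl_blocked(ROBOTS_TXT, url) else "crawlable"
        print(url, state)
```

A blocked result only says the page will not be fetched. It says nothing about whether the URL is in the index.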

When nofollow is the right answer

nofollow is link-level, not page-level. It says “I do not trust where this link goes” or “this is a paid placement.” Use it on outbound links in user-generated content, paid review links, and affiliate-style placements where Google has explicit attribute names (rel="sponsored", rel="ugc"). Do not use nofollow to “save PageRank” by capping outbound links — that pattern stopped working years ago and now mostly signals manipulation.

<a href="https://example.com" rel="nofollow">untrusted destination</a>
<a href="https://partner.com" rel="sponsored">paid placement</a>
<a href="https://forum-comment.com" rel="ugc">user comment</a>
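
To see which links on a page carry these attributes, one option is to collect them with a short script. This is a minimal sketch built on Python's standard html.parser; the LinkRelCollector class and collect_link_rels function are hypothetical names for illustration, and the sample markup mirrors the links above.

```python
from html.parser import HTMLParser

# rel tokens that qualify a link, as discussed above.
QUALIFIERS = {"nofollow", "sponsored", "ugc"}


class LinkRelCollector(HTMLParser):
    """Collect (href, qualifying rel tokens) for every <a href> in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel is a space-separated token list; keep only the qualifying tokens.
        rel_tokens = set((attr_map.get("rel") or "").lower().split())
        self.links.append((href, sorted(rel_tokens & QUALIFIERS)))


def collect_link_rels(html: str):
    """Return a list of (href, [qualifiers]) tuples in document order."""
    collector = LinkRelCollector()
    collector.feed(html)
    return collector.links


if __name__ == "__main__":
    sample = (
        '<a href="https://example.com" rel="nofollow">untrusted</a>'
        '<a href="https://partner.com" rel="sponsored">paid</a>'
        '<a href="/about">internal</a>'
    )
    for href, qualifiers in collect_link_rels(sample):
        print(href, qualifiers or "no qualifier")
```

Links that come back with an empty list are the ones passing signals normally, which is what you want for internal navigation.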

Decision table

  • Page should not appear in search results, but Google can crawl it: noindex only.
  • Crawler should never fetch this path (heavy, low-value, infinite): Disallow only, and accept that orphan URLs may still appear without snippet.
  • You want to stop passing trust through a link: nofollow or sponsored / ugc.
  • You want a page truly gone: noindex first, wait for re-crawl, then optionally Disallow.
  • You want to deindex an entire site temporarily (staging): noindex at the response-header level, or HTTP auth — never just Disallow.

HTTP header variant for non-HTML

noindex lives in two places: the <meta> tag (HTML pages) and the X-Robots-Tag response header (everything else). PDFs, JSON endpoints, images, and any URL whose response is not HTML cannot carry a meta tag — use the header form on the server or CDN layer.

X-Robots-Tag: noindex, follow

Most hosts let you set this in a config file: on Firebase Hosting, in the headers block of firebase.json; on Nginx, with add_header inside a location block; on Cloudflare, via a Worker or a Transform Rule. The directive means the same thing as the meta tag. If a URL carries both and they conflict, Google applies the more restrictive one.
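
Because noindex can arrive through either channel, it is useful to check both when auditing a URL. The sketch below, in Python with only the standard library, reports where a noindex was found given a response's headers and body. The noindex_sources function is a hypothetical helper, and it makes simplifying assumptions: it ignores user-agent-scoped header values (such as "googlebot: noindex") and the "none" shorthand.

```python
from html.parser import HTMLParser


def _split_directives(value: str) -> set:
    """Split a comma-separated robots value into lowercase tokens."""
    return {token.strip().lower() for token in value.split(",") if token.strip()}


class _RobotsMetaParser(HTMLParser):
    """Gather directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            self.directives |= _split_directives(attr_map.get("content", ""))


def noindex_sources(headers: dict, body: str = "") -> list:
    """Return the places ("header", "meta") where noindex appears."""
    sources = []
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "x-robots-tag" and "noindex" in _split_directives(value):
            sources.append("header")
            break
    parser = _RobotsMetaParser()
    parser.feed(body)
    if "noindex" in parser.directives:
        sources.append("meta")
    return sources


if __name__ == "__main__":
    print(noindex_sources({"X-Robots-Tag": "noindex, follow"}))
    print(noindex_sources({}, '<meta name="robots" content="noindex, follow">'))
```

An empty list means neither channel asks for noindex. For a PDF or JSON response, only the header can ever match.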

How long each signal takes to apply

  • noindex removal from the index: 1-3 weeks after the next crawl. Speed it up by requesting indexing on the noindexed URL.
  • Disallow effect: applies once Google refetches robots.txt, which it generally caches for up to 24 hours. After that, blocked paths are skipped on the next crawl attempt.
  • nofollow effect: applies once the linking page is recrawled. Google treats nofollow as a hint, so the outcome is not guaranteed, and PageRank already passed does not get clawed back.

A common surprise: you remove noindex from a page hoping it indexes quickly. Google still has to recrawl to see the change. Request indexing on a representative URL to push it sooner; the rest follow within a couple of weeks.

Common mistakes

  • Adding Disallow and noindex together on the same URL. The Disallow blocks crawling, Google never sees the noindex, the URL stays indexed.
  • Treating nofollow as a “do not index this link target” signal. It controls link equity, not indexing.
  • Using Disallow: / on staging without removing it before launch. The launched site silently refuses crawling for weeks.
  • Leaving noindex in a shared layout template after a temporary block, then wondering why the entire site dropped from search.
  • Adding noindex on canonical alternates (paginated ?page=2, language alternates) and accidentally deindexing valid content.
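
The first mistake in this list is easy to detect mechanically once you know, for each URL, whether a Disallow rule matches it and whether it carries a noindex. Here is a minimal Python sketch under that assumption; the PageState record and find_hidden_noindex function are hypothetical names, and how you populate the two flags (a crawl export, your own checks) is left open.

```python
from dataclasses import dataclass


@dataclass
class PageState:
    url: str
    crawl_blocked: bool  # True if a Disallow rule matches the URL
    has_noindex: bool    # True if a meta tag or X-Robots-Tag carries noindex


def find_hidden_noindex(pages):
    """Return URLs whose noindex cannot be seen because crawling is blocked."""
    return [page.url for page in pages if page.crawl_blocked and page.has_noindex]


if __name__ == "__main__":
    pages = [
        PageState("https://example.com/cart", crawl_blocked=False, has_noindex=True),
        PageState("https://example.com/admin/", crawl_blocked=True, has_noindex=True),
        PageState("https://example.com/search?q=x", crawl_blocked=True, has_noindex=False),
    ]
    for url in find_hidden_noindex(pages):
        print("noindex hidden behind Disallow:", url)
```

Any URL this returns needs the Disallow lifted until Google has recrawled the page and dropped it.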

FAQ

  • If I Disallow a URL, will it still appear in search? Yes, sometimes. If other sites link to it, Google may list the URL without a description. Use noindex if your goal is “do not appear in search.”
  • Does noindex, nofollow make sense together? Rarely. noindex already removes the page from results; adding nofollow blocks internal link flow to your own content. Default to noindex, follow unless you explicitly want links sealed off.
  • What is rel="sponsored" versus nofollow? Both pass no link equity, but sponsored declares “this is a paid placement” specifically. Google prefers the precise attribute when accurate.
  • How long until a noindex page leaves the index? Usually 1-3 weeks after the next crawl. Speed it up with URL Inspection > Request indexing on the noindexed URL.
  • Should I noindex thin tag and category pages? Only if they truly add no value. A thin tag page with 3 articles is a candidate for merging, not noindexing. Noindex is the last resort.
