Many new site owners put User-agent: * and Disallow: / in robots.txt and also set noindex on their pages. It is the most common self-defeating combination.
One-line difference
- robots.txt controls whether the page can be crawled: the bot never opens it.
- <meta name="robots" content="noindex"> controls whether it can be indexed: the bot reads the page but doesn’t list it in results.
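To make the split concrete, here is a minimal sketch that asks the two questions separately using only Python's standard library. The helper names (can_crawl, has_noindex, RobotsMetaParser) and the example.com URLs are made up for illustration, and real crawlers apply more rules than this.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def can_crawl(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Crawl question: does robots.txt allow fetching this URL at all?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def has_noindex(html: str) -> bool:
    """Index question: does the fetched page ask to be left out of results?"""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical inputs, for illustration only.
robots_txt = "User-agent: *\nDisallow: /admin/\n"
page_html = '<html><head><meta name="robots" content="noindex, follow"></head></html>'

print(can_crawl(robots_txt, "https://example.com/admin/users"))  # False: /admin/ is disallowed
print(can_crawl(robots_txt, "https://example.com/blog/"))        # True: no rule matches
print(has_noindex(page_html))                                    # True: the meta tag says noindex
```

Note that has_noindex needs the page HTML as input. A bot only has that HTML if the crawl was allowed in the first place.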
Comparison
| Use case | robots.txt | noindex |
|---|---|---|
| Private / admin dirs you don’t want Google to see | ✅ Disallow | ❌ |
| Public pages you don’t want indexed yet | ❌ | ✅ |
| Duplicate filter / pagination pages | ❌ (still want crawl signals) | ✅ |
| Site-search result pages | ✅ | also fine as an alternative (not both) |
| Sensitive query-string URLs | ✅ | + nofollow |
The dangerous combo
If a page is both Disallowed in robots.txt and has noindex, Google never fetches the page, so it can’t see the noindex. Worse, the URL can still get indexed via external signals such as inbound links, with an empty title and snippet.
Right way: to noindex a page, leave it crawlable. Don’t Disallow it in robots.txt.
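A site owner usually knows which URLs are supposed to carry noindex, so the dangerous combo can be caught before a crawler ever runs into it. The sketch below is one way to do that check, assuming you can list those URLs yourself. The function name find_hidden_noindex and the sample URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def find_hidden_noindex(robots_txt, noindex_urls, user_agent="Googlebot"):
    """Return the URLs whose noindex can never be read because robots.txt
    blocks the crawler from fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in noindex_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt and a list of pages that are meant to be noindex.
robots_txt = "User-agent: *\nDisallow: /drafts/\n"
noindex_urls = [
    "https://example.com/drafts/launch",    # blocked: its noindex is invisible
    "https://example.com/preview/pricing",  # crawlable: its noindex can be read
]

for url in find_hidden_noindex(robots_txt, noindex_urls):
    print("Disallowed and noindex, remove one of them:", url)
```

Any URL the function returns is a candidate for dropping its Disallow rule so the noindex can take effect.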
When this applies
- “Live but not indexed” → noindex
- “Hidden from all users” → server-side auth, not robots.txt
- “Reduce crawl budget on junk URLs” → robots.txt
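For the crawl-budget case, a small sketch like the following can generate the Disallow rules and check them before deploying. The prefixes /search and /cart/ are placeholders, and build_robots_txt is a hypothetical helper. The sketch sticks to plain path prefixes and checks them with Python's standard-library parser, which may not match every search engine's own matching rules.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(junk_prefixes, user_agent="*"):
    """Build a robots.txt body that disallows each junk path prefix."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {prefix}" for prefix in junk_prefixes]
    return "\n".join(lines) + "\n"


# Placeholder prefixes for low-value URLs.
robots_txt = build_robots_txt(["/search", "/cart/"])
print(robots_txt)

# Sanity-check the generated rules before publishing them.
checker = RobotFileParser()
checker.parse(robots_txt.splitlines())
print(checker.can_fetch("*", "https://example.com/search?q=shoes"))  # False: junk URL is blocked
print(checker.can_fetch("*", "https://example.com/blog/post"))       # True: real content stays crawlable
```

Checking one junk URL and one real content URL guards against a prefix that is too broad and accidentally blocks pages you want crawled.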
Decision checklist
- If the error started right after a change, roll back or isolate that change before trying unrelated fixes.
- If the error happens only in production, compare environment variables, build output, cache, permissions, and platform settings.
- If the error happens only for one account or browser, test permissions, cookies, extensions, quota, and regional availability.
- If two fixes seem possible, choose the one that is easiest to verify and easiest to undo first.
When to stop debugging
Stop and escalate when you cannot reproduce the issue, when logs contradict the UI, when billing or account security is involved, or when every fix requires production access you do not control. At that point, package the exact error, timestamp, project ID, reproduction steps, screenshots, and recent changes before asking support or another engineer. Good escalation notes often solve the problem faster than another hour of guessing.
Diagnostic flow
- Reproduce the issue once and write down the exact path. If you cannot reproduce it, collect more evidence before changing settings.
- Check scope: one user or everyone, one browser or all browsers, local only or production only, new content only or old content too.
- Check the last change first. Most troubleshooting work is not about finding a mysterious root cause; it is about identifying which recent change created the mismatch.
- Split the system in two: input vs output, local vs hosted, account vs project, source file vs generated file, prompt vs model. Test which side still fails.
- Apply the smallest reversible fix. Avoid changes that touch DNS, permissions, billing, deployment, and code at the same time.
- Verify the original reproduction path and one nearby path, then write down what fixed it.
Minimal reproduction template
Issue:
- [exact error or broken behavior]
Where it happens:
- URL / tool / project:
- Account:
- Environment: local / preview / production
- Browser / device:
Steps to reproduce:
1.
2.
3.
Expected:
-
Actual:
-
Recent changes:
- Code:
- Config:
- DNS / permissions / billing:
- Prompt / model / uploaded files:
Evidence:
- Screenshot:
- Console error:
- Server or platform log:
False fixes to avoid
- Clearing cache without checking whether the underlying file, permission, route, or setting is correct.
- Reinstalling packages when the error is actually caused by environment variables, credentials, quota, or platform config.
- Changing several unrelated settings at once, then not knowing which one mattered.
- Copying a fix from another framework or platform without checking whether the routing, build output, or auth model is the same.
- Treating a temporary platform outage as your own bug before checking status pages and recent reports.