AI Codebase Tour: Onboard to a New Repo in One Day

A five-question workflow for Claude Code, Cursor, or ChatGPT that maps an unfamiliar repo in an afternoon — with verification steps so you don't ship a tour built on hallucinations.

Published: May 17, 2026 Updated: Jun 04, 2026 Author: AI Productivity Guide Team

Joining a new codebase used to mean a week of clicking through files, two weeks of pairing, and a month before you could land a non-trivial PR. With an agent that can read the whole repo, the same orientation collapses into roughly one focused day — if you ask the right questions in the right order, verify every answer, and write down what you learn. This is the day-one ritual for any developer joining a new project, or returning to an old one after months away.

TL;DR

Don’t ask the AI to “explain the codebase.” Run a fixed five-question sequence: important files, request lifecycle, testing, unwritten conventions, and “where are the dragons.”
Use a tool with real repo access, not paste-in chat. As of June 2026 that means Claude Code (1M-token context on Opus 4.7, parallel Explore subagents), Cursor (Merkle-tree index synced every ~5 min, validated past 550,000 files), or a GitHub-connected ChatGPT.
Verify every file:line cite by opening it. Wrong paths and invented functions are the most common failure; they vanish when you demand “quote the exact code.”
Capture answers in a CODEBASE-TOUR.md in your own words. Budget 2–3 focused hours. Ship a tiny PR the same day to validate your dev environment.

What this workflow solves

Most new-repo confusion is not a code-reading problem — it is not knowing which 50 files (out of 5,000) actually matter. An agent with repo access answers that triage question in minutes instead of hours. This guide gives you the exact five-question sequence, a format for capturing answers, and the verification steps that keep you from shipping a tour that reads well but is half-invented.

It is built for developers joining a new project, ramping back after a leave, or auditing a codebase before signing on as a contractor. It also helps staff and senior engineers told to “go take a look at the X service” with no prior context. Skip it for tiny repos (under ~5,000 lines) — you can read those in an afternoon — and for proprietary code you genuinely cannot share with any AI. For the latter, run a local model or a paired human walkthrough instead.

Pick your tool first

Switching agents mid-tour costs context, so commit to one. Here is how the three realistic options compare for repo orientation as of June 2026.

Tool	How it reads the repo	Context window	Best for	Watch out for
Claude Code	Reads files live in the repo root; `/init` scaffolds a `CLAUDE.md`; spawns parallel Explore subagents	1M tokens on Opus 4.7 (~25k lines in one conversation)	Agentic multi-step questions where the agent does its own reads and runs commands	Subagent summaries can flatten nuance; still verify cites
Cursor	Merkle-tree index of embeddings (stored in Turbopuffer), auto-synced every ~5 min; only changed files re-embed	Runs Sonnet 4.6, Opus 4.7, GPT-5.5, Gemini 3.1 Pro, Composer 2.5	IDE-native exploration where you jump to definition mid-conversation	Retrieval (RAG) can miss cross-file relationships; warm the index before you start
ChatGPT (GitHub connector)	Pulls files from a connected repo/branch on demand	GPT-5.5; Plus in-app ~320 pages, full 1M only on $200 Pro	Quick reads when you live in the browser, not an IDE	Connector reads are scoped — large repos get partial views

Claude Code is bundled into Claude Pro ($20/mo), and Cursor’s Pro tier is $20/mo (about $16/mo when billed annually). If you already pay for one, start there. Google’s Gemini Code Assist and Gemini CLI are being folded into the new Google Antigravity platform: Google announced that those extensions stop serving Google AI Pro, Ultra, and the free individual tier on June 18, 2026, so Gemini is no longer the obvious fourth option for a casual tour.

Before you start

Have the repo checked out locally and the dev environment runnable. You will verify claims by running things, so a broken npm install blocks the whole tour.
Run git pull on main (or the active branch). A stale checkout means you are touring last quarter’s code.
Create an empty CODEBASE-TOUR.md in a scratch directory. This is your output artifact.
Block 2–3 uninterrupted hours. The tour ends when the doc is done, not when the timer runs out.
If you are using Claude Code, run /init first. It scans the repo and writes a starter CLAUDE.md that detects the framework and conventions. Treat it as a scaffold to edit, not a finished map.

The five-question tour

Run these in order, in one session, with the agent connected to the repo. Each question is scoped to force a citation, because a citation is something you can check.

Connect the AI to the codebase. Launch Claude Code in the repo root, open Cursor with its index fully built, or point a GitHub connector at the branch. Plain chat without access is slower and harder to verify.
Question 1 — Important files. “What are the 5 most important files in this codebase? For each, one sentence on why. Cite the file path.”
Question 2 — Request lifecycle. “Walk me through the request lifecycle, from URL or entry point to response. Cite file:line for each hop.” For non-web codebases, swap “request lifecycle” for “primary data flow” or “the path from CLI invocation to output.”
Question 3 — Testing. “How is testing organized? What is the test command for one file vs. all files? Which directories or modules have weak coverage based on test-file ratio?”
Question 4 — Unwritten conventions. “What conventions are unwritten? Look for patterns repeated three or more times that are not in any README or CONTRIBUTING. Examples: naming, error handling, logging, transaction scope, file layout.”
Question 5 — Where are the dragons. “Which files look risky to change? Look for old TODOs, comments with HACK, FIXME, or DO NOT TOUCH, deeply nested conditionals, and modules with high git churn.”
Verify each answer. Open every cited file at the cited line. If the AI claims auth.ts:42 is the entry point but auth.ts is 200 lines with no auth logic, it guessed — re-run with a tighter prompt.
Write up your tour. Put the answers into CODEBASE-TOUR.md in your own words with the verified file:line cites. This becomes your first PR and your proof that you understood.
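
The marker part of question 5 can be cross-checked without the agent. Below is a minimal sketch, assuming Python 3.9+ and source files ending in `.py`, `.ts`, or `.js`. The function name `scan_markers`, the suffix list, and the skipped directories are illustrative choices, not part of any tool named above.

```python
import re
from pathlib import Path

# Markers named in question 5; matching is case-sensitive on purpose.
MARKER = re.compile(r"\b(TODO|HACK|FIXME|DO NOT TOUCH)\b")
# Assumption: directories that should never count as "your" code.
SKIP_DIRS = {".git", "node_modules"}


def scan_markers(root: Path, suffixes: tuple = (".py", ".ts", ".js")) -> dict:
    """Map each relative file path to the (line number, marker) pairs found in it."""
    hits = {}
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if any(part in SKIP_DIRS for part in rel.parts):
            continue
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            match = MARKER.search(line)
            if match:
                # Only the first marker on a line is recorded.
                hits.setdefault(rel.as_posix(), []).append((number, match.group(1)))
    return hits
```

Compare the result with the files the agent flagged. The sketch measures neither nesting depth nor git churn, so a file with no markers can still be risky.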

Verify before you trust

Agents hallucinate line numbers more often than they admit, so verification is the part that separates a real tour from a confident fiction.

Check every cited path. Does the file actually contain what the AI said? Wrong paths and stale line numbers are the single most common failure.
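Run the test command. Does it actually run and collect tests? If the command does not exist or matches no tests, the AI guessed; find the real command in package.json scripts, the Makefile, or the CI config. A command that runs but fails points to a broken environment or a red suite, which belongs in the tour as a risk.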
Spot-check the dragons. Ask the most senior person on the team to review just that section. Outsiders catch unwritten conventions long-timers stopped seeing, but only insiders know which fragile modules genuinely bite.
Kill phantom functions. If the AI describes a function that “handles” something you cannot find, re-prompt with “quote the exact code.” Hallucinated functions disappear under that constraint.
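
The path and quote checks above can be partly automated before you click through by hand. This is a rough sketch, assuming cites are written as `relative/path.ext:line` and resolved against the repo root. The regex, the status strings, and the helper names `check_cites` and `quote_found` are hypothetical. Passing these checks only shows that the file and line exist, not that the agent described them correctly.

```python
import re
from pathlib import Path

# Assumption: cites look like src/auth.ts:42 (relative path, dot extension, line number).
CITE = re.compile(r"(?P<path>[\w./-]+\.\w+):(?P<line>\d+)")


def check_cites(tour_text: str, root: Path) -> list:
    """Return (path, line, status) for every file:line cite found in the tour text."""
    results = []
    for match in CITE.finditer(tour_text):
        rel, line_no = match["path"], int(match["line"])
        target = root / rel
        if not target.is_file():
            status = "missing file"
        else:
            total = len(target.read_text(encoding="utf-8", errors="replace").splitlines())
            status = "ok" if 1 <= line_no <= total else "line out of range"
        results.append((rel, line_no, status))
    return results


def quote_found(root: Path, rel: str, quote: str) -> bool:
    """True if the quoted code appears in the file, ignoring whitespace differences."""
    target = root / rel
    if not quote.strip() or not target.is_file():
        return False
    squash = lambda s: " ".join(s.split())
    return squash(quote) in squash(target.read_text(encoding="utf-8", errors="replace"))
```

Run `check_cites` over the draft CODEBASE-TOUR.md and re-prompt for anything not marked `ok`. Use `quote_found` on the snippet the agent returns when asked to “quote the exact code.”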

Your first run

Don’t tour the whole repo on the first pass — scope it.

Pick a single subsystem (auth, billing, search, or one job worker).
Run the five questions against that subsystem only.
Verify every cite by clicking through. Note which ones the AI got right, wrong, or hallucinated.
Change one variable for a second pass — usually the agent (Claude Code vs. Cursor) — and watch the failure modes change. They will.
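
When scoring the agent's answer to question 3 for the chosen subsystem, an independent count of test files gives you something to compare against. The sketch below assumes tests sit next to the code they cover and follow common naming patterns (`*.test.*`, `*.spec.*`, `test_*`, `*_test`). The helper names and the suffix set are illustrative, and a repo that keeps all tests in a separate top-level directory will need a different grouping.

```python
from collections import Counter
from pathlib import Path

SOURCE_SUFFIXES = {".py", ".ts", ".js"}  # assumption: adjust to your stack
SKIP_DIRS = {".git", "node_modules"}


def looks_like_test(path: Path) -> bool:
    """Heuristic naming check; real projects may use other conventions."""
    name = path.name
    return (
        ".test." in name
        or ".spec." in name
        or name.startswith("test_")
        or path.stem.endswith("_test")
    )


def ratio_by_dir(root: Path) -> dict:
    """For each first-level directory, return test files divided by non-test source files."""
    tests, sources = Counter(), Counter()
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix not in SOURCE_SUFFIXES:
            continue
        parts = path.relative_to(root).parts
        if len(parts) < 2 or parts[0] in SKIP_DIRS:
            continue  # ignore files sitting directly in root and vendored code
        if looks_like_test(path):
            tests[parts[0]] += 1
        else:
            sources[parts[0]] += 1
    # Directories with no non-test source files are omitted to avoid dividing by zero.
    return {name: tests[name] / count for name, count in sorted(sources.items())}
```

A low ratio is only a prompt to look closer. It says nothing about what the existing tests actually assert.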

Make it a reusable habit

Save your five prompts as a codebase-tour.prompts.md you carry between jobs.
After three tours you will know which prompts give weak answers in your stack and rewrite them — “request lifecycle” is wrong for a CLI tool; use “command dispatch path” instead.
Re-run a mini-tour (three questions) after any major refactor in a repo you already know. Codebases drift, and the README’s architecture diagram is almost certainly out of date.
Before the day ends, pick the smallest viable contribution (a typo fix, a doc update, a missing test) and ship it. It validates your dev environment and leaves a paper trail of competence.

Common mistakes

Asking “explain everything.” You get a vague summary that reads well and helps with nothing. Scope every question and demand a cite.
Trusting file:line claims without clicking through. Wrong paths and stale line numbers are the most common failure, so open every cite.
Skipping the dragons question because it feels rude on day one. Those are the bugs you trip on in month two.
Treating the tour as a finished product. Real understanding only comes from changing code; the tour is the map, not the territory.
Letting the agent invent module names. If a path does not exist, say so and re-prompt — do not paper over it.

FAQ

Cursor or Claude Code? Cursor for IDE-native exploration where you jump to definition mid-conversation; its index syncs every ~5 minutes and is validated past 550,000 files. Claude Code for agentic multi-step questions where the agent reads files and runs commands itself, with 1M-token context on Opus 4.7. Both work — start with the one you already pay for.
What about NotebookLM? Good when the codebase has substantial written design docs you can upload. For raw source, an agent with direct repo access wins because it can follow imports instead of guessing.
How big a codebase before this stops working? Above roughly 500k lines or 10k files the context window bites even on 1M-token models. Scope the tour to one subsystem at a time and lean on Cursor’s index or Claude Code’s Explore subagents for retrieval.
Should I share the tour doc with the team? Yes, after a teammate spot-checks the dragons section. Onboarding docs from newcomers often catch unwritten conventions that long-timers no longer notice.
What if the codebase has no tests at all? That is itself the answer to question 3. Note it as a risk in the tour, and ask yourself whether you want to land here.
Can I do this for a private repo without uploading code? Partly. Claude Code runs locally against your checkout with no separate upload step, but the files it reads are still sent to the model for inference; Cursor encrypts file paths and never stores source in plaintext. Confirm your provider’s data-handling policy before pointing any tool at proprietary code.
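
To judge whether a subsystem is small enough to scope a tour to, a back-of-envelope token estimate is usually enough. The sketch below assumes roughly four characters per token, which is a common rule of thumb rather than a real tokenizer, and reserves half the window for the conversation itself. Both numbers, the suffix list, and the function names are assumptions to tune.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # rough rule of thumb, not a tokenizer; treat as an assumption


def estimate_tokens(root: Path, suffixes: tuple = (".py", ".ts", ".js")) -> int:
    """Approximate the token count of source files under root from character counts."""
    total_chars = 0
    for path in root.rglob("*"):
        rel_parts = path.relative_to(root).parts
        if "node_modules" in rel_parts or ".git" in rel_parts:
            continue
        if path.is_file() and path.suffix in suffixes:
            total_chars += len(path.read_text(encoding="utf-8", errors="replace"))
    return total_chars // CHARS_PER_TOKEN


def fits(root: Path, window_tokens: int = 1_000_000, reserve: float = 0.5) -> bool:
    """True if the estimate fits in the share of the window not reserved for conversation."""
    return estimate_tokens(root) <= window_tokens * (1 - reserve)
```

If `fits` returns `False` for the whole repo, point it at one subsystem directory instead and tour that.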

For Cursor’s own account of how the index works, see Cursor’s codebase-indexing docs. For Claude Code, see Anthropic’s best-practices guide.

Tags: #AI coding #Tutorial #Workflow
