Draft a Performance Review With AI in 20 Minutes

Turn a folder of half-remembered wins and one nagging gap into a calibrated, balanced performance review draft — for yourself or a report — without it reading like ChatGPT, and without softening the feedback that needs to land.

Published: May 17, 2026 Updated: Jun 06, 2026 Author: AI Productivity Guide Team

TL;DR

Self-reviews and manager reviews are the kind of writing AI is genuinely good at: structure, balance, and turning a trait into an evidence sentence. People who use it well cut the drafting time from a few hours to roughly 20-30 minutes. The catch is that AI defaults to bland symmetry and over-softens hard feedback, so you have to feed it specifics and fight it on tone. This guide gives you the exact prompt, what to feed it, how to refine, and one privacy rule that matters more for reviews than for almost anything else you paste into a chatbot.

The task

Performance review forms are due in a week. You have a Notion page of half-remembered wins, two specific incidents you have been dreading writing about, and a vague sense that last cycle’s review hedged too much. You need a draft that covers strengths, gaps, concrete examples, and a development plan — without sounding like ChatGPT wrote it, and without softening the one piece of feedback that actually needs to land.

Where AI helps — and where it does not

AI handles structure and balanced framing. It stops the draft from drifting all-positive (which calibration will reject) or all-negative (which the person will dismiss), and it is good at converting “they communicate well” into the example sentence that survives a review committee.

What AI cannot do: pick the one gap that matters most for promotion, weight examples against your company’s competency rubric, or write the sentence the person actually needs to hear in person. Use AI for the scaffolding; you pick the message and the tone.

A common failure mode: the model defaults to a 3-strength / 3-gap symmetry even when the real story is 5 strengths and 1 real gap. State the actual counts in the prompt, or it will pad the weak side to match the strong side.

The one rule: lead with evidence, not adjectives

The single highest-leverage edit is structural, and it has a name. The Center for Creative Leadership’s SBI model — Situation, Behavior, Impact — exists because feedback grounded in a specific event is harder to argue with and easier to act on than feedback built from traits. Many teams extend it to SBI-N, adding Next steps.

Translate that into the rule you give the model: every strength leads with the artifact (a launch, a doc, a decision), every gap leads with the situation, and every gap ends with a next step. Calibration committees discount trait-only lines (“communicates well”, “strong leader”) heavily, so a review built on adjectives quietly weakens the person’s case even when you meant it as praise.

What to feed the AI

Person’s role, level, and tenure in months
The 3-4 strengths you actually want recognized, each with one concrete artifact
The 1-2 growth areas that matter, each with a specific incident and the next-step idea
The competency rubric or leveling guide, pasted as raw text and trimmed to the 2-3 dimensions that matter (long-context models like Claude Sonnet 4.6 or Gemini 3.1 Pro can hold a full rubric without truncating, but they will not pick the right dimensions for you)
Last cycle’s rating plus the one or two pieces of feedback from then — continuity matters
The audience: self, manager’s manager (calibration), or skip-level
Tone constraints: your company’s review style (some lean academic, some lean direct)
The single sentence you most want the reader to remember after closing the doc

A privacy rule that is specific to reviews

A performance review is one of the most sensitive documents you will ever paste into an AI tool: it names a real person and makes judgments about them. Treat the choice of tool as part of the task, not an afterthought.

On ChatGPT Free ($0) and Plus ($20/mo), conversations may be used to train models unless you turn it off in Settings → Data Controls → “Improve the model for everyone.” Do that before you paste anything. (As of June 2026, the US free tier also shows ads.)
ChatGPT Team, Enterprise, Claude Team, and the Gemini Business/Workspace tiers exclude your inputs from training by default and give admins retention controls — the right call for real names.
Safest of all: anonymize. Replace the person’s name with [name] and the team with [team] in your notes, draft against the placeholder, then find-and-replace the real name back in your own editor. Many companies require exactly this; check your AI-use policy first.

This is not legal boilerplate — for HR-adjacent writing it changes which tool you should open.
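
If you would rather script the anonymize step than trust a manual find-and-replace, the sketch below shows one way to do it. It is a minimal illustration, not a vetted redaction tool: the function names `anonymize` and `restore`, the sample notes, and the team name are all hypothetical, and it only catches the exact spellings you list (nicknames, initials, and email addresses need their own entries).

```python
import re


def anonymize(text: str, replacements: dict[str, str]) -> str:
    """Swap each real name for its placeholder, longest names first."""
    for real in sorted(replacements, key=len, reverse=True):
        placeholder = replacements[real]
        # Whole-word match so "Sam" does not rewrite "Samantha".
        pattern = r"\b" + re.escape(real) + r"\b"
        # A function replacement inserts the placeholder literally.
        text = re.sub(pattern, lambda m, p=placeholder: p, text)
    return text


def restore(text: str, replacements: dict[str, str]) -> str:
    """Put the real names back into the finished draft."""
    for real, placeholder in replacements.items():
        text = text.replace(placeholder, real)
    return text


# Hypothetical notes and mapping, for illustration only.
mapping = {"Sarah": "[name]", "Payments Platform": "[team]"}
notes = "Sarah led the Q3 launch for Payments Platform. Sarah's comms doc was reused."

safe_notes = anonymize(notes, mapping)
print(safe_notes)
# [name] led the Q3 launch for [team]. [name]'s comms doc was reused.
print(restore(safe_notes, mapping) == notes)
# True
```

Give each person and team a distinct placeholder, or the restore step cannot tell them apart. Read the anonymized notes once before pasting: project code names and customer names can identify someone just as easily as their own name.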

Copy-ready prompt

Placeholders in braces are slots for you to fill; replace each {...} before sending.

You are drafting a 500-word performance review.
Role: {role}, Level: {level}, Tenure: {months}.
Strengths: {3-4 with concrete examples}.
Growth areas: {1-2 with incidents and next-step ideas}.
Rubric excerpt: {paste competency dimensions}.
Last cycle: {rating + headline feedback}.
Audience: {self / manager review / skip-level read}.
Structure:
1) Headline rating in one sentence that the reader can quote back.
2) Strengths section - lead each with the artifact, not the trait.
3) Growth area - name the gap specifically, but lead with what they already do well in the same area.
4) Development plan - 2 concrete next-quarter moves, owner, and how we'll know it worked.
5) Closing line - the one sentence the reader should carry.
Tone: direct, kind, no corporate mush. Replace every adjective with an example or delete it.
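
If you reuse the prompt every cycle, a small script can fill the slots and refuse to continue while any {...} is still empty. The sketch below is one possible approach, assuming you keep the prompt as a plain string; `fill_prompt`, the shortened template, and the sample values are hypothetical. Slot names are matched on the exact text between the braces.

```python
import re

# Matches one {slot}; the slot text is whatever sits between the braces.
SLOT = re.compile(r"\{([^{}]+)\}")


def fill_prompt(template: str, values: dict[str, str]) -> str:
    """Fill every {slot} in the template, failing loudly if one is missing."""
    missing = [slot for slot in SLOT.findall(template) if slot not in values]
    if missing:
        raise KeyError(f"unfilled slots: {missing}")
    return SLOT.sub(lambda m: values[m.group(1)], template)


# A shortened excerpt of the prompt above, with made-up values.
template = (
    "You are drafting a 500-word performance review.\n"
    "Role: {role}, Level: {level}, Tenure: {months}.\n"
    "Audience: {self / manager review / skip-level read}."
)
values = {
    "role": "Product Manager",
    "level": "L5",
    "months": "18 months",
    "self / manager review / skip-level read": "manager review",
}

print(fill_prompt(template, values))
```

The early failure is the point: a prompt sent with a literal {level} still in it tends to produce a review calibrated to nothing.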

Shorter variant — calibration-ready paragraph only

Write the 80-word calibration paragraph for {name} only.
Level: {level}. Strongest evidence: {one artifact}. Weakest evidence: {one incident}. Last rating: {x}.
Format: 1 sentence rating + 2 sentences evidence + 1 sentence forward-looking. No filler.

What good output looks like

A gap paragraph that does the work: “Sarah owns the customer-facing launch comms well, and the natural next layer is internal stakeholder alignment. In Q3 the security review pushed back on her launch because she had not pre-briefed them; the fix is mechanical — a 1:1 with security two weeks pre-launch — not a question of capability. She has already adopted this pattern for Q4 launches.”

A headline rating that survives calibration: “Solid Meets Expectations with one clear stretch behavior. Sarah is operating one level above her current scope on launch execution, and is one stakeholder-management cycle away from the case being unambiguous at the next promo committee.”

Both lead with a situation and a behavior, both close on next steps, and neither relies on an undefended adjective.

How to refine

Replace adjectives with artifacts: “Every ‘strong’ or ‘great’ must be followed by what they shipped or decided. Otherwise delete the sentence.”
Lead with continuity: “Reference last cycle’s feedback in the first 2 sentences. Show what changed, then what’s new.”
Name the gap; soften the framing, not the content: “The gap stays specific; only the verbs soften. Don’t write ‘sometimes’ if the truth is ‘twice this cycle.’”
Pre-empt the calibration question: “Add 1 sentence answering ‘why is this not a level up’ or ‘why is this not a level down’ — calibrators will ask.”
Trim: “Cut 100 words. Anything that survives the cut is signal.”
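
Before the final read-through, you can also run a rough mechanical check on the draft. The sketch below is a simple keyword lint, not the refinement prompts above: it flags sentences containing trait words or hedges taken from this guide's examples and warns when the draft runs past the word target. The name `lint_draft` and the phrase list are assumptions; it cannot tell whether evidence follows a flagged word, so treat each hit as a prompt to look, not a verdict.

```python
import re

# Trait words and hedges drawn from the examples in this guide; extend as needed.
FLAGGED = ("strong", "great", "sometimes", "communicates well")


def lint_draft(draft: str, word_limit: int = 500) -> list[str]:
    """Return warnings for flagged phrases and for exceeding the word target."""
    warnings = []
    # Naive sentence split on ., !, or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", draft.strip())
    for sentence in sentences:
        lowered = sentence.lower()
        for phrase in FLAGGED:
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
                warnings.append(f"check '{phrase}': {sentence}")
    count = len(draft.split())
    if count > word_limit:
        warnings.append(f"{count} words, over the {word_limit}-word target")
    return warnings


# Hypothetical two-sentence draft.
draft = "Sarah is a strong leader. She shipped the Q3 launch comms two days early."
for warning in lint_draft(draft):
    print(warning)
# check 'strong': Sarah is a strong leader.
```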

Common mistakes

Listing traits with no examples (“communicates well”) — calibration committees discount these heavily and your case quietly weakens
Using AI to soften feedback into uselessness — the person reads it and changes nothing
Forgetting the development plan — gaps without next steps read like verdicts
Pasting the entire rubric and hoping the model picks the right dimensions — pick the 2-3 dimensions yourself, paste only those
Writing the same length per gap as per strength — gaps deserve specificity, not equal real estate
Dropping the “what changed since last cycle” thread — managers reading a sequence of reviews want to see an arc, not a stack of snapshots
Letting AI write the closing line — that is the one sentence you should write yourself

FAQ

Can I just paste raw notes?
Yes, the model handles the polish. But add 2-3 specific incidents per growth area; otherwise the draft will hedge.

Manager reviews vs IC self-reviews?
Same structure, different voice. For self-reviews, lean into ownership of growth areas; for manager reviews, lean into observation language and avoid second-guessing the IC’s intent.

How do I handle a gap I haven’t actually told the person about?
Don’t surface it in writing first. Have the conversation, then write. Writing-first feedback erodes trust; the review process exists to confirm prior conversations, not stage them.

The model keeps softening too much. What changes?
Add to the prompt: “If a gap could be deleted without changing the review’s meaning, it should be louder, not removed.” Then re-run.

Which model should I use?
For a single review, any of GPT-5.5, Claude Sonnet 4.6, or Gemini 3.1 Pro is fine. If you are pasting a long leveling guide plus a year of notes, the 1M-token context on Sonnet 4.6 and Gemini 3.1 Pro means nothing gets truncated. For real names, use a Team/Enterprise tier or anonymize first.

What about promotion packets?
Different doc, different prompt. A review answers “did they meet level”; a promo packet argues “are they operating at the next level.” See the linked promotion self-review article.

Tags: #AI writing #Career #Workflow #Performance
