The task
Performance review forms are due in a week. You have a Notion page of half-remembered wins, two specific incidents you have been dreading writing about, and a vague sense that last cycle’s review hedged too much. You need a draft that covers strengths, gaps, concrete examples, and a development plan — without sounding like ChatGPT wrote it, and without softening the one piece of feedback that actually needs to land.
Where AI helps — and where it does not
AI handles structure and balanced framing: it stops the draft from drifting all-positive (which calibration will reject) or all-negative (which the person will dismiss). It is also good at turning a bare trait like “they communicate well” into an example-backed sentence that survives a review committee. What AI cannot do: pick the one gap that matters most for promotion, decide how much weight each example deserves against your company’s competency rubric, or write the sentence the person actually needs to hear in person. Use AI for the scaffolding; you pick the message and the tone.
A common failure mode: AI defaults to a 3-strength / 3-gap symmetry even when the real story is 5 strengths and 1 real gap. State the actual split in the prompt.
What to feed the AI
- Person’s role, level, and tenure in months
- The 3-4 strengths you actually want recognized, each with one concrete artifact (a launch, a doc, a decision)
- The 1-2 growth areas that matter, each with a specific incident and the next-step idea
- The company’s competency rubric or leveling guide, even if pasted as raw text
- Last cycle’s rating and the one or two pieces of feedback that came with it (continuity matters)
- The audience: self, manager’s manager (calibration), or skip-level
- Tone constraints: your company’s review style (some lean academic, some lean direct)
- The single sentence you most want the reader to remember after closing the doc
Copy-ready prompt
You are drafting a 500-word performance review.
Role: {role}, Level: {level}, Tenure: {months} months.
Strengths: {3-4 with concrete examples}.
Growth areas: {1-2 with incidents and ideas}.
Rubric excerpt: {paste competency dimensions}.
Last cycle: {rating + headline feedback}.
Audience: {self / manager review / skip-level read}.
Structure:
1) Headline rating in one sentence that the reader can quote back.
2) Strengths section — lead each with the artifact, not the trait.
3) Growth area — name the gap specifically, but lead with what they already do well in the same area.
4) Development plan — 2 concrete next-quarter moves, owner, and how we'll know it worked.
5) Closing line — the one sentence the reader should carry.
Tone: direct, kind, no corporate mush. Replace every adjective with an example or delete it.
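If you reuse this prompt every cycle, a small script can fill the header and refuse to run when an input is missing. The following is a minimal sketch, not a prescribed tool: the `ReviewInputs` class, its field names, and `build_prompt` are hypothetical, and the extra "count" line is an addition based on the advice above to state the real strengths/gaps split explicitly. The Structure and Tone lines are passed in unchanged as `structure_and_tone`.

```python
from dataclasses import dataclass, fields


@dataclass
class ReviewInputs:
    """Hypothetical container for the inputs listed earlier; names are illustrative."""
    role: str
    level: str
    months: int
    strengths: list[str]      # each entry: the artifact and what it shows
    growth_areas: list[str]   # each entry: the incident and the next-step idea
    rubric_excerpt: str
    last_cycle: str
    audience: str


TEMPLATE = (
    "You are drafting a 500-word performance review.\n"
    "Role: {role}, Level: {level}, Tenure: {months} months.\n"
    "Strengths: {strengths}.\n"
    "Growth areas: {growth_areas}.\n"
    "Rubric excerpt: {rubric_excerpt}.\n"
    "Last cycle: {last_cycle}.\n"
    "Audience: {audience}.\n"
    # Added line (not part of the original prompt): states the real split so
    # the model does not default to a symmetric strengths/gaps layout.
    "Strength count: {n_strengths}. Growth-area count: {n_gaps}. "
    "Do not add or drop items to make the sections symmetric.\n"
)


def build_prompt(inputs: ReviewInputs, structure_and_tone: str) -> str:
    """Fill the header, then append the Structure/Tone lines unchanged."""
    # Empty strings, empty lists, and a tenure of 0 all count as missing.
    missing = [f.name for f in fields(inputs) if not getattr(inputs, f.name)]
    if missing:
        raise ValueError("Missing inputs: " + ", ".join(missing))
    header = TEMPLATE.format(
        role=inputs.role,
        level=inputs.level,
        months=inputs.months,
        strengths="; ".join(inputs.strengths),
        growth_areas="; ".join(inputs.growth_areas),
        rubric_excerpt=inputs.rubric_excerpt,
        last_cycle=inputs.last_cycle,
        audience=inputs.audience,
        n_strengths=len(inputs.strengths),
        n_gaps=len(inputs.growth_areas),
    )
    return header + structure_and_tone.strip() + "\n"
```

Failing loudly on a missing field is deliberate: a prompt sent without the rubric excerpt or last cycle's feedback tends to produce exactly the generic draft this workflow is trying to avoid.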
Shorter variant — calibration-ready paragraph only
Write the 80-word calibration paragraph for {name} only.
Level: {level}. Strongest evidence: {one artifact}. Weakest evidence: {one incident}. Last rating: {x}.
Format: 1 sentence rating + 2 sentences evidence + 1 sentence forward-looking. No filler.
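Models often overshoot word limits, so it can help to check the returned paragraph against the format before pasting it into a calibration doc. This is a rough sketch under stated assumptions: `check_calibration_paragraph` is a hypothetical helper, the 80-word limit and four-sentence count come from the prompt above, and the sentence splitter is deliberately naive (abbreviations such as "e.g." will be miscounted).

```python
import re


def check_calibration_paragraph(
    text: str, max_words: int = 80, expected_sentences: int = 4
) -> list[str]:
    """Return a list of format problems; an empty list means the paragraph fits."""
    problems = []
    words = text.split()
    if len(words) > max_words:
        problems.append(f"{len(words)} words, limit is {max_words}")
    # Naive split: a sentence ends at ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if len(sentences) != expected_sentences:
        problems.append(
            f"{len(sentences)} sentences, expected {expected_sentences}"
        )
    return problems
```

If the list comes back non-empty, paste the problems into a follow-up message and ask for a revision rather than trimming by hand; the model usually drops filler first.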
Sample output
A useful gap paragraph: “Sarah owns the customer-facing launch comms well — and the natural next layer is internal stakeholder alignment. In Q3 the security review pushed back on her launch because she had not pre-briefed them; the fix is mechanical (a 1:1 with security two weeks pre-launch) and not a question of capability. She has already adopted this pattern for Q4 launches.”
A useful headline rating: “Solid Meets Expectations with one clear stretch behavior: Sarah is operating one level above her current scope on launch execution, and is one stakeholder-management cycle away from the case being unambiguous at the next promo committee.”
How to refine
- Replace adjectives with artifacts: “Every ‘strong’ or ‘great’ must be followed by what they shipped or decided. Otherwise delete the sentence.”
- Lead with continuity: “Reference last cycle’s feedback in the first 2 sentences. Show what changed, then what’s new.”
- Name the gap, soften the framing, not the content: “The gap stays specific; only the verbs soften. Don’t write ‘sometimes’ if the truth is ‘twice this cycle.’”
- Pre-empt the calibration question: “Add 1 sentence answering ‘why is this not a level up’ or ‘why is this not a level down’ — calibrators will ask.”
- Trim: “Cut 100 words. Anything that survives the cut is signal.”
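The first and third refinements can be partly mechanized before you re-prompt. The sketch below flags sentences that contain a bare adjective or a hedge word so you can check each one by hand. The word lists contain only the examples named above and are meant to be extended; `flag_sentences` is a hypothetical helper, and it cannot tell whether an artifact actually follows the adjective, so treat its output as a reading list, not a verdict.

```python
import re

# Starting lists taken from the refinement rules above; extend for your own drafts.
VAGUE_ADJECTIVES = {"strong", "great"}
HEDGES = {"sometimes"}


def flag_sentences(draft: str) -> list[tuple[str, str]]:
    """Return (word, sentence) pairs for every sentence that needs a second look."""
    flagged = []
    # Naive sentence split on ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", draft.strip()):
        tokens = set(re.findall(r"[a-z']+", sentence.lower()))
        for word in sorted(tokens & (VAGUE_ADJECTIVES | HEDGES)):
            flagged.append((word, sentence))
    return flagged
```

For each flagged sentence, either add what the person shipped or decided, replace the hedge with the actual count, or delete the sentence.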
Common mistakes
- Listing traits with no examples (“communicates well”) — calibration committees discount these heavily and your case quietly weakens
- Using AI to soften feedback into uselessness — the person reads it and changes nothing
- Forgetting the development plan — gaps without next steps feel like verdicts
- Pasting the entire rubric and hoping the model picks the right dimensions — pick the 2-3 dimensions yourself, paste only those
- Writing the same length per gap as per strength — gaps deserve specificity, not equal real estate
- Dropping the “what changed since last cycle” thread — managers reading a sequence of reviews want to see arc, not snapshots
- Letting AI write the closing line — that’s the one sentence you should write yourself
FAQ
- Can I just paste raw notes?: Yes, the model handles the polish. But attach at least one specific incident to each growth area; otherwise the draft will hedge.
- Manager reviews vs IC self-reviews?: Same structure, different voice. For self-reviews, lean into ownership of growth areas; for manager reviews, lean into observation language and avoid second-guessing the IC’s intent.
- How do I handle a gap I haven’t actually told the person about?: Don’t surface it in writing first. Have the conversation, then write. Writing-first feedback erodes trust; the review process exists to confirm prior conversations, not to stage them.
- The model keeps softening too much — what changes?: Add to the prompt: “If a gap could be deleted without changing the review’s meaning, it should be louder, not removed.” Then re-run.
- What about promotion packets?: Different doc, different prompt. A review answers “did they meet level”; a promo packet argues “are they operating at the next level.” See the linked promotion self-review article.