ChatGPT for Translation — Better Than Just Pasting and Hoping

Glossary, tone preservation, and back-translation checks — the difference between machine-translated and publishable.

What this covers

Pasting 3000 words into ChatGPT and saying “translate this to Japanese” produces something fluent, grammatical, and often tonally wrong: branded terms get translated, the author’s voice gets flattened, and idioms become Wikipedia-sounding paraphrases. This guide is the workflow I use when the output has to be publishable, not just intelligible: build a glossary first, paste source plus glossary, instruct on tone, then verify with a back-translation diff.

Who this is for

Anyone whose translation will be read by a native speaker who cares — a marketer localizing a newsletter, a researcher publishing in a second language, a founder writing investor updates for a non-English board. If the audience just needs the gist, paste-and-pray is fine. If they’ll spot a clumsy verb choice or a brand name rendered as a generic noun, you need a workflow.

When to reach for it

  • Localizing marketing copy where tone and brand terms matter.
  • Translating a long-form piece (1000-5000 words) and preserving a specific authorial voice.
  • Producing bilingual documentation where terminology has to stay consistent across files.
  • Handling domain text (legal, medical, technical) where wrong terms have real cost.

Before you start

  • Pick a model. GPT-5 for legal or technical material where reasoning matters; GPT-5.5 for general copy; GPT-5.4 only for low-stakes drafts.
  • Decide which terms must NOT be translated (brand names, product SKUs, code identifiers, person names). Write them down.
  • Identify the tone you want preserved — punchy, formal, conversational, marketing-glossy — and find a 100-word sample of that tone in the target language to anchor on.
  • Open a Project for the language pair so memory and glossary stay scoped — don’t pollute your general chat.

Step by step

  1. Build a glossary from the source. Paste 500-1000 words and ask:

    Extract every term in this text that needs a deliberate translation choice:
    brand names, product names, technical terms, idioms, recurring phrases.
    Output as a table: source term | suggested translation | reason | leave-as-is (Y/N).

  2. Review the glossary by hand. Lock brand names as leave-as-is. Pick a single rendering for each repeated technical term. This is the step that prevents inconsistency on page 6.

  3. Translate one section at a time (500-1500 words per turn), pasting the locked glossary at the top:

    Translate the section below from EN to JA.
    Glossary (use exactly): [paste table]
    Tone: conversational, like a founder writing to early users.
    Preserve paragraph breaks. Do not summarize or shorten.

  4. Run a back-translation check on suspicious paragraphs:

    Translate this JA paragraph back to EN literally, preserving structure.
    Don't smooth it out — I want to see drift.

    Compare to the original. Drift on factual nouns or numbers is a real problem. Drift on adjectives is usually fine.

  5. For idioms or culturally loaded phrases, ask for 3 alternatives with a one-line explanation each, then pick:

    Give me 3 ways to render "we shipped it warts and all" in JA marketing copy,
    each with a one-line note on the connotation it carries.

  6. Final pass: a native speaker reads it. The workflow gets you 90% of the way; the last 10% is human judgment.
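The chunking and prompt assembly from step 3 can be scripted so the locked glossary is pinned to every section automatically. What follows is a minimal sketch, not a definitive implementation: `chunk_paragraphs` and `build_prompt` are hypothetical helper names, word counts come from whitespace splitting (reasonable for English source, not for languages written without spaces), and actually sending the prompt to the model is left out.

```python
def chunk_paragraphs(text: str, max_words: int = 1500) -> list[str]:
    """Group whole paragraphs into chunks of at most max_words words.

    A single paragraph longer than max_words becomes its own oversized
    chunk; paragraphs are never split mid-way.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for para in paragraphs:
        n = len(para.split())
        # Close the current chunk if this paragraph would push it over budget.
        if current and count + n > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def build_prompt(section: str, glossary_table: str, tone: str,
                 source: str = "EN", target: str = "JA") -> str:
    """Assemble the step-3 prompt with the locked glossary at the top."""
    return (
        f"Translate the section below from {source} to {target}.\n"
        f"Glossary (use exactly):\n{glossary_table}\n"
        f"Tone: {tone}\n"
        "Preserve paragraph breaks. Do not summarize or shorten.\n\n"
        f"{section}"
    )
```

Splitting on paragraph boundaries rather than at a fixed character offset keeps each turn self-contained, which is what makes the "preserve paragraph breaks" instruction checkable afterwards.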

A prompt that produces honest output

You are translating EN to {target language} for a published piece.
Constraints:
- Use the glossary I pasted. Don't invent new renderings for terms in it.
- Preserve the author's voice: [paste 100-word voice sample].
- If a phrase is ambiguous in source, flag it inline as [AMBIGUOUS: ...]
  rather than guessing.
- Do not paraphrase to make sentences shorter. Match length where possible.
- Numbers, dates, and proper nouns must round-trip unchanged.

This catches the cases where the model would have silently smoothed something into a half-meaning.
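Once the model is flagging ambiguity inline, the flags need to be collected before the text goes anywhere. A small sketch of that, assuming the model follows the `[AMBIGUOUS: ...]` format from the prompt above exactly; `find_ambiguous_flags` is a hypothetical helper name, and a note that itself contains a `]` would be cut short.

```python
import re

# Matches the inline marker requested in the prompt: [AMBIGUOUS: note]
AMBIGUOUS_RE = re.compile(r"\[AMBIGUOUS:\s*(.*?)\]", re.DOTALL)


def find_ambiguous_flags(translated: str) -> list[str]:
    """Return the note text of every [AMBIGUOUS: ...] marker, in order."""
    return [note.strip() for note in AMBIGUOUS_RE.findall(translated)]
```

An empty list does not prove the source was unambiguous; it only means the model flagged nothing, so the back-translation check still applies.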

Quality check

  • Spot-check 3 random paragraphs by back-translating literally. Look at factual nouns and numbers, not stylistic shifts.
  • Run a search pass on glossary terms: wherever a locked source term appears in the original, its locked rendering should appear in the translation, no exceptions.
  • Read the first and last paragraphs out loud in the target language. If they sound like a translation, the voice prompt didn’t take.
  • Cross-check brand names, URLs, and product SKUs survived intact — these are the most common silent edits.
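The mechanical parts of this checklist (numbers, URLs, locked glossary terms) can be automated. The sketch below is one way to do it under stated assumptions: the function names are hypothetical, "Acme Sync" in the usage is a made-up brand, and the regexes are deliberately simple. A flag means "look at this paragraph", not "this is wrong"; a translator who writes 3,000 or 三千 for 3000 will trigger it too.

```python
import re
import unicodedata
from collections import Counter

NUMBER_RE = re.compile(r"[0-9]+(?:[.,][0-9]+)*")
URL_RE = re.compile(r"https?://[A-Za-z0-9._~:/?#@!$&'*+,;=%-]+")


def tokens(pattern: re.Pattern, text: str) -> Counter:
    """Count pattern matches after folding full-width characters to ASCII."""
    # NFKC turns full-width digits (common in JA text) into plain ASCII digits.
    normalized = unicodedata.normalize("NFKC", text)
    # Trailing sentence punctuation is not part of the number or URL.
    return Counter(m.rstrip(".,;:!?") for m in pattern.findall(normalized))


def missing_tokens(pattern: re.Pattern, source: str, translation: str) -> Counter:
    """Tokens that occur more often in the source than in the translation."""
    return tokens(pattern, source) - tokens(pattern, translation)


def missing_renderings(source: str, translation: str,
                       glossary: dict[str, str]) -> list[str]:
    """Glossary terms used in the source whose locked rendering is absent."""
    src = source.casefold()
    return [term for term, rendering in glossary.items()
            if term.casefold() in src and rendering not in translation]


if __name__ == "__main__":
    source = "We shipped Acme Sync to 3000 users in 2024."
    draft = "2024年に300人のユーザーへアクメ同期を届けました。"
    print(missing_tokens(NUMBER_RE, source, draft))
    print(missing_renderings(source, draft, {"Acme Sync": "Acme Sync"}))
```

The same `missing_tokens` call works on a literal back-translation compared against the original, which is where number drift is easiest to read.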

How to reuse this workflow

  • Save the per-language Project with its glossary; reuse it for every piece into that language pair.
  • Keep a translation-voice-samples/ folder with the 100-word anchor texts for each tone you commonly target.
  • For team use, share the glossary as a CSV so engineering, marketing, and support all use the same rendering.
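For the shared CSV, it helps to agree on the columns up front. One possible layout mirrors the glossary table from step 1; the column names and the helpers `dump_glossary` and `load_glossary` are assumptions for illustration, not a standard format.

```python
import csv
import io

# Hypothetical column names, mirroring the step-1 glossary table.
FIELDS = ["source_term", "translation", "reason", "leave_as_is"]


def dump_glossary(rows: list[dict[str, str]]) -> str:
    """Serialize glossary rows to CSV text with a header line."""
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def load_glossary(csv_text: str) -> dict[str, str]:
    """Map each source term to its single locked rendering."""
    locked: dict[str, str] = {}
    for row in csv.DictReader(io.StringIO(csv_text, newline="")):
        term = row["source_term"]
        # Leave-as-is terms (brand names, SKUs) map to themselves.
        keep = row["leave_as_is"].strip().upper() == "Y"
        rendering = term if keep else row["translation"]
        # Two different renderings for one term is the inconsistency
        # the glossary exists to prevent, so fail loudly.
        if term in locked and locked[term] != rendering:
            raise ValueError(f"conflicting renderings for {term!r}")
        locked[term] = rendering
    return locked
```

Failing on a conflicting row, rather than letting the last one win, keeps a merge between two teams' glossaries from silently reintroducing two renderings of the same term.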

In short: glossary extraction → human review → section-by-section translation with glossary pinned → back-translation diff on suspicious paragraphs → idiom alternatives → native-speaker final pass.

Common mistakes

  • Pasting the whole document at once. Long passes lose tone consistency by the end; the model drifts toward generic.
  • Skipping the glossary step. Brand names get translated, technical terms get rendered three different ways across the doc.
  • Asking for “natural” without anchoring with a voice sample. “Natural” defaults to the model’s house style in that language, which is bland.
  • Trusting back-translation to be lossless. It isn’t — but drift on factual content is a red flag worth investigating.
  • Using a fast model for legal or medical text. The stakes don’t match the tradeoff.
  • Forgetting that ChatGPT’s training in low-resource languages (e.g. some Southeast Asian or African languages) is weaker — quality drops, and back-translation drift increases. Get a human reviewer earlier.

FAQ

  • Is ChatGPT better than DeepL for translation? DeepL is often cleaner for short, conventional text. ChatGPT wins when you need glossary control, tone preservation, or domain instructions. Use DeepL for the first pass, ChatGPT for the polish.
  • Can I translate code comments and docs the same way? Yes, but pin a glossary that explicitly says “leave code identifiers, variable names, and string literals unchanged.” Code-adjacent prose is where translators go wrong most often.
  • Does the model preserve markdown / HTML? Mostly, but it occasionally drops a tag or rebalances headings. Diff against source after translation.
  • What about voice mode for translation? Useful for travel-grade conversation. Not useful for publishable output: there’s no glossary, no back-translation, no audit.
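For the markdown / HTML question, a plain text diff is noisy because every line changes in translation. Comparing only the markup is more useful. This is a rough sketch with hypothetical function names: it counts HTML tags and markdown `#` heading levels on each side and reports any mismatch. It uses simple regexes rather than a real parser, so treat a mismatch as a pointer to inspect, not as proof of damage.

```python
import re
from collections import Counter

# Captures "a" for <a href="...">, "/a" for </a>, "br" for <br/>.
TAG_RE = re.compile(r"<(/?[A-Za-z][A-Za-z0-9]*)\b[^>]*>")
# Markdown ATX headings: one to six "#" at the start of a line.
HEADING_RE = re.compile(r"^(#{1,6})\s", re.MULTILINE)


def markup_profile(text: str) -> Counter:
    """Count opening tags, closing tags, and heading levels."""
    profile = Counter(f"<{name.lower()}>" for name in TAG_RE.findall(text))
    profile.update(f"h{len(hashes)}" for hashes in HEADING_RE.findall(text))
    return profile


def markup_diff(source: str, translation: str) -> dict[str, tuple[int, int]]:
    """Return {marker: (count_in_source, count_in_translation)} for mismatches."""
    src, dst = markup_profile(source), markup_profile(translation)
    return {key: (src[key], dst[key])
            for key in sorted(set(src) | set(dst))
            if src[key] != dst[key]}
```

An empty result means the tag and heading counts match; it says nothing about whether attributes or link targets were altered, so URLs still need their own check.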

Tags: #ChatGPT #Workflow