How Accurate Is Gemini’s AI Translation? Insights from a Translation Agency

Otilia Munteanu

14/04/2025

Translation technology

With the release of Gemini, Google’s newest large language model, many are wondering how well it can handle multilingual content. It powers Google’s revamped Gemini Translate, supports over 100 languages, and promises smoother, faster translations using AI. Sounds impressive, but can it really replace professional translators? Is it good enough for business, legal, or medical use?

As a translation agency, we were curious too. So, we decided to ask Gemini directly about its strengths and weaknesses, and then we ran our own tests to see how it performs in real scenarios. Here’s what we found out.

 

Q&A with Gemini AI

Q1: Gemini, what are your strengths when it comes to translation, and where do you struggle?

A1: Strengths:

  • Broad Language Coverage: I have been trained on a massive dataset of text and code, encompassing a wide range of languages. 
  • Understanding Context: I can often understand the context of the text being translated, which helps in choosing the most appropriate word or phrase, rather than just a literal translation.
  • Fluency and Naturalness: My translations sound natural for common languages and general topics.
  • Speed and Efficiency: I can perform translations very quickly.
  • Access to Vast Knowledge: My training data includes information from various domains, allowing me to handle a wide array of topics.
  • Handling Idioms and Figurative Language (to some extent):  I can translate common idioms, though complex nuances may be tricky.

 

Where I Struggle:

  • Nuance and Subtlety: I may miss subtle meanings, implied messages, or cultural layers.
  • Highly Specialized Terminology (without specific training): In fields like advanced medicine, niche law, or finance, I may misinterpret terms without clear context or domain-specific training.
  • Ambiguity: When the source text is ambiguous, I might choose an incorrect interpretation, leading to a flawed translation.
  • Creativity and Literary Translation: Poetry, wordplay, and emotional tone are tough to replicate faithfully.
  • Low-Resource Languages: My performance might be less accurate for language pairs where there is significantly less data available for training.
  • Maintaining Tone and Style: Consistently maintaining the intended tone and style of the original text across languages can be challenging.
  • Cultural Context:  I might overlook social norms or cultural references, leading to mistranslations.

 

Q2: Are your AI translations reliable enough to be used without human revision?

A2: Generally, no, my translations are not always reliable enough to be used without human revision, especially for critical or highly sensitive content. While my translations can be quite good for general topics and straightforward language, the areas where I struggle (as outlined above) can lead to inaccuracies, misunderstandings, or awkward phrasing. For certain use cases, such as getting the general gist of a foreign language text, translating simple phrases or common expressions, and facilitating basic communication, my translations might be sufficient.

However, where accuracy, nuance, and cultural understanding are crucial, as with legal documents, medical reports, financial statements, marketing materials intended for specific cultural audiences, literary works, and technical manuals requiring precise language, review and editing by a professional translator with expertise in the subject matter and the language pair is highly recommended and often necessary. Human translators can identify and correct errors, ensure the tone and style are appropriate, and account for cultural nuances that I might miss.

 

Q3: Gemini, for which languages or contexts do you deliver your best results?

A3: Best:

  • Languages: High-resource languages (English, French, German, Spanish, etc.), some Indic languages (Telugu, Sanskrit, Hindi).
  • Content: General topics, context-rich text, long documents, customer support materials.

Worst:

  • Languages: Low-resource languages.
  • Content: Legal documents, creative/literary works, highly specialized technical fields (without specific adaptation), content heavily reliant on cultural nuances and humor.

 

Does Gemini Get It Right? Our Translation Test

To understand how Gemini performs beyond theory, we ran a structured evaluation based on real translation tasks. We compared Gemini’s output to human translations in terms of accuracy, fluency, terminology, tone, and contextual understanding. Each test involved translating from and into English across various industries.

1. General Business Communication

Language Pair: English ↔ Spanish
Test Sample: Company newsletter with updates and HR-related announcements

  • Gemini Translation: The AI provided smooth, natural phrasing with an appropriate tone for an internal audience. The sentence structure was clean, and the meaning was largely preserved.

  • Observations: The tone was slightly formal in places where a more conversational tone was expected. One or two idiomatic phrases were rendered literally.
  • Verdict: Good for drafting and internal communication, with minimal revision.

 

2. Legal Translation

Language Pair: German ↔ English
Test Sample: Employment contract clause and GDPR consent form

  • Gemini Translation: Key terms were often translated correctly, especially standard phrases. However, it missed legal subtleties such as conditional clauses and indirect speech, occasionally leading to ambiguous phrasing.
  • Observations: The word “Einwilligungserklärung” (consent declaration) was mistranslated as “agreement letter,” which changes the legal implication. Gemini Translate also used inconsistent terminology across repeated sections.
  • Verdict: Not suitable for legal use without professional post-editing or certification.
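
The terminology problems noted in this test are the kind a simple glossary check can flag before a human reviewer starts post-editing. The sketch below is only an illustration, not the method used in our evaluation: the glossary entry, the approved rendering "declaration of consent", the sample German sentences, and the function name `find_term_inconsistencies` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical glossary: source term -> approved target term.
# Illustrative only; a real project would load the client's termbase.
GLOSSARY = {
    "Einwilligungserklärung": "declaration of consent",
}


def find_term_inconsistencies(segments, glossary):
    """Return {source_term: [indices of segments missing the approved term]}.

    `segments` is a list of (source_text, translated_text) pairs.
    Matching is a case-insensitive substring check, so inflected or
    hyphenated forms are not handled.
    """
    issues = defaultdict(list)
    for index, (source, target) in enumerate(segments):
        for source_term, approved in glossary.items():
            term_in_source = source_term.lower() in source.lower()
            approved_in_target = approved.lower() in target.lower()
            if term_in_source and not approved_in_target:
                issues[source_term].append(index)
    return dict(issues)


# Made-up sample segments: the second one uses an unapproved rendering.
segments = [
    ("Die Einwilligungserklärung ist freiwillig.",
     "The declaration of consent is voluntary."),
    ("Die Einwilligungserklärung kann widerrufen werden.",
     "The agreement letter can be withdrawn."),
]

print(find_term_inconsistencies(segments, GLOSSARY))
```

A check like this only finds segments where an approved term is missing. It cannot judge whether a clause is legally equivalent, so it supports professional post-editing rather than replacing it.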

 

3. Medical Translation

Language Pair: English ↔ French
Test Sample: Patient aftercare instructions and prescription label

  • Gemini Translation: Terms such as “antibiotic course” and “dosage instructions” were translated with general correctness. However, it rendered "Take on an empty stomach" as "Prendre à jeun," which is not commonly used in patient-friendly French.

  • Observations: The AI translation lacked clarity for non-specialist readers and risked confusion. It also did not adjust between formal and informal tone as required in health communication.
  • Verdict: ⚠️ Passable for internal understanding but not for direct patient use.

 

4. Marketing Text

Language Pair: English ↔ Italian
Test Sample: Product description for a skincare brand, including slogan and benefits

  • Gemini Translation: The general information was preserved, but the emotional tone and persuasive elements were weakened. For example, "Glow like never before" was translated literally, losing its metaphorical flair.

  • Observations: The AI failed to localize idiomatic expressions and did not adapt the message to cultural preferences. The tagline felt flat and uninspiring.

  • Verdict: Requires creative localization. Not suitable without human rewriting.

 

5. Technical Manual

Language Pair: English ↔ Japanese
Test Sample: Software setup guide with instructions and error messages

  • Gemini Translation: The instruction flow remained mostly intact. It translated UI elements and error codes literally but sometimes missed context-specific terminology (e.g., "instance" was mistranslated in a cloud computing context).

  • Observations: Lack of context-awareness led to errors that could confuse end-users. Consistency in terminology was an issue across multiple sections.
  • Verdict: ⚠️ Can assist technical writers, but not ready for publication without post-editing.

 

Performance Summary

Domain           | Accuracy | Terminology | Tone & Style | Cultural Adaptation | Final Use Suitability
-----------------|----------|-------------|--------------|---------------------|----------------------
General Business | High     | Moderate    | Moderate     | Moderate            | With light review
Legal            | Moderate | Low         | Low          | Low                 | Not recommended
Medical          | Moderate | Moderate    | Low          | Low                 | Review required
Marketing        | Moderate | Moderate    | Low          | Very Low            | Not suitable
Technical        | Moderate | Moderate    | Moderate     | Low                 | Review required
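
For teams that want to act on the summary above, one option is to encode the ratings and look up the weakest criteria per domain, for example to brief a reviewer on where to focus. This is a minimal sketch: the ratings are copied from the table, while the numeric ordering in `RATING_ORDER` and the helper name `weakest_criteria` are assumptions made for illustration.

```python
# Ordinal ranking of the rating labels (an assumption; the table gives labels only).
RATING_ORDER = {"Very Low": 0, "Low": 1, "Moderate": 2, "High": 3}

# Ratings copied from the Performance Summary table.
SUMMARY = {
    "General Business": {"Accuracy": "High", "Terminology": "Moderate",
                         "Tone & Style": "Moderate", "Cultural Adaptation": "Moderate"},
    "Legal": {"Accuracy": "Moderate", "Terminology": "Low",
              "Tone & Style": "Low", "Cultural Adaptation": "Low"},
    "Medical": {"Accuracy": "Moderate", "Terminology": "Moderate",
                "Tone & Style": "Low", "Cultural Adaptation": "Low"},
    "Marketing": {"Accuracy": "Moderate", "Terminology": "Moderate",
                  "Tone & Style": "Low", "Cultural Adaptation": "Very Low"},
    "Technical": {"Accuracy": "Moderate", "Terminology": "Moderate",
                  "Tone & Style": "Moderate", "Cultural Adaptation": "Low"},
}


def weakest_criteria(domain):
    """Return the lowest-rated criteria for a domain, sorted alphabetically."""
    ratings = SUMMARY[domain]
    lowest = min(ratings.values(), key=RATING_ORDER.__getitem__)
    return sorted(name for name, rating in ratings.items() if rating == lowest)


for domain in SUMMARY:
    print(domain, "->", weakest_criteria(domain))
```

The lookup does not replace the "Final Use Suitability" column. It only points a reviewer at the criteria that scored lowest in our tests.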

 

If you're considering using Gemini AI for professional or sensitive materials, remember: it works best as part of a human-reviewed workflow, not as a standalone solution.

Looking for translations that go beyond good guesses? Partner with us to combine the power of technology with the precision of human expertise.

PoliLingua

Our translations are performed by translators carefully selected to align with the subject matter and content of your project. Our translations meet and exceed international quality standards. Upon request, we will provide you with a certificate attesting to the precision of our translations.