
Tomedes Launches AI Translation QA Tool for Public Testing of GPT-4o and DeepSeek

by OFER TIROSH 25/04/2025

If you've ever worked on a translation project—whether as a linguist, project manager, or client—you understand the critical importance of translation quality assurance.

Tomedes has spent years refining processes, developing tools, and supporting linguists with best-in-class resources. But perfection in this field is a moving target. There’s always room to innovate—especially when technology enters the picture.

This is why Tomedes is excited to introduce the new AI-powered Translation QA Tool, developed with a transparent, open approach. The tool is being built in public, meaning real users are part of the testing, feedback, and improvement loop.

The purpose? To co-create a better QA solution alongside professionals like you—because the best tools are forged through real-world use, honest feedback, and ongoing iteration.

Why we’re building this tool

Let’s address some challenges most translation professionals face.

Large-scale multilingual projects often raise questions like:

  • “How can inconsistencies be identified across dozens of files?”

  • “What if a term was mistranslated and repeated throughout the text?”

  • “How can completeness be verified when the reviewer isn’t fluent in the target language?”

These questions point directly to why translation quality assurance is so vital. Traditionally, QA has relied on manual human review—an effort-intensive process prone to fatigue and errors.

Even the most experienced linguists can overlook untranslated phrases or inconsistent terminology, especially under tight deadlines.

The Tomedes team developed the Translation QA Tool with a clear mission:

  • Automate repetitive tasks such as checking for terminology consistency and identifying missing content.

  • Highlight high-priority issues quickly.

  • Empower linguists and reviewers to focus on accuracy, tone, and contextual alignment.
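To make the first of these goals concrete, here is a minimal, rule-based sketch of two automated checks: flagging missing content and flagging glossary terms that were not translated consistently. It is an illustration only, not the Tomedes implementation. The `Segment` class, the function names, and the sample glossary are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical container for one source/target sentence pair.
    source: str
    target: str


def find_missing(segments):
    """Return indices of segments whose target is empty or identical to the source."""
    issues = []
    for i, seg in enumerate(segments):
        target = seg.target.strip()
        if not target or target == seg.source.strip():
            issues.append(i)
    return issues


def find_term_inconsistencies(segments, glossary):
    """Return (index, source_term, expected_term) tuples where a glossary term
    appears in the source but its approved translation is absent from the target."""
    issues = []
    for i, seg in enumerate(segments):
        for src_term, tgt_term in glossary.items():
            in_source = src_term.lower() in seg.source.lower()
            in_target = tgt_term.lower() in seg.target.lower()
            if in_source and not in_target:
                issues.append((i, src_term, tgt_term))
    return issues


# Hypothetical sample data (English to French).
segments = [
    Segment("Open the invoice.", "Ouvrez la facture."),
    Segment("Save the invoice.", "Enregistrez la note."),
    Segment("Close the window.", ""),
]
glossary = {"invoice": "facture"}

missing = find_missing(segments)
inconsistent = find_term_inconsistencies(segments, glossary)
print("Missing translations at:", missing)
print("Terminology issues:", inconsistent)
```

Simple substring matching like this misses inflected forms and context, which is part of why an LLM-based reviewer is attractive for the same task.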

The ultimate goal is to improve quality and efficiency across the board—enabling faster project delivery and more reliable outcomes for clients, without compromising linguistic integrity.

Current testing phase: GPT-4o and DeepSeek

At this stage, the Translation QA Tool operates in two parallel versions, each powered by a different large language model (LLM).

The primary version utilizes GPT-4o, selected for its language fluency, advanced reasoning, and ability to analyze source-target relationships effectively. This version reviews files for missing translations, untranslated terms, and inconsistencies with high precision, making it ideal for rapid quality checks.


The second version, powered by DeepSeek, is under active evaluation. DeepSeek has shown promise in multilingual performance, especially when handling less common language pairs and complex syntactic structures. This provides a valuable comparison point for real-world use cases.

The objective of this testing phase is to compare both models across key areas such as:

  • Translation completeness

  • Terminology usage

  • Contextual interpretation

  • Grammatical and linguistic accuracy

By testing both models side-by-side using real translation projects, the Tomedes team can assess performance, strengths, and limitations—gathering actionable insights for future development.
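One common way to run such a side-by-side comparison is to score each model's flagged issues against a set of errors confirmed by a human reviewer. The sketch below assumes that setup. The model labels, segment IDs, and the choice of precision and recall as metrics are illustrative assumptions, not details of the Tomedes evaluation.

```python
def precision_recall(flagged, reference):
    """Compare flagged issue IDs against a human-verified reference set.

    precision: share of flagged issues that are real errors
    recall:    share of real errors that were flagged
    """
    flagged, reference = set(flagged), set(reference)
    true_positives = len(flagged & reference)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical data: segment IDs a reviewer confirmed as errors,
# and the IDs each model version flagged on the same file.
reference = {3, 7, 12, 18}
flagged_by_model = {
    "model_a": {3, 7, 12, 20},
    "model_b": {3, 18},
}

scores = {
    name: precision_recall(ids, reference)
    for name, ids in flagged_by_model.items()
}
for name, (p, r) in scores.items():
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

In this made-up example, one model catches more real errors while the other raises fewer false alarms, which is the kind of trade-off a side-by-side test is meant to surface.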

A collaborative, data-driven approach to internal and external testing

The tool’s development process includes extensive internal and external testing to ensure reliability, accuracy, and real-world relevance.

Internally, Tomedes linguists and reviewers are actively using the QA Tool on live projects across 12 language pairs. Early data from this internal testing shows that the tool has improved error detection rates by 22% compared to traditional manual QA methods.

The ongoing feedback loop between linguists and developers enables continuous, precise refinements, enhancing both model behavior and interface design.

Externally, over 50 selected professional translators, editors, and project managers have been granted early access to the tool.

Participants are encouraged to test with their own files, evaluate AI-generated quality reports, and provide structured feedback on performance, usability, and potential areas for improvement. 

In initial external testing rounds, 87% of users reported improved efficiency in identifying critical translation issues.

Participation is entirely voluntary and non-invasive. No user data is used for training purposes without explicit consent, in line with GDPR and major data privacy standards.

This collaborative, data-driven testing approach ensures that the QA Tool is not just another AI feature. 

It is a solution built by and for the professionals who depend on translation quality assurance every day.


What’s next: Expanding AI capabilities

The current dual-model setup is just the beginning.

Tomedes plans to expand testing to include additional LLMs in upcoming phases.

This will involve evaluating models such as Claude 3, Gemini 1.5, LLaMA 3, and others recognized for their strong multilingual performance and domain-specific expertise.

Different models bring distinct strengths:

  • Some excel at technical accuracy, with models like DeepSeek achieving 98% terminology consistency in controlled tests.

  • Others, such as GPT-4o, excel in tone, fluency, and conversational style, ranking first in chatbot evaluations on MT-Bench, a multi-turn conversation benchmark.

  • Models focused on low-resource languages show up to 20% improvement over traditional systems in recent comparative studies.

The long-term vision is to build a modular QA tool, enabling users to select or combine AI engines based on their project’s needs. For example:

  • Use GPT-4o for enhanced conversational fluency.

  • Deploy DeepSeek for precise terminology control in specialized industries.

  • Integrate a legal-specialized model for contract review and regulatory translations, where domain accuracy is critical.

This modular approach ensures that the QA tool remains customizable, adaptable, and increasingly intelligent—offering users a solution precisely tailored to their evolving translation environments.
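As a rough illustration of what "select or combine AI engines" could look like in code, the sketch below uses a small registry of interchangeable checkers. Everything here is hypothetical: the registry, the engine names, and the stub rules stand in for real model calls and do not reflect the tool's actual architecture.

```python
from typing import Callable, Dict, List

# A checker takes (source, target) and returns a list of issue descriptions.
Checker = Callable[[str, str], List[str]]

ENGINES: Dict[str, Checker] = {}


def register(name: str):
    """Decorator that adds a checker to the engine registry under `name`."""
    def wrap(fn: Checker) -> Checker:
        ENGINES[name] = fn
        return fn
    return wrap


@register("fluency")
def fluency_stub(source: str, target: str) -> List[str]:
    # Placeholder rule; a real engine would call an LLM here.
    return ["double space"] if "  " in target else []


@register("terminology")
def terminology_stub(source: str, target: str) -> List[str]:
    # Placeholder rule with one hard-coded English-to-French term pair.
    if "contract" in source.lower() and "contrat" not in target.lower():
        return ["expected term 'contrat'"]
    return []


def run_qa(source: str, target: str, engines: List[str]) -> Dict[str, List[str]]:
    """Run only the selected engines and collect findings by engine name."""
    return {name: ENGINES[name](source, target) for name in engines}


# Combine two engines for one project; another project could pick just one.
report = run_qa("Sign the contract.", "Signez  l'accord.", ["fluency", "terminology"])
print(report)
```

Keeping every engine behind the same function signature is what makes swapping or combining them cheap: adding a legal-specialized checker would mean registering one more function, not changing `run_qa`.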

Transparent updates and community-driven feedback will remain central to this evolution. 

As new models and features roll out, stakeholders will receive detailed performance updates and be invited to participate in testing phases, ensuring continuous improvement driven by real-world use cases.

Join the process – Share your feedback

The Translation QA Tool is live and evolving—with user input at the heart of its development.

Feedback is essential. 

Whether you're a freelance translator, agency reviewer, localization manager, or QA lead, your perspective can directly influence how the tool grows and improves. There are several ways to get involved:

  • Test your own translation files using the tool’s beta version.

  • Compare AI-generated reports with your own QA assessments.

  • Submit feedback through the provided form or reach out to the Tomedes team directly.

There’s no long-term commitment or technical expertise required—just a willingness to share observations, pain points, and feature suggestions.

This approach ensures that the final product truly reflects the needs of the global language services industry, not just internal assumptions.

By building in public and collaborating with real users, the Tomedes team aims to deliver a translation QA solution that adds immediate, tangible value to everyday workflows.


Conclusion

Tomedes has long been committed to merging human expertise with the best of emerging technology. From automated file handling to AI-powered localization workflows, innovation has always played a key role in how services are delivered.

The launch of the Translation QA Tool marks a significant step in this journey.

Built transparently, tested rigorously, and refined collaboratively, the tool is a powerful step toward smarter, more scalable AI translation quality assurance.

Curious how AI can improve your translation workflow? Try out the Translation QA Tool and see the difference firsthand, powered by real feedback and cutting-edge LLMs. Get in touch with Tomedes today to explore tailored solutions that combine expert linguists with smart AI.