Mitigating Hallucinated Translations in Large Language Models with Hallucination-focused Preference Optimization
Authors: Zilu Tang, Rajen Chatterjee, Sarthak Garg
Machine Translation (MT) is undergoing a paradigm shift, with systems based on fine-tuned large language models (LLMs) becoming increasingly competitive with traditional encoder-decoder models trained specifically for translation tasks. However, LLM-based systems are at a higher risk of generating hallucinations, which can severely undermine user trust and safety. Most prior research on hallucination mitigation focuses on traditional MT models, with solutions that involve post-hoc mitigation: detecting hallucinated translations and re-translating them. While effective, this approach adds complexity by requiring extra tools in production, and it increases latency. To address these limitations, we propose a method that intrinsically learns to mitigate hallucinations during the model training phase. Specifically, we introduce a data creation framework to generate hallucination-focused preference datasets. Fine-tuning LLMs on these preference datasets reduces the hallucination rate by an average of 96% across five language pairs, while preserving overall translation quality. In a zero-shot setting, our approach reduces hallucinations by 89% on average across three unseen target languages.
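As a rough illustration of what a hallucination-focused preference example could look like, the sketch below pairs a non-hallucinated candidate translation (preferred) with a hallucinated one (dispreferred) for the same source sentence. This is not the authors' framework: the `PreferencePair` type, the `build_preference_pair` helper, and the `toy_detector` heuristic are all hypothetical names introduced here, and a real pipeline would presumably rely on a far stronger hallucination detector.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class PreferencePair:
    """One training example for preference optimization (hypothetical schema)."""
    source: str    # sentence to translate
    chosen: str    # candidate judged free of hallucination
    rejected: str  # candidate judged to be a hallucination


def build_preference_pair(
    source: str,
    candidates: List[str],
    is_hallucination: Callable[[str, str], bool],
) -> Optional[PreferencePair]:
    """Return a (chosen, rejected) pair, or None if no contrast exists.

    Assumption: a pair is only useful when the candidates contain at least
    one hallucinated and one non-hallucinated translation.
    """
    clean: List[str] = []
    hallucinated: List[str] = []
    for candidate in candidates:
        if is_hallucination(source, candidate):
            hallucinated.append(candidate)
        else:
            clean.append(candidate)
    if not clean or not hallucinated:
        return None
    return PreferencePair(source=source, chosen=clean[0], rejected=hallucinated[0])


def toy_detector(source: str, translation: str) -> bool:
    """Toy stand-in for a detector: flags empty or highly repetitive output."""
    tokens = translation.split()
    return not tokens or len(set(tokens)) <= len(tokens) // 2


pair = build_preference_pair(
    "Guten Morgen",
    ["Good morning", "the the the the"],
    toy_detector,
)
```

Source sentences whose candidates are all clean, or all hallucinated, yield `None` here and would simply be skipped when assembling the dataset.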
Learning to Reason for Hallucination Span Detection
March 3, 2026 | research areas: Methods and Algorithms, Speech and Natural Language Processing | conference: ICLR
Large language models (LLMs) often generate hallucinations, that is, unsupported content that undermines reliability. While most prior work frames hallucination detection as a binary task, many real-world applications require identifying hallucinated spans, which is a multi-step decision-making process. This naturally raises the question of whether explicit reasoning can help with the complex task of detecting hallucination spans. To answer this question, we…
Evaluating Evaluation Metrics — The Mirage of Hallucination Detection
October 27, 2025 | research areas: Data Science and Annotation, Speech and Natural Language Processing | conference: EMNLP
Hallucinations pose a significant obstacle to the reliability and widespread adoption of language models, yet their accurate measurement remains a persistent challenge. While many task- and domain-specific metrics have been proposed to assess faithfulness and factuality concerns, the robustness and generalization of these metrics are still untested. In this paper, we conduct a large-scale empirical evaluation of 6 diverse sets of hallucination…