21.09 |
University of Oxford |
ACL2022 |
TruthfulQA: Measuring How Models Mimic Human Falsehoods |
Benchmark&Truthfulness |
23.05 |
KAIST |
NAACL2024(findings) |
Why So Gullible? Enhancing the Robustness of Retrieval-Augmented Models against Counterfactual Noise |
Retrieval-Augmented Models&Counterfactual Noise&Open-Domain Question Answering |
23.07 |
Microsoft Research Asia, Hong Kong University of Science and Technology, University of Science and Technology of China, Tsinghua University, Sony AI |
ResearchSquare |
Defending ChatGPT against Jailbreak Attack via Self-Reminder |
Jailbreak Attack&Self-Reminder&AI Security |
23.10 |
University of Zurich |
arxiv |
Lost in Translation -- Multilingual Misinformation and its Evolution |
Misinformation&Multilingual |
23.10 |
New York University&Javier Rando |
arxiv |
Personas as a Way to Model Truthfulness in Language Models |
Truthfulness&Truthful Persona |
23.10 |
Tsinghua University, Allen Institute for AI, University of Illinois Urbana-Champaign |
NAACL2024 |
Language Models Hallucinate, but May Excel at Fact Verification |
Large Language Models&Hallucination&Fact Verification |
23.10 |
Stanford University&University of Maryland&Carnegie Mellon University&NYU Shanghai&New York University&Microsoft Research |
NAACL2024 |
Large Language Models Help Humans Verify Truthfulness—Except When They Are Convincingly Wrong |
Large Language Models&Fact-Checking&Truthfulness |
23.10 |
Shandong University |
NAACL2024 |
Knowing What LLMs DO NOT Know: A Simple Yet Effective Self-Detection Method |
Large Language Models&Self-Detection&Non-Factuality Detection |
23.10 |
Fudan University |
CIKM 2023 |
Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models |
Hallucination Detection&Reliable Answers |
23.11 |
Dialpad Canada Inc |
arxiv |
Are Large Language Models Reliable Judges? A Study on the Factuality Evaluation Capabilities of LLMs |
Factuality Assessment |
23.11 |
The University of Manchester |
arxiv |
Emotion Detection for Misinformation: A Review |
Survey&Misinformation&Emotions |
23.11 |
University of Virginia |
arxiv |
Can Language Models Be Tricked by Language Illusions? Easier with Syntax, Harder with Semantics |
Language Illusions |
23.11 |
University of Illinois Urbana-Champaign |
arxiv |
Learn to Refuse: Making Large Language Models More Controllable and Reliable through Knowledge Scope Limitation and Refusal Mechanism |
Hallucinations&Refusal Mechanism |
23.11 |
University of Washington Bothell |
arxiv |
Creating Trustworthy LLMs: Dealing with Hallucinations in Healthcare AI |
Healthcare&Trustworthiness&Hallucinations |
23.11 |
Intuit AI Research |
EMNLP2023 |
SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency |
Hallucination Detection&Trustworthiness |
23.11 |
Shanghai Jiao Tong University |
arxiv |
Support or Refute: Analyzing the Stance of Evidence to Detect Out-of-Context Mis- and Disinformation |
Misinformation&Disinformation&Out-of-Context |
23.11 |
Hamad Bin Khalifa University |
arxiv |
ArAIEval Shared Task: Persuasion Techniques and Disinformation Detection in Arabic Text |
Disinformation&Arabic Text |
23.11 |
UNC-Chapel Hill |
arxiv |
Holistic Analysis of Hallucination in GPT-4V(ision): Bias and Interference Challenges |
Hallucination&Benchmark&Multimodal |
23.11 |
Cornell University |
arxiv |
Adapting Fake News Detection to the Era of Large Language Models |
Fake news detection&Generated News&Misinformation |
23.11 |
Harbin Institute of Technology |
arxiv |
A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions |
Hallucination&Factual Consistency&Trustworthiness |
23.11 |
Korea University, KAIST AI, LG AI Research |
arXiv |
VOLCANO: Mitigating Multimodal Hallucination through Self-Feedback Guided Revision |
Multimodal Models&Hallucination&Self-Feedback |
23.11 |
Beijing Jiaotong University, Alibaba Group, Peng Cheng Lab |
arXiv |
AMBER: An LLM-free Multi-dimensional Benchmark for MLLMs Hallucination Evaluation |
Multi-modal Large Language Models&Hallucination&Benchmark |
23.11 |
LMU Munich; Munich Center of Machine Learning; Google Research |
arXiv |
Hallucination Augmented Recitations for Language Models |
Hallucination&Counterfactual Datasets |
23.11 |
Stanford University, UNC Chapel Hill |
arxiv |
Fine-tuning Language Models for Factuality |
Factuality&Reference-Free Truthfulness&Direct Preference Optimization |
23.11 |
Corporate Data and Analytics Office (CDAO) |
arxiv |
Hallucination-minimized Data-to-answer Framework for Financial Decision-makers |
Financial Decision Making&Hallucination Minimization |
23.11 |
Arizona State University |
arxiv |
Can Knowledge Graphs Reduce Hallucinations in LLMs? : A Survey |
Knowledge Graphs&Hallucinations&Survey |
23.11 |
Kempelen Institute of Intelligent Technologies; Brno University of Technology |
arxiv |
Disinformation Capabilities of Large Language Models |
Disinformation Generation&Safety Filters&Automated Evaluation |
23.11 |
UNC-Chapel Hill, University of Washington |
arxiv |
EVER: Mitigating Hallucination in Large Language Models through Real-Time Verification and Rectification |
Hallucination&Real-Time Verification&Rectification |
23.11 |
Peking University, WeChat AI, Tencent Inc. |
arXiv |
RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge |
External Counterfactual Knowledge&Benchmarking&Robustness |
23.11 |
PolyAI Limited |
arXiv |
Dial BEINFO for Faithfulness: Improving Factuality of Information-Seeking Dialogue via Behavioural Fine-Tuning |
Factuality&Behavioural Fine-Tuning&Hallucination |
23.11 |
The Hong Kong University of Science and Technology, University of Illinois Urbana-Champaign |
arxiv |
R-Tuning: Teaching Large Language Models to Refuse Unknown Questions |
Hallucination&Refusal-Aware Instruction Tuning&Knowledge Gap |
23.11 |
University of Southern California, University of Pennsylvania, University of California Davis |
arxiv |
Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination? |
Hallucinations&Semantic Associations&Benchmark |
23.11 |
The Ohio State University, University of California Davis |
arxiv |
How Trustworthy are Open-Source LLMs? An Assessment under Malicious Demonstrations Shows their Vulnerabilities |
Trustworthiness&Malicious Demonstrations&Adversarial Attacks |
23.11 |
University of Sheffield |
arXiv |
Lighter yet More Faithful: Investigating Hallucinations in Pruned Large Language Models for Abstractive Summarization |
Hallucinations&Language Model Reliability |
23.11 |
Institute of Information Engineering Chinese Academy of Sciences, University of Chinese Academy of Sciences |
arxiv |
Can Large Language Models Understand Content and Propagation for Misinformation Detection: An Empirical Study |
Misinformation Detection |
23.11 |
Shanghai Jiao Tong University, Amazon AWS AI, Westlake University, IGSNRR Chinese Academy of Sciences, China |
arXiv |
Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus |
Hallucination Detection&Uncertainty-Based Methods&Factuality Checking |
23.11 |
Institute of Software Chinese Academy of Sciences, University of Chinese Academy of Sciences |
arXiv |
Mitigating Large Language Model Hallucinations via Autonomous Knowledge Graph-based Retrofitting |
Hallucinations&Knowledge Graphs&Retrofitting |
23.11 |
Applied Research Quantiphi |
arxiv |
Minimizing Factual Inconsistency and Hallucination in Large Language Models |
Factual Inconsistency&Hallucination |
23.11 |
Microsoft Research, Georgia Tech |
arxiv |
Calibrated Language Models Must Hallucinate |
Hallucination&Calibration&Statistical Analysis |
23.11 |
School of Information Renmin University of China |
arxiv |
UHGEval: Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation |
Hallucination&Evaluation Benchmark |
23.11 |
DAMO Academy Alibaba Group, Nanyang Technological University, Hupan Lab |
arxiv |
Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding |
Vision-Language Models&Object Hallucinations |
23.11 |
Shanghai AI Laboratory |
arxiv |
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization |
Multimodal Language Models&Hallucination Problem&Direct Preference Optimization |
23.11 |
Arizona State University |
NAACL2024 |
Can Knowledge Graphs Reduce Hallucinations in LLMs? : A Survey |
Knowledge Graphs&Large Language Models&Hallucination Reduction |
23.11 |
Mohamed bin Zayed University of Artificial Intelligence |
NAACL2024 |
A Survey of Confidence Estimation and Calibration in Large Language Models |
Confidence Estimation&Calibration&Large Language Models |
23.11 |
University of California, Davis |
NAACL2024 |
Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination? |
Semantic Shortcuts&Reasoning Chains&Hallucination |
23.11 |
University of Utah |
NAACL2024 |
To Tell The Truth: Language of Deception and Language Models |
Deception Detection&Language Models&Conversational Analysis |
23.11 |
Cornell University |
NAACL2024(findings) |
Adapting Fake News Detection to the Era of Large Language Models |
Fake News Detection&Large Language Models&Machine-Generated Content |
23.12 |
Singapore Management University, Beijing Forestry University, University of Electronic Science and Technology of China |
MMM 2024 |
Mitigating Fine-Grained Hallucination by Fine-Tuning Large Vision-Language Models with Caption Rewrites |
Vision-language Models&Hallucination&Fine-grained Evaluation |
23.12 |
Mila, McGill University |
EMNLP2023(findings) |
Evaluating Dependencies in Fact Editing for Language Models: Specificity and Implication Awareness |
Knowledge Bases&Dataset&Evaluation Protocol |
23.12 |
MIT CSAIL |
arxiv |
Cognitive Dissonance: Why Do Language Model Outputs Disagree with Internal Representations of Truthfulness? |
Truthfulness&Internal Representations |
23.12 |
University of Illinois Chicago, Bosch Research North America & Bosch Center for Artificial Intelligence (BCAI), UNC Chapel-Hill |
arxiv |
DELUCIONQA: Detecting Hallucinations in Domain-specific Question Answering |
Hallucination Detection&Domain-specific QA&Retrieval-augmented LLMs |
23.12 |
The University of Hong Kong, Beihang University |
AAAI2024 |
Improving Factual Error Correction by Learning to Inject Factual Errors |
Factual Error Correction |
23.12 |
Allen Institute for AI |
arxiv |
BARDA: A Belief and Reasoning Dataset that Separates Factual Accuracy and Reasoning Ability |
Dataset&Factual Accuracy&Reasoning Ability |
23.12 |
Tsinghua University, Shanghai Jiao Tong University, Stanford University, Nanyang Technological University |
arxiv |
The Earth is Flat because...: Investigating LLMs’ Belief towards Misinformation via Persuasive Conversation |
Misinformation&Persuasive Conversation&Factual Questions |
23.12 |
University of California Davis |
arXiv |
A Revisit of Fake News Dataset with Augmented Fact-checking by ChatGPT |
Fake News&Fact-checking |
23.12 |
Amazon Web Services |
arxiv |
On Early Detection of Hallucinations in Factual Question Answering |
Hallucinations&Factual Question Answering |
23.12 |
University of California Santa Cruz |
arxiv |
Don’t Believe Everything You Read: Enhancing Summarization Interpretability through Automatic Identification of Hallucinations in Large Language Models |
Hallucinations&Faithfulness&Token-level |
23.12 |
Department of Radiology, The University of Tokyo Hospital |
arxiv |
Theory of Hallucinations based on Equivariance |
Hallucinations&Equivariance |
23.12 |
Georgia Institute of Technology |
arXiv |
Reducing LLM Hallucinations Using Epistemic Neural Networks |
Hallucinations&Uncertainty Estimation&TruthfulQA |
23.12 |
Institute of Artificial Intelligence, School of Computer Science and Technology, Soochow University, Tencent AI Lab |
arXiv |
Alleviating Hallucinations of Large Language Models through Induced Hallucinations |
Hallucinations&Induce-then-Contrast Decoding&Factuality |
23.12 |
SKLOIS Institute of Information Engineering Chinese Academy of Sciences, School of Cyber Security University of Chinese Academy of Sciences |
arXiv |
LLM Factoscope: Uncovering LLMs’ Factual Discernment through Inner States Analysis |
Factual Detection&Inner States |
24.01 |
The Chinese University of Hong Kong, Tencent AI Lab |
arxiv |
The Earth is Flat? Unveiling Factual Errors in Large Language Models |
Factual Errors&Knowledge Graph&Answer Assessment |
24.01 |
NewsBreak, University of Illinois Urbana-Champaign |
arxiv |
RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models |
Retrieval-Augmented Generation&Hallucination Detection&Dataset |
24.01 |
University of California Berkeley, Université de Montréal, McGill University, Mila |
arxiv |
Uncertainty Resolution in Misinformation Detection |
Misinformation&Uncertainty Resolution |
24.01 |
Yale University, Stanford University |
arxiv |
Large Legal Fictions: Profiling Legal Hallucinations in Large Language Models |
Legal Hallucinations |
24.01 |
Islamic University of Technology, AI Institute, University of South Carolina, Stanford University, Amazon AI |
arxiv |
A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models |
Hallucination Mitigation |
24.01 |
Renmin University of China, DIRO, Université de Montréal |
arxiv |
The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models |
Hallucination&Detection and Mitigation&Empirical Study |
24.01 |
IIT Hyderabad India, Parmonic USA, University of Glasgow UK, LDRP Institute of Technology and Research India |
arxiv |
Fighting Fire with Fire: Adversarial Prompting to Generate a Misinformation Detection Dataset |
Misinformation Detection&LLM-generated Synthetic Data |
24.01 |
University College London |
arxiv |
Hallucination Benchmark in Medical Visual Question Answering |
Medical Visual Question Answering&Hallucination Benchmark |
24.01 |
Soochow University |
arxiv |
LightHouse: A Survey of AGI Hallucination |
AGI Hallucination |
24.01 |
University of Washington, Carnegie Mellon University, Allen Institute for AI |
arxiv |
Fine-grained Hallucination Detection and Editing for Language Models |
Hallucination Detection&FAVA |
24.01 |
Dartmouth College, Université de Montréal, McGill University,Mila |
arxiv |
Comparing GPT-4 and Open-Source Language Models in Misinformation Mitigation |
GPT-4&Misinformation Detection |
24.01 |
Utrecht University |
arxiv |
The Pitfalls of Defining Hallucination |
Hallucination |
24.01 |
Samsung AI Center |
arxiv |
Hallucination Detection and Hallucination Mitigation: An Investigation |
Hallucination Detection&Hallucination Mitigation |
24.01 |
McGill University, Mila, Université de Montréal |
arxiv |
Combining Confidence Elicitation and Sample-based Methods for Uncertainty Quantification in Misinformation Mitigation |
Misinformation Mitigation&Uncertainty Quantification&Sample-based Consistency |
24.01 |
LY Corporation |
arxiv |
On the Audio Hallucinations in Large Audio-Video Language Models |
Audio Hallucinations&Audio-visual Learning&Audio-video language Models |
24.01 |
Sun Yat-sen University, Tencent AI Lab |
arXiv |
Mitigating Hallucinations of Large Language Models via Knowledge Consistent Alignment |
Hallucination Mitigation&Knowledge Consistent Alignment |
24.01 |
National University of Singapore |
arxiv |
Hallucination is Inevitable: An Innate Limitation of Large Language Models |
Hallucination&Real World LLMs |
24.01 |
X2Robot&International Digital Economy Academy |
arXiv |
Learning to Trust Your Feelings: Leveraging Self-awareness in LLMs for Hallucination Mitigation |
Hallucination Mitigation&Knowledge Probing&Reinforcement Learning |
24.01 |
University of Texas at Austin, Northeastern University |
arxiv |
Diverse but Divisive: LLMs Can Exaggerate Gender Differences in Opinion Related to Harms of Misinformation |
Misinformation Detection&Socio-Technical Systems |
24.01 |
National University of Defense Technology, National University of Singapore |
arxiv |
SWEA: Changing Factual Knowledge in Large Language Models via Subject Word Embedding Altering |
Factual Knowledge Editing&Word Embeddings |
24.02 |
University of Washington, University of California Berkeley, The Hong Kong University of Science and Technology, Carnegie Mellon University |
arxiv |
Don’t Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration |
Knowledge Gaps&Multi-LLM Collaboration |
24.02 |
IT Innovation and Research Center, Huawei Technologies |
arxiv |
A Survey on Hallucination in Large Vision-Language Models |
Large Vision-Language Models&Hallucination&Mitigation Strategies |
24.02 |
Tianjin University, National University of Singapore, A*STAR |
arxiv |
Skip \n: A Simple Method to Reduce Hallucination in Large Vision-Language Models |
Semantic Shift Bias&Hallucination Mitigation&Vision-Language Models |
24.02 |
University of Marburg, University of Mannheim |
EACL Findings 2024 |
The Queen of England is not England’s Queen: On the Lack of Factual Coherency in PLMs |
Factual Coherency&Knowledge Bases |
24.02 |
MBZUAI, Monash University, LibrAI, Sofia University |
arxiv |
Factuality of Large Language Models in the Year 2024 |
Factuality&Evaluation&Multimodal LLMs |
24.02 |
Institute of Information Engineering, Chinese Academy of Sciences, University of Chinese Academy of Sciences |
arxiv |
Are Large Language Models Table-based Fact-Checkers? |
Table-based Fact Verification&In-context Learning |
24.02 |
Zhejiang University, Ant Group |
arxiv |
Unified Hallucination Detection for Multimodal Large Language Models |
Multimodal Large Language Models&Hallucination Detection&Benchmark |
24.02 |
Alibaba Cloud, Zhejiang University |
ICLR2024 |
INSIDE: LLMs’ Internal States Retain the Power of Hallucination Detection |
Hallucination Detection&EigenScore |
24.02 |
The Hong Kong University of Science and Technology, University of Illinois at Urbana-Champaign, The Hong Kong Polytechnic University |
arxiv |
The Instinctive Bias: Spurious Images lead to Hallucination in MLLMs |
Multimodal Large Language Models&Hallucination |
24.02 |
Institute of Automation Chinese Academy of Sciences, University of Chinese Academy of Sciences |
arxiv |
Can Large Language Models Detect Rumors on Social Media? |
Rumor Detection&Social Media |
24.02 |
CAS Key Laboratory of AI Safety, School of Computer Science and Technology, University of Chinese Academy of Sciences, International Digital Economy Academy IDEA Research |
arxiv |
A Survey on Large Language Model Hallucination via a Creativity Perspective |
Creativity&Hallucination |
24.02 |
University College London, Speechmatics, MATS, Anthropic, FAR AI |
arxiv |
Debating with More Persuasive LLMs Leads to More Truthful Answers |
Debate&Truthfulness |
24.02 |
University of Illinois Urbana-Champaign, DAMO Academy Alibaba Group, Northwestern University |
arxiv |
Towards Faithful Explainable Fact-Checking via Multi-Agent Debate |
Fact-checking&Explainability |
24.02 |
Rice University, Texas A&M University, Wake Forest University, New Jersey Institute of Technology, Meta Platforms Inc. |
arxiv |
Large Language Models As Faithful Explainers |
Explainability&Fidelity&Optimization |
24.02 |
The Hong Kong University of Science and Technology |
arxiv |
Do LLMs Know about Hallucination? An Empirical Investigation of LLM’s Hidden States |
Hallucination&Hidden States&Model Interpretation |
24.02 |
UC Santa Cruz, ByteDance Research, Northwestern University |
arxiv |
Measuring and Reducing LLM Hallucination without Gold-Standard Answers via Expertise-Weighting |
Large Language Models (LLMs)&Hallucination&Factualness Evaluations&FEWL |
24.02 |
Paul G. Allen School of Computer Science & Engineering, University of Washington |
arxiv |
Comparing Hallucination Detection Metrics for Multilingual Generation |
Hallucination Detection&Multilingual Generation&Lexical Metrics&Natural Language Inference (NLI) |
24.02 |
Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences |
arxiv |
Retrieve Only When It Needs: Adaptive Retrieval Augmentation for Hallucination Mitigation in Large Language Models |
Large Language Models (LLMs)&Hallucination Mitigation&Retrieval Augmentation&Rowen |
24.02 |
Institute of Automation, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Nanjing University |
arxiv |
Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models |
Object Hallucination&Vision-Language Models (LVLMs) |
24.02 |
Institute of Mathematics and Statistics, University of São Paulo, Artificial Intelligence Specialist in the Banking Sector |
arxiv |
Hallucinations or Attention Misdirection? The Path to Strategic Value Extraction in Business Using Large Language Models |
Hallucinations&Generative Artificial Intelligence |
24.02 |
Stevens Institute of Technology, Peraton Labs |
arxiv |
Can Large Language Models Detect Misinformation in Scientific News Reporting? |
Scientific Reporting&Misinformation&Explainability |
24.02 |
Middle East Technical University |
arxiv |
HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs |
Hallucination&Benchmarking Dataset |
24.02 |
National University of Singapore |
arxiv |
Seeing is Believing: Mitigating Hallucination in Large Vision-Language Models via CLIP-Guided Decoding |
Vision-Language Models&Hallucination&CLIP-Guided Decoding |
24.02 |
University of California Los Angeles, Cisco Research |
arxiv |
Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension |
Truthfulness&Local Intrinsic Dimension |
24.02 |
Institute of Automation Chinese Academy of Sciences, School of Artificial Intelligence University of Chinese Academy of Sciences, Hunan Normal University |
arxiv |
Whispers that Shake Foundations: Analyzing and Mitigating False Premise Hallucinations in Large Language Models |
False Premise Hallucinations&Attention Mechanism |
24.02 |
Shanghai Artificial Intelligence Laboratory, Renmin University of China, University of Chinese Academy of Sciences, Shanghai Jiao Tong University, The University of Sydney |
arxiv |
Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models |
Trustworthiness Dynamics&Pre-training |
24.02 |
AWS AI Labs&Korea Advanced Institute of Science & Technology&The University of Texas at Austin |
NAACL2024 |
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization |
Hallucination Evaluation&LLMs&Dialogue Summarization |
24.03 |
École polytechnique fédérale de Lausanne, Carnegie Mellon University, University of Maryland College Park |
arxiv |
"Flex Tape Can’t Fix That": Bias and Misinformation in Edited Language Models |
Model Editing&Demographic Bias&Misinformation |
24.03 |
East China Normal University |
arxiv |
DiaHalu: A Dialogue-level Hallucination Evaluation Benchmark for Large Language Models |
Dialogue-level Hallucination&Benchmarking&Human-machine Interaction |
24.03 |
Peking University |
arxiv |
Evaluating and Mitigating Number Hallucinations in Large Vision-Language Models: A Consistency Perspective |
Number Hallucination&Vision-Language Models&Consistency Training |
24.03 |
City University of Hong Kong, National University of Singapore, Shanghai Jiao Tong University, Stanford University, Penn State University, Hong Kong University of Science and Technology |
arxiv |
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation |
Hallucination&Inner Representation&Entropy |
24.03 |
Microsoft |
arxiv |
In Search of Truth: An Interrogation Approach to Hallucination Detection |
Hallucination Detection&Interrogation Technique&Balanced Accuracy |
24.03 |
Mohamed bin Zayed University of Artificial Intelligence |
arxiv |
Multimodal Large Language Models to Support Real-World Fact-Checking |
Multimodal Large Language Models&Fact-Checking&Misinformation |
24.03 |
KAIST, Microsoft Research Asia |
arxiv |
ERBench: An Entity-Relationship Based Automatically Verifiable Hallucination Benchmark for Large Language Models |
Hallucination&Entity-Relationship Model&Benchmarking |
24.03 |
University of Alberta, Platform and Content Group, Tencent |
arxiv |
SIFiD: Reassess Summary Factual Inconsistency Detection with LLM |
Factual Consistency&Summarization |
24.03 |
Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences |
arxiv |
Truth-Aware Context Selection: Mitigating the Hallucinations of Large Language Models Being Misled by Untruthful Contexts |
Truth Detection&Context Selection |
24.03 |
UC Berkeley, Google DeepMind |
arxiv |
Unfamiliar Finetuning Examples Control How Language Models Hallucinate |
Large Language Models&Finetuning&Hallucination Control |
24.03 |
Google Research, UC San Diego |
COLING 2024 |
Detecting Hallucination and Coverage Errors in Retrieval Augmented Generation for Controversial Topics |
Conversational Systems&Evaluation Methodologies |
24.03 |
University of Maryland, University of Antwerp, New York University |
arxiv |
Evaluating LLMs for Gender Disparities in Notable Persons |
Bias&Fairness&Hallucinations |
24.03 |
University of Duisburg-Essen |
arxiv |
The Human Factor in Detecting Errors of Large Language Models: A Systematic Literature Review and Future Research Directions |
Hallucination |
24.03 |
Wuhan University, Beihang University, The University of Sydney, Nanyang Technological University |
COLING 2024 |
Take Care of Your Prompt Bias! Investigating and Mitigating Prompt Bias in Factual Knowledge Extraction |
Factual Knowledge Extraction&Prompt Bias |
24.03 |
Carnegie Mellon University |
arxiv |
Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases |
Retrieval Augmented Generation (RAG)&Private Knowledge-Bases&Hallucinations |
24.03 |
Integrated Vision and Language Lab, KAIST, South Korea |
arxiv |
What if...?: Counterfactual Inception to Mitigate Hallucination Effects in Large Multimodal Models |
Large Multimodal Models&Hallucination |
24.03 |
UCAS |
arxiv |
MMIDR: Teaching Large Language Model to Interpret Multimodal Misinformation via Knowledge Distillation |
Multimodal Misinformation Detection&Knowledge Distillation |
24.03 |
Seoul National University, Sogang University |
arxiv |
Exploiting Semantic Reconstruction to Mitigate Hallucinations in Vision-Language Models |
Semantic Reconstruction&Vision-Language Models&Hallucination Mitigation |
24.03 |
University of Illinois Urbana-Champaign |
arxiv |
Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art |
Hallucination Detection&Foundation Models&Decision-Making |
24.03 |
Shanghai Jiao Tong University |
arxiv |
Rejection Improves Reliability: Training LLMs to Refuse Unknown Questions Using RL from Knowledge Feedback |
Knowledge Feedback&Reliable Reward Model&Refusal Mechanism |
24.03 |
Universität Hamburg, The University of Sydney |
arxiv |
Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding |
Instruction Contrastive Decoding&Large Vision-Language Models&Hallucination Mitigation |
24.03 |
AI Institute, University of South Carolina, Indian Institute of Technology Kharagpur, Islamic University of Technology, Stanford University, Amazon AI |
arxiv |
“Sorry, Come Again?” Prompting – Enhancing Comprehension and Diminishing Hallucination with [PAUSE]-injected Optimal Paraphrasing |
Prompt Engineering&Hallucination Mitigation&[PAUSE] Injection |
24.04 |
Beihang University, School of Computer Science and Engineering, School of Software, Shandong University |
arxiv |
Exploring and Evaluating Hallucinations in LLM-Powered Code Generation |
Code Generation&Hallucination |
24.03 |
University of Illinois Urbana-Champaign |
NAACL2024 |
Evidence-Driven Retrieval Augmented Response Generation for Online Misinformation |
Online Misinformation&Retrieval Augmented Response&Evidence-Based Countering |
24.03 |
Department of Electronic Engineering, Tsinghua University, Pattern Recognition Center, WeChat AI, Tencent Inc, China |
NAACL 2024 |
On Large Language Models’ Hallucination with Regard to Known Facts |
Hallucination&Inference Dynamics |
24.04 |
Technical University of Munich, University of Stavanger, University of Alberta |
arxiv |
PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics |
Hallucination Detection&State Transition Dynamics&Large Language Models |
24.04 |
University of Edinburgh, University College London, Peking University, Together AI |
arxiv |
The Hallucinations Leaderboard – An Open Effort to Measure Hallucinations in Large Language Models |
Hallucination Detection&Benchmarking |
24.04 |
IIIT Hyderabad, Purdue University, Northwestern University, Indiana University Indianapolis |
arxiv |
Halu-NLP at SemEval-2024 Task 6: MetaCheckGPT - A Multi-task Hallucination Detection Using LLM Uncertainty and Meta-models |
Hallucination Detection&LLM Uncertainty&Meta-models |
24.04 |
Technion – Israel Institute of Technology, Google Research |
arxiv |
Constructing Benchmarks and Interventions for Combating Hallucinations in LLMs |
Hallucinations&Benchmarks |
24.04 |
The University of Texas at Austin, Salesforce AI Research |
arxiv |
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents |
Fact-Checking&Efficiency |
24.04 |
Meta, Technical University of Munich |
arxiv |
Uncertainty-Based Abstention in LLMs Improves Safety and Reduces Hallucinations |
Safety&Hallucinations&Uncertainty |
24.04 |
Zhejiang University, Alibaba Group, Fudan University |
arxiv |
Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback |
Large Vision Language Model&Hallucination Detection and Mitigation&Direct Preference Optimization |
24.04 |
Cheriton School of Computer Science |
arxiv |
Rumour Evaluation with Very Large Language Models |
Misinformation in Social Networks&Explainable AI |
24.04 |
University of California, Berkeley |
NAACL 2024 |
ALOHa: A New Measure for Hallucination in Captioning Models |
Hallucination&Captioning Models&Evaluation Metric |
24.04 |
ServiceNow |
NAACL 2024 |
Reducing hallucination in structured outputs via Retrieval-Augmented Generation |
Retrieval-Augmented Generation&Structured Outputs&Generative AI |
24.04 |
Stanford University |
NAACL2024 |
NLP Systems That Can’t Tell Use from Mention Censor Counterspeech, but Teaching the Distinction Helps |
Counterspeech&Censorship&Use-Mention Distinction |
24.04 |
Department of Computing Science, University of Aberdeen |
NAACL2024 |
Improving Factual Accuracy of Neural Table-to-Text Output by Addressing Input Problems in ToTTo |
Neural Table-to-Text&Factual Accuracy&Input Problems |
24.04 |
Seoul National University |
NAACL2024(findings) |
Mitigating Hallucination in Abstractive Summarization with Domain-Conditional Mutual Information |
Hallucination&Abstractive Summarization&Domain-Conditional Mutual Information |
24.05 |
The University of Tokyo, University of California Santa Barbara, Mila - Québec AI Institute, Université de Montréal, Speech Lab, Alibaba Group, Hong Kong Baptist University |
arxiv |
CodeHalu: Code Hallucinations in LLMs Driven by Execution-based Verification |
Code Hallucination&Execution-based Verification |
24.05 |
Department of Computer Science, The University of Sheffield |
arxiv |
Addressing Topic Granularity and Hallucination in Large Language Models for Topic Modelling |
Topic Modelling&Hallucination&Topic Granularity |
24.04 |
School of Computing and Information Systems |
COLING 2024 |
Reinforcement Retrieval Leveraging Fine-grained Feedback for Fact Checking News Claims with Black-Box LLM |
Claim Verification&Reinforcement Retrieval&Fine-Grained Feedback |
24.05 |
DeepMind |
arxiv |
Mitigating LLM Hallucinations via Conformal Abstention |
Conformal Prediction&Hallucination Mitigation |
24.05 |
MBZUAI, Monash University, Sofia University |
arxiv |
OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs |
Factuality Evaluation&Automatic Fact-Checking |
24.05 |
Indian Institute of Technology Patna |
arxiv |
Unveiling Hallucination in Text, Image, Video, and Audio Foundation Models: A Comprehensive Review |
Hallucination Detection&Multimodal Models&Review |
24.05 |
Dublin City University |
arxiv |
Tell Me Why: Explainable Public Health Fact-Checking with Large Language Models |
Explainable AI&Fact-Checking&Public Health |
24.05 |
University of Information Technology, Vietnam National University |
arxiv |
ViWikiFC: Fact-Checking for Vietnamese Wikipedia-Based Textual Knowledge Source |
Fact Checking&Information Verification&Corpus |
24.05 |
Imperial College London |
arxiv |
Mitigating Hallucinations in Large Language Models via Self-Refinement-Enhanced Knowledge Retrieval |
Hallucination Mitigation&Knowledge Graph Retrieval |
24.05 |
Paul G. Allen School of Computer Science & Engineering, University of Washington |
arxiv |
MASSIVE Multilingual Abstract Meaning Representation: A Dataset and Baselines for Hallucination Detection |
Hallucination Detection&Multilingual AMR&Dataset |
24.05 |
Microsoft Corporation |
arxiv |
Unlearning Climate Misinformation in Large Language Models |
Climate Misinformation&Unlearning&Fine-Tuning |
24.05 |
Baylor University |
arxiv |
Detecting Hallucinations in Large Language Model Generation: A Token Probability Approach |
Hallucinations Detection&Token Probability Approach |
24.05 |
Shanghai AI Laboratory |
arxiv |
ANAH: Analytical Annotation of Hallucinations in Large Language Models |
Hallucinations&Analytical Annotation |
24.06 |
University of Waterloo |
arxiv |
TruthEval: A Dataset to Evaluate LLM Truthfulness and Reliability |
Truthfulness&Reliability |
24.06 |
Peking University |
arxiv |
Towards Detecting LLMs Hallucination via Markov Chain-based Multi-agent Debate Framework |
Hallucination Detection&Markov Chain&Multi-agent Debate |
24.06 |
Northeastern University |
ACL 2024 |
Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination Trends |
Dialogue Summarization&Circumstantial Hallucination&Error Detection |
24.06 |
McGill University |
ACL 2024 |
Confabulation: The Surprising Value of Large Language Model Hallucinations |
Confabulation&Hallucinations&Narrativity |
24.06 |
University of Michigan |
arxiv |
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination |
3D-LLMs&Grounding&Hallucination |
24.06 |
Arizona State University |
arxiv |
Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation |
Hallucinations&Negation |
24.06 |
Tsinghua University |
arxiv |
Benchmarking Trustworthiness of Multimodal Large Language Models: A Comprehensive Study |
Trustworthiness&MLLMs&Benchmark |
24.06 |
Beijing Academy of Artificial Intelligence |
arxiv |
HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation |
Hallucination Evaluation&Dialogue-Level&HalluDial |
24.06 |
KFUPM |
arxiv |
DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation |
Hallucination Evaluation&Definitive Answers |
24.06 |
Harbin Institute of Technology |
ACL 2024 findings |
Paying More Attention to Source Context: Mitigating Unfaithful Translations from Large Language Model |
Unfaithful Translations&Source Context |
24.06 |
National Taiwan University |
Interspeech 2024 |
Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models |
Large audio-language models&Object hallucination&Discriminative questions |
24.06 |
University of Texas at San Antonio |
arxiv |
We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs |
Package Hallucinations&Code Generating LLMs&Software Supply Chain Security |
24.06 |
The University of Manchester |
arxiv |
RAEmoLLM: Retrieval Augmented LLMs for Cross-Domain Misinformation Detection Using In-Context Learning based on Emotional Information |
RAEmoLLM&Cross-Domain Misinformation Detection&Affective Information |
24.06 |
KAIST |
arxiv |
Adversarial Style Augmentation via Large Language Model for Robust Fake News Detection |
Adversarial Style Augmentation&Fake News Detection |
24.06 |
The Chinese University of Hong Kong |
arxiv |
Mitigating Large Language Model Hallucination with Faithful Finetuning |
Hallucination&Faithful Finetuning |
24.06 |
Gaoling School of Artificial Intelligence, Renmin University of China |
arxiv |
Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector |
Hallucination Detection&Small Language Models&HaluAgent |
24.06 |
University of Science and Technology of China |
arxiv |
CrAM: Credibility-Aware Attention Modification in LLMs for Combating Misinformation in RAG |
CrAM&Credibility-Aware Attention&Retrieval-Augmented Generation |
24.06 |
University of Rochester |
arxiv |
Do More Details Always Introduce More Hallucinations in LVLM-based Image Captioning? |
LVLMs&Image Captioning&Object Hallucination |
24.06 |
Xi'an Jiaotong University |
arxiv |
AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention |
AGLA&Object Hallucinations&Large Vision-Language Models |
24.06 |
University of Groningen, University of Amsterdam |
arxiv |
Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation |
Retrieval-Augmented Generation&Trustworthy AI |
24.06 |
Seoul National University |
arxiv |
Large Language Models are Skeptics: False Negative Problem of Input-conflicting Hallucination |
False Negative Problem&Input-conflicting Hallucination&Bias |
24.06 |
University of Houston |
arxiv |
Seeing Through AI’s Lens: Enhancing Human Skepticism Towards LLM-Generated Fake News |
Fake news&LLM-generated news |
24.06 |
University of Oxford |
arxiv |
Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs |
Hallucination Detection&Semantic Entropy&Probes |
24.06 |
UC San Diego |
arxiv |
Mitigating Hallucination in Fictional Character Role-Play |
Hallucination Mitigation&Role-Play&Fictional Characters |
24.06 |
Lamini |
arxiv |
Banishing LLM Hallucinations Requires Rethinking Generalization |
Hallucinations&Generalization&Memory Experts |
24.06 |
Waseda University |
arxiv |
ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models |
Tool-Augmented Large Language Models&Hallucination Diagnostic Benchmark&Tool Usage |
24.07 |
Beihang University |
arxiv |
PFME: A Modular Approach for Fine-grained Hallucination Detection and Editing of Large Language Models |
Hallucination Detection&Model Editing |
24.07 |
Tsinghua University |
arxiv |
Fake News Detection and Manipulation Reasoning via Large Vision-Language Models |
Large Vision-Language Models&Fake News Detection&Manipulation Reasoning |
24.07 |
Brno University of Technology |
arxiv |
Generative Large Language Models in Automated Fact-Checking: A Survey |
Automated Fact-Checking&Survey |
24.07 |
SRI International |
arxiv |
Pelican: Correcting Hallucination in Vision-LLMs via Claim Decomposition and Program of Thought Verification |
Vision-LLMs&Hallucination Detection&Claim Verification |
24.07 |
Hong Kong University of Science and Technology |
arxiv |
LLM Internal States Reveal Hallucination Risk Faced With a Query |
Hallucination Detection&Uncertainty Estimation |
24.07 |
Harbin Institute of Technology |
ACL 2024 |
Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models |
Multimodal Hallucinations&LVLMs&Residual Visual Decoding |
24.07 |
University of Amsterdam |
arxiv |
Leveraging Graph Structures to Detect Hallucinations in Large Language Models |
Hallucination Detection&Graph Attention Network&Large Language Models |
24.07 |
Cisco Research |
arxiv |
Code Hallucination |
Code Hallucination&Generative Models&HallTrigger |
24.07 |
Beijing Jiaotong University |
arxiv |
KG-FPQ: Evaluating Factuality Hallucination in LLMs with Knowledge Graph-based False Premise Questions |
Factuality Hallucination&Knowledge Graph&False Premise Questions |
24.07 |
University of California, Santa Barbara |
arxiv |
DebUnc: Mitigating Hallucinations in Large Language Model Agent Communication with Uncertainty Estimations |
Hallucinations&Uncertainty Estimations&Multi-agent Systems |
24.07 |
Massachusetts Institute of Technology |
arxiv |
Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps |
Contextual Hallucinations&Attention Maps |
24.07 |
University of Illinois Urbana-Champaign |
arxiv |
Knowledge Overshadowing Causes Amalgamated Hallucination in Large Language Models |
Knowledge Overshadowing&Hallucination |
24.07 |
Patronus AI |
arxiv |
Lynx: An Open Source Hallucination Evaluation Model |
Hallucination Detection&RAG&Evaluation Model |
24.07 |
Shanghai Jiao Tong University |
arxiv |
On the Universal Truthfulness Hyperplane Inside LLMs |
Truthfulness Hyperplane&Hallucination |
24.07 |
University of Michigan |
ACL 2024 ALVR |
Multi-Object Hallucination in Vision-Language Models |
Multi-Object Hallucination&Vision-Language Models&Evaluation Protocol |
24.07 |
ASAPP, Inc. |
ACL 2024 Findings |
Enhancing Hallucination Detection through Perturbation-Based Synthetic Data Generation in System Responses |
Hallucination Detection&Synthetic Data&System Responses |
24.07 |
FAR AI |
COLM 2024 |
Transformer Circuit Faithfulness Metrics Are Not Robust |
Transformer Circuits&Ablation Studies&Faithfulness Metrics |
24.07 |
University of Science and Technology of China |
arxiv |
Detect, Investigate, Judge and Determine: A Novel LLM-based Framework for Few-shot Fake News Detection |
Fake News Detection |
24.07 |
Tsinghua University |
arxiv |
Mitigating Entity-Level Hallucination in Large Language Models |
Hallucination&Retrieval Augmented Generation |
24.07 |
Amazon Web Services |
arxiv |
On Mitigating Code LLM Hallucinations with API Documentation |
API Hallucinations&Code LLMs&Documentation Augmented Generation |
24.07 |
Technical University of Darmstadt |
arxiv |
Fine-grained Hallucination Detection and Mitigation in Long-form Question Answering |
Hallucination Detection&Error Annotation&Factuality |
24.07 |
Heidelberg University |
arxiv |
Truth is Universal: Robust Detection of Lies in LLMs |
Lie Detection&Activation Vectors&Truth Direction |
24.07 |
Shanghai Jiao Tong University |
arxiv |
HALU-J: Critique-Based Hallucination Judge |
Hallucination Detection&Critique-Based Evaluation&Evidence Categorization |
24.07 |
TH Köln – University of Applied Sciences |
CLEF 2024 |
The Two Sides of the Coin: Hallucination Generation and Detection with LLMs as Evaluators for LLMs |
Hallucination Generation&Hallucination Detection&Multilingual Models |
24.07 |
POSTECH |
ECCV 2024 |
BEAF: Observing BEfore-AFter Changes to Evaluate Hallucination in Vision-language Models |
Hallucination&Vision-Language Models |
24.07 |
University College London |
arxiv |
Machine Translation Hallucination Detection for Low and High Resource Languages using Large Language Models |
Machine Translation&Hallucination Detection |
24.07 |
Cornell University |
arxiv |
WildHallucinations: Evaluating Long-form Factuality in LLMs with Real-World Entity Queries |
WildHallucinations&Factuality Evaluation&Real-World Entities |
24.07 |
Columbia University |
ECCV 2024 |
HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning |
Hallucination&Vision-Language Models&Datasets |
24.07 |
IBM Research |
ICML 2024 Workshop |
Generation Constraint Scaling Can Mitigate Hallucination |
Hallucination&Memory-Augmented Models |
24.07 |
Harvard-MIT |
arxiv |
The Need for Guardrails with Large Language Models in Medical Safety-Critical Settings: An Artificial Intelligence Application in the Pharmacovigilance Ecosystem |
Pharmacovigilance&Drug Safety&Guardrails |
24.07 |
Illinois Institute of Technology |
arxiv |
Can Editing LLMs Inject Harm? |
Knowledge Editing&Misinformation Injection&Bias Injection |
24.07 |
Stanford University |
arxiv |
Dancing in Chains: Reconciling Instruction Following and Faithfulness in Language Models |
Instruction Following&Faithfulness&Multi-task Learning |
24.07 |
Jilin University |
ACM MM 2024 |
Harmfully Manipulated Images Matter in Multimodal Misinformation Detection |
Social media&Misinformation detection |
24.07 |
Zhejiang University |
COLING 2024 |
Improving Faithfulness of Large Language Models in Summarization via Sliding Generation and Self-Consistency |
Summarization&Faithfulness |
24.08 |
Huazhong University of Science and Technology |
arxiv |
Mitigating Multilingual Hallucination in Large Vision-Language Models |
Large Vision-Language Models&Multilingual Hallucination&Supervised Fine-tuning |
24.08 |
Huazhong University of Science and Technology |
arxiv |
Alleviating Hallucination in Large Vision-Language Models with Active Retrieval Augmentation |
Hallucination&Vision-Language Models (VLMs)&Active Retrieval Augmentation |
24.08 |
DFKI |
UbiComp Companion '24 |
Misinforming LLMs: Vulnerabilities, Challenges and Opportunities |
Misinformation&Trustworthy AI |
24.08 |
Bar Ilan University |
arxiv |
Mitigating Hallucinations in Large Vision-Language Models (LVLMs) via Language-Contrastive Decoding (LCD) |
Large Vision-Language Models&Object Hallucinations&Language-Contrastive Decoding |
24.08 |
University of Liverpool |
arxiv |
Order Matters in Hallucination: Reasoning Order as Benchmark and Reflexive Prompting for Large-Language-Models |
Hallucination&Reasoning Order&Reflexive Prompting |
24.08 |
The Alan Turing Institute |
arxiv |
Large Language Models Can Consistently Generate High-Quality Content for Election Disinformation Operations |
Election Disinformation&DisElect Dataset |
24.08 |
Google DeepMind |
COLM 2024 |
Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability |
Knowledge Graph&Hallucinations |
24.08 |
The Hong Kong Polytechnic University |
arxiv |
MegaFake: A Theory-Driven Dataset of Fake News Generated by Large Language Models |
Fake News&MegaFake Dataset |
24.08 |
IIT Kharagpur |
arxiv |
Evidence-backed Fact Checking using RAG and Few-Shot In-Context Learning with LLMs |
Fact Checking&RAG&In-Context Learning |
24.08 |
Fudan University |
arxiv |
Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators |
Factuality Improvement&Hallucination Mitigation&Decoding-Time Intervention |
24.08 |
The University of Tokyo |
arxiv |
Interactive DualChecker for Mitigating Hallucinations in Distilling Large Language Models |
Hallucination Mitigation&Knowledge Distillation&Large Language Models |
24.08 |
University of Surrey |
IJCAI 2024 |
CodeMirage: Hallucinations in Code Generated by Large Language Models |
Code Hallucinations&CodeMirage Dataset |
24.08 |
Sichuan Normal University |
arxiv |
Can LLM Be a Good Path Planner Based on Prompt Engineering? Mitigating the Hallucination for Path Planning |
Path Planning&Spatial Reasoning&Hallucination Mitigation |
24.08 |
Alibaba Cloud |
arxiv |
LRP4RAG: Detecting Hallucinations in Retrieval-Augmented Generation via Layer-wise Relevance Propagation |
Hallucination Detection&RAG&Layer-wise Relevance Propagation |
24.08 |
Royal Holloway, University of London |
arxiv |
Logic-Enhanced Language Model Agents for Trustworthy Social Simulations |
Social Simulations&Trustworthy AI&Game Theory |
24.09 |
Inria, University of Rennes |
arxiv |
LLMs hallucinate graphs too: a structural perspective |
Large Language Models&Hallucination&Graph Analysis |
24.09 |
Scale AI |
arxiv |
Pre-Training Multimodal Hallucination Detectors with Corrupted Grounding Data |
Multimodal Hallucination&Grounding Data&Sample Efficiency |
24.09 |
Fudan University |
arxiv |
LLM-GAN: Construct Generative Adversarial Network Through Large Language Models For Explainable Fake News Detection |
Explainable Fake News Detection&Generative Adversarial Network |
24.09 |
University of Oslo |
arxiv |
Hallucination Detection in LLMs: Fast and Memory-Efficient Fine-tuned Models |
Hallucination Detection&Memory Efficiency&Ensemble Models |
24.09 |
Univ. Polytechnique Hauts-de-France |
arxiv |
FIDAVL: Fake Image Detection and Attribution using Vision-Language Model |
Fake Image Detection&Vision-Language Model&Synthetic Image Attribution |
24.09 |
EPFL |
arxiv |
LLM Detectors Still Fall Short of Real World: Case of LLM-Generated Short News-Like Posts |
LLM Detectors&Disinformation&Adversarial Evasion |
24.09 |
Geely Automobile Research Institute, Beihang University |
arxiv |
Alleviating Hallucinations in Large Language Models with Scepticism Modeling |
Hallucinations&Scepticism Modeling |
24.09 |
AppCubic, Georgia Institute of Technology |
arxiv |
Securing Large Language Models: Addressing Bias, Misinformation, and Prompt Attacks |
Misinformation&Jailbreak Attacks&Prompt Injection |
24.09 |
Carnegie Mellon University |
arxiv |
AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents |
Utility&Truthfulness&LLM Agents |
24.09 |
Salesforce AI Research |
arxiv |
SFR-RAG: Towards Contextually Faithful LLMs |
Retrieval Augmented Generation&Contextual Comprehension&Hallucination Minimization |
24.09 |
University of North Texas |
arxiv |
HALO: Hallucination Analysis and Learning Optimization to Empower LLMs with Retrieval-Augmented Context for Guided Clinical Decision Making |
Hallucination Mitigation&Retrieval Augmented Generation&Medical Question Answering |
24.09 |
Tsinghua University |
arxiv |
Trustworthiness in Retrieval-Augmented Generation Systems: A Survey |
Trustworthiness&RAG |
24.09 |
National University of Defense Technology |
arxiv |
Zero-resource Hallucination Detection for Text Generation via Graph-based Contextual Knowledge Triples Modeling |
Zero-resource Hallucination Detection&Text Generation&Graph-based Knowledge Triples |
24.09 |
The University of Manchester |
arxiv |
FMDLlama: Financial Misinformation Detection based on Large Language Models |
Financial Misinformation Detection&Instruction Tuning&FMDLlama |
24.09 |
University of Montreal |
arxiv |
From Deception to Detection: The Dual Roles of Large Language Models in Fake News |
Fake News&Fake News Detection&Bias Mitigation |
24.09 |
Korea University |
EMNLP 2024 Findings |
Pre-trained Language Models Return Distinguishable Probability Distributions to Unfaithfully Hallucinated Texts |
Hallucination Detection&Unfaithful Texts&Uncertainty Distribution |
24.10 |
The University of Texas at Dallas |
TMLR |
A Unified Hallucination Mitigation Framework for Large Vision-Language Models |
Hallucination Mitigation&Vision-Language Models&Reasoning Queries |
24.09 |
The Chinese University of Hong Kong |
arxiv |
A Survey on the Honesty of Large Language Models |
LLM Honesty&Self-knowledge&Self-expression |
24.09 |
University of Surrey |
arxiv |
MEDHALU: Hallucinations in Responses to Healthcare Queries by Large Language Models |
LLM Hallucinations&Healthcare Queries&Hallucination Detection |
24.09 |
Harvard Medical School |
arxiv |
Wait, but Tylenol is Acetaminophen… Investigating and Improving Language Models' Ability to Resist Requests for Misinformation |
LLM Misinformation Resistance&Healthcare&Instruction Tuning |
24.09 |
Nanjing University of Aeronautics and Astronautics |
arxiv |
HELPD: Mitigating Hallucination of LVLMs by Hierarchical Feedback Learning with Vision-enhanced Penalty Decoding |
Hallucination Mitigation&LVLMs&Feedback Learning |
24.09 |
Sun Yat-sen University |
arxiv |
LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation |
LLM Hallucinations&Code Generation&Mitigation Strategies |
24.10 |
Technion |
arxiv |
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations |
LLM Hallucinations&Error Detection&Truthfulness Encoding |
24.10 |
Meta |
arxiv |
Ingest-And-Ground: Dispelling Hallucinations from Continually-Pretrained LLMs with RAG |
RAG&Hallucination |
24.10 |
IBM Research |
arxiv |
ST-WebAgentBench: A Benchmark for Evaluating Safety and Trustworthiness in Web Agents |
Web Agents&Safety&Trustworthiness |
24.10 |
National University of Singapore |
arxiv |
Can Knowledge Graphs Make Large Language Models More Trustworthy? An Empirical Study Over Open-ended Question Answering |
Knowledge Graphs&Trustworthiness |
24.10 |
Tongji University |
EMNLP 2024 |
DAMRO: Dive into the Attention Mechanism of LVLM to Reduce Object Hallucination |
LVLM&Object Hallucination&Attention Mechanism |
24.10 |
The University of Sydney, The University of Hong Kong |
arxiv |
NOVO: Norm Voting Off Hallucinations with Attention Heads in Large Language Models |
Hallucination mitigation&Attention heads&Norm voting |
24.10 |
Purdue University |
arxiv |
Collu-Bench: A Benchmark for Predicting Language Model Hallucinations in Code |
Code hallucinations&Code generation&Automated program repair |
24.10 |
National University of Sciences and Technology, Rawalpindi Medical University, King Faisal University, Sejong University |
arxiv |
Mitigating Hallucinations Using Ensemble of Knowledge Graph and Vector Store in Large Language Models to Enhance Mental Health Support |
Hallucination mitigation&Knowledge graphs&Mental health support |
24.10 |
Renmin University of China, Kuaishou Technology Co., Ltd., University of International Business and Economics |
ICLR 2025 |
ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability |
Retrieval-Augmented Generation (RAG)&Hallucination detection&Mechanistic interpretability |
24.10 |
Zhejiang University, National University of Singapore |
arxiv |
MLLM Can See? Dynamic Correction Decoding for Hallucination Mitigation |
Hallucination mitigation&Multimodal LLMs&Dynamic correction decoding |
24.10 |
Vectara, Inc., Iowa State University, University of Southern California, Entropy Technologies, University of Waterloo, Funix.io, University of Wisconsin, Madison |
arxiv |
FaithBench: A Diverse Hallucination Benchmark for Summarization by Modern LLMs |
Hallucination detection&Human-annotated benchmark&Faithfulness |
24.10 |
Harbin Institute of Technology (Shenzhen), Huawei Cloud |
arxiv |
MEDICO: Towards Hallucination Detection and Correction with Multi-source Evidence Fusion |
Hallucination detection&Multi-source evidence fusion&Hallucination correction |
24.10 |
Independent Researchers |
KDD 2024 RAG Workshop |
Honest AI: Fine-Tuning "Small" Language Models to Say "I Don’t Know", and Reducing Hallucination in RAG |
Hallucination reduction&Small LLMs&False premise |
24.10 |
University of California Irvine |
arxiv |
From Single to Multi: How LLMs Hallucinate in Multi-Document Summarization |
Multi-Document Summarization&LLM Hallucination&Benchmarking |
24.10 |
Harvard University |
arxiv |
Good Parenting is All You Need: Multi-agentic LLM Hallucination Mitigation |
LLM Hallucination&Multi-agent Systems&Self-reflection |
24.10 |
University of Science and Technology of China |
arxiv |
Coarse-to-Fine Highlighting: Reducing Knowledge Hallucination in Large Language Models |
Knowledge Hallucination&Retrieval-Augmented Language Models&Highlighting Techniques |
24.10 |
McGill University |
arxiv |
Hallucination Detox: Sensitive Neuron Dropout (SEND) for Large Language Model Training |
Hallucination Mitigation&Sensitive Neurons&Training Protocols |
24.10 |
National Taiwan University |
arxiv |
Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning |
Audio-Language Models&Hallucination Analysis&Multi-Task Evaluation |
24.10 |
Mila - Quebec AI Institute |
arxiv |
Multilingual Hallucination Gaps in Large Language Models |
Multilingual Hallucination&FACTSCORE&Low-Resource Languages |
24.10 |
University of Edinburgh |
arxiv |
DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations |
Hallucination Mitigation&Contrastive Decoding&Retrieval Heads |
24.10 |
University of Science and Technology of China |
EMNLP 2024 Findings |
Mitigating Hallucinations of Large Language Models in Medical Information Extraction via Contrastive Decoding |
Hallucination Mitigation&Medical Information Extraction&Contrastive Decoding |
24.10 |
University of Science and Technology of China |
ICML 2024 |
Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning |
Trustworthy Alignment&Reinforcement Learning&Retrieval-Augmented Generation |
24.10 |
Intel Labs |
NeurIPS 2024 Workshop on SafeGenAI |
Debiasing Large Vision-Language Models by Ablating Protected Attribute Representations |
Debiasing&Vision-Language Models&Attribute Ablation |
24.10 |
The Pennsylvania State University |
arxiv |
The Reopening of Pandora’s Box: Analyzing the Role of LLMs in the Evolving Battle Against AI-Generated Fake News |
Fake News Detection&Human-AI Collaboration |
24.10 |
Stellenbosch University |
arxiv |
Investigating the Role of Prompting and External Tools in Hallucination Rates of Large Language Models |
Hallucination Mitigation&Prompt Engineering&External Tools |
24.10 |
Algoverse AI Research |
arxiv |
A Debate-Driven Experiment on LLM Hallucinations and Accuracy |
LLM Hallucinations&Accuracy Improvement&Model Interaction |
24.10 |
Narrative BI |
arxiv |
Beyond Fine-Tuning: Effective Strategies for Mitigating Hallucinations in Large Language Models for Data Analytics |
Hallucination Mitigation&Data Analytics&Prompt Engineering |
24.10 |
HKUST (GZ) |
arxiv |
Maintaining Informative Coherence: Migrating Hallucinations in Large Language Models via Absorbing Markov Chains |
Hallucination Mitigation&Markov Chains |
24.10 |
National Taiwan University |
arxiv |
LLMs are Biased Evaluators But Not Biased for Retrieval Augmented Generation |
Bias Analysis&LLM Evaluation&Retrieval-Augmented Generation |
24.10 |
University Hospital Leipzig |
arxiv |
LLM Robustness Against Misinformation in Biomedical Question Answering |
Biomedical Question Answering&Robustness&Misinformation |
24.10 |
Technion – Israel Institute of Technology |
arxiv |
Distinguishing Ignorance from Error in LLM Hallucinations |
LLM Hallucinations&Error Classification&Knowledge Detection |
24.10 |
The Hong Kong University of Science and Technology |
arxiv |
Unified Triplet-Level Hallucination Evaluation for Large Vision-Language Models |
Vision-Language Models&Hallucination Evaluation&Relation Analysis |
24.10 |
University of Notre Dame, MBZUAI, IBM Research, UW, Peking University |
arxiv |
Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge |
Large Language Models&Bias |
24.11 |
New York University |
arxiv |
Exploring the Knowledge Mismatch Hypothesis: Hallucination Propensity in Small Models Fine-tuned on Data from Larger Models |
Hallucination&Knowledge Mismatch&Fine-tuning |
24.11 |
Nankai University |
arxiv |
Prompt-Guided Internal States for Hallucination Detection of Large Language Models |
Hallucination Detection&Prompt-Guided Internal States&Cross-Domain Generalization |
24.11 |
Georgia Institute of Technology |
arXiv |
LLM Hallucination Reasoning with Zero-shot Knowledge Test |
Hallucination Detection&Zero-shot Methods&Model Knowledge Test |
24.11 |
Shanghai Jiao Tong University |
arxiv |
Seeing Clearly by Layer Two: Enhancing Attention Heads to Alleviate Hallucination in LVLMs |
Multimodal Large Language Models&Hallucination&Attention Mechanism |
24.11 |
Renmin University of China |
arxiv |
Mitigating Hallucination in Multimodal Large Language Models via Hallucination-targeted Direct Preference Optimization |
Multimodal Large Language Models&Hallucination Mitigation&Direct Preference Optimization |
24.11 |
AIRI |
arxiv |
Addressing Hallucinations in Language Models with Knowledge Graph Embeddings as an Additional Modality |
Hallucination Mitigation&Knowledge Graphs |
24.11 |
University of Pennsylvania |
arxiv |
Thinking Before Looking: Improving Multimodal LLM Reasoning via Mitigating Visual Hallucination |
Multimodal Large Language Models&Visual Hallucination&Reasoning Accuracy |
24.11 |
Tsinghua University |
arxiv |
CATCH: Complementary Adaptive Token-level Contrastive Decoding to Mitigate Hallucinations in LVLMs |
Large Vision-Language Models&Hallucination Mitigation&Contrastive Decoding |
24.11 |
Stony Brook University |
arxiv |
A Novel Approach to Eliminating Hallucinations in Large Language Model-Assisted Causal Discovery |
Hallucination&Causal Discovery |
24.11 |
ETH Zürich |
arxiv |
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models |
Hallucinations&Knowledge Awareness&Sparse Autoencoders |
24.11 |
Aalborg University |
arxiv |
Knowledge Graphs, Large Language Models, and Hallucinations: An NLP Perspective |
Knowledge Graphs&Hallucinations |
24.11 |
China Telecom Shanghai Company, Ferret Relationship Intelligence |
arxiv |
Enhancing Multi-Agent Consensus through Third-Party LLM Integration: Analyzing Uncertainty and Mitigating Hallucinations in Large Language Models |
Multi-Agent Systems&Hallucination Mitigation&Uncertainty Analysis |