Merging Generated and Retrieved Knowledge for Open-Domain QA

Yunxiang Zhang; Muhammad Khalifa; Lajanugen Logeswaran; Moontae Lee; Honglak Lee; Lu Wang

2023 EMNLP EMNLP 2023

Merging Generated and Retrieved Knowledge for Open-Domain QA

Abstract

AbstractOpen-domain question answering (QA) systems are often built with retrieval modules. However, retrieving passages from a given source is known to suffer from insufficient knowledge coverage. Alternatively, prompting large language models (LLMs) to generate contextual passages based on their parametric knowledge has been shown to improve QA performance. Yet, LLMs tend to “hallucinate” content that conflicts with the retrieved knowledge. Based on the intuition that answers supported by both sources are more likely to be correct, we propose COMBO, a Compatibility-Oriented knowledge Merging for Better Open-domain QA framework, to effectively leverage the two sources of information. Concretely, we match LLM-generated passages with retrieved counterparts into compatible pairs, based on discriminators trained with silver compatibility labels. Then a Fusion-in-Decoder-based reader model handles passage pairs to arrive at the final answer. Experiments show that COMBO outperforms competitive baselines on three out of four tested open-domain QA benchmarks. Further analysis reveals that our proposed framework demonstrates greater efficacy in scenarios with a higher degree of knowledge conflicts.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Natural Language Processing

🧭 Keyword Pioneer — passage fusion

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Yunxiang Zhang , Muhammad Khalifa , Lajanugen Logeswaran , Moontae Lee , Honglak Lee , Lu Wang

Topics

Artificial Intelligence > Core AI > Multimodal Learning Machine Learning > Application Areas > Knowledge Distillation Natural Language Processing > Applications > Question Answering Machine Learning > Learning Types > Retrieval-Augmented Generation

Keywords

question answering retrieval-augmented generation knowledge retrieval open-domain question answering passage fusion large language model knowledge fusion

Download PDF

Related papers

Exploring Linguistic Probes for Morphological Generalization 2023

NameGuess: Column Name Expansion for Tabular Data 2023

Vision-Enhanced Semantic Entity Recognition in Document Images via Visually-Asymmetric Consistency Learning 2023

Improving Conversational Recommendation Systems via Bias Analysis and Language-Model-Enhanced Data Augmentation 2023

On the Calibration of Large Language Models and Alignment 2023