TWEETQA: A Social Media Focused Question Answering Dataset

Wenhan Xiong; Jiawei Wu; Hong Wang; Vivek Kulkarni; Mo Yu; Shiyu Chang; Xiaoxiao Guo; William Yang Wang

ACL 2019

Abstract

As social media becomes an increasingly popular venue for reporting news and real-time events, developing automated question answering systems is critical to the effectiveness of many applications that rely on real-time knowledge. While previous datasets have concentrated on question answering (QA) for formal text like news and Wikipedia, we present the first large-scale dataset for QA over social media data. To ensure that the tweets we collected are useful, we only gather tweets used by journalists to write news articles. We then ask human annotators to write questions and answers based on these tweets. Unlike other QA datasets like SQuAD, in which the answers are extractive, we allow the answers to be abstractive. We show that two recently proposed neural models that perform well on formal texts are limited in their performance when applied to our dataset. In addition, even the fine-tuned BERT model still lags behind human performance by a large margin. Our results thus point to the need for improved QA systems targeting social media text.

Topics

Natural Language Processing > Applications > Question Answering
Natural Language Processing > Resources & Methods > Text Representation
Machine Learning > Learning Types > Deep Learning
Deep Learning > Techniques > Transfer Learning
Machine Learning > Application Areas > Information Retrieval

Keywords

question answering, BERT model, social media, neural model, social media text, neural network, fine-tuned BERT, abstractive answer, abstractive answering
