Blow the Dog Whistle: A Chinese Dataset for Cant Understanding with Common Sense and World Knowledge

Canwen Xu; Wangchunshu Zhou; Tao Ge; Ke Xu; Julian McAuley; Furu Wei

NAACL 2021

Abstract

Cant is important for understanding advertising, comedies and dog-whistle politics. However, computational research on cant is hindered by a lack of available datasets. In this paper, we propose a large and diverse Chinese dataset for creating and understanding cant from a computational linguistics perspective. We formulate a task for cant understanding and provide both quantitative and qualitative analysis for tested word embedding similarity and pretrained language models. Experiments suggest that such a task requires deep language understanding, common sense, and world knowledge and thus can be a good testbed for pretrained language models and help models perform better on other tasks.

Topics

Natural Language Processing > Understanding > Semantic Analysis
Natural Language Processing > Resources & Methods > Text Representation

Keywords

semantic analysis, natural language understanding, pretrained language model, common sense reasoning, world knowledge
