Do We Need to Create Big Datasets to Learn a Task?

Swaroop Mishra; Bhavdeep Singh Sachdeva

2020 EMNLP EMNLP 2020

Do We Need to Create Big Datasets to Learn a Task?

Abstract

AbstractDeep Learning research has been largely accelerated by the development of huge datasets such as Imagenet. The general trend has been to create big datasets to make a deep neural network learn. A huge amount of resources is being spent in creating these big datasets, developing models, training them, and iterating this process to dominate leaderboards. We argue that the trend of creating bigger datasets needs to be revised by better leveraging the power of pre-trained language models. Since the language models have already been pre-trained with huge amount of data and have basic linguistic knowledge, there is no need to create big datasets to learn a task. Instead, we need to create a dataset that is sufficient for the model to learn various task-specific terminologies, such as ‘Entailment’, ‘Neutral’, and ‘Contradiction’ for NLI. As evidence, we show that RoBERTA is able to achieve near-equal performance on 2% data of SNLI. We also observe competitive zero-shot generalization on several OOD datasets. In this paper, we propose a baseline algorithm to find the optimal dataset for learning a task.

❓ The Questioner

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning

🧭 Keyword Pioneer — dataset efficiency

🐣 Hot Topic Early Bird — zero-shot generalization

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Swaroop Mishra , Bhavdeep Singh Sachdeva

Topics

Artificial Intelligence > Learning Paradigms > Transfer Learning Machine Learning > Core Methods > Representation Learning Machine Learning > Learning Types > Zero-Shot Learning Machine Learning > Learning Paradigms > Few-Shot Learning Artificial Intelligence > Core AI > Language Deep Learning > Learning Types > Representation Learning Deep Learning > Learning Types > Transfer Learning

Keywords

sample efficiency few-shot learning transfer learning natural language inference zero-shot generalization data efficiency pre-trained language model task adaptation pretrained language model dataset efficiency

Download PDF

Related papers

Fast semantic parsing with well-typedness guarantees 2020

Detecting Objectifying Language in Online Professor Reviews 2020

Analogous Process Structure Induction for Sub-event Sequence Prediction 2020

Aspect Sentiment Classification with Aspect-Specific Opinion Spans 2020

Robust and Interpretable Grounding of Spatial References with Relation Networks 2020