SciBERT: A Pretrained Language Model for Scientific Text

Iz Beltagy; Kyle Lo; Arman Cohan

EMNLP 2019

Abstract

Obtaining large-scale annotated data for NLP tasks in the scientific domain is challenging and expensive. We release SciBERT, a pretrained language model based on BERT (Devlin et al., 2018), to address the lack of high-quality, large-scale labeled scientific data. SciBERT leverages unsupervised pretraining on a large multi-domain corpus of scientific publications to improve performance on downstream scientific NLP tasks. We evaluate on a suite of tasks including sequence tagging, sentence classification, and dependency parsing, with datasets from a variety of scientific domains. We demonstrate statistically significant improvements over BERT and achieve new state-of-the-art results on several of these tasks. The code and pretrained models are available at https://github.com/allenai/scibert/.

Topics

Machine Learning > Learning Types > Self-Supervised Learning
Deep Learning > Techniques > Pretraining
Natural Language Processing > Applications > Text Classification
Natural Language Processing > Resources & Methods > Large Language Models
Deep Learning > Models > Transformers
Deep Learning > Models > Language Models
Natural Language Processing > Resources & Methods > Pretraining

Keywords

sequence labeling; text classification; unsupervised pretraining; dependency parsing; sequence tagging; pretrained language model; sentence classification; scientific text
