2020 EMNLP EMNLP 2020

iobes: Library for Span Level Processing

Abstract

AbstractMany tasks in natural language processing, such as named entity recognition and slot-filling, involve identifying and labeling specific spans of text. In order to leverage common models, these tasks are often recast as sequence labeling tasks. Each token is given a label and these labels are prefixed with special tokens such as B- or I-. After a model assigns labels to each token, these prefixes are used to group the tokens into spans. Properly parsing these annotations is critical for producing fair and comparable metrics; however, despite its importance, there is not an easy-to-use, standardized, programmatically integratable library to help work with span labeling. To remedy this, we introduce our open-source library, iobes. iobes is used for parsing, converting, and processing spans represented as token level decisions.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Natural Language Processing
🧭 Keyword Pioneer — span labeling
🐣 Hot Topic Early Bird — token classification
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Security & Privacy, Speech & Audio

Authors