NAACL 2021
IIITN NLP at SMM4H 2021 Tasks: Transformer Models for Classification on Health-Related Imbalanced Twitter Datasets
Abstract
As more users share health-related information on social media, the use of social media for health monitoring and surveillance has grown. In this paper, we present a system that addresses the classic health-related binary classification problems posed in Tasks 1a, 4, and 8 of the 6th edition of the Social Media Mining for Health Applications (SMM4H) shared tasks. The system is based on RoBERTa (for Tasks 1a and 4) and BioBERT (for Task 8). We also address the challenge of dataset imbalance and propose techniques such as undersampling, oversampling, and data augmentation to counter the imbalanced nature of a given health-related dataset.
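The abstract names undersampling and oversampling without describing how they were carried out. The sketch below illustrates the general idea of random resampling on a binary-labelled set of texts. It is not the authors' implementation: the function name `random_resample`, its parameters, and the toy data are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def random_resample(texts, labels, mode="over", seed=0):
    """Balance a labelled dataset by random over- or undersampling.

    Hypothetical helper for illustration only (not from the paper).
    mode="over": duplicate random minority examples up to the majority size.
    mode="under": keep a random subset of each class at the minority size.
    """
    rng = random.Random(seed)

    # Group examples by class label.
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    sizes = [len(items) for items in by_class.values()]
    target = max(sizes) if mode == "over" else min(sizes)

    resampled = []
    for label, items in by_class.items():
        if len(items) >= target:
            # Draw without replacement (keeps all items when sizes match).
            chosen = rng.sample(items, target)
        else:
            # Keep every original and add random duplicates.
            chosen = items + rng.choices(items, k=target - len(items))
        resampled.extend((text, label) for text in chosen)

    rng.shuffle(resampled)
    new_texts = [text for text, _ in resampled]
    new_labels = [label for _, label in resampled]
    return new_texts, new_labels


# Toy data: 2 positive and 8 negative examples.
texts = [f"tweet {i}" for i in range(10)]
labels = [1] * 2 + [0] * 8

over_texts, over_labels = random_resample(texts, labels, mode="over")
under_texts, under_labels = random_resample(texts, labels, mode="under")
print(Counter(over_labels), Counter(under_labels))
```

Resampling of this kind is normally applied to the training split only, so that validation and test data keep the original class distribution.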
Topics
Deep Learning > Architectures > Transformers
Natural Language Processing > Applications > Text Classification
Healthcare & Medicine > Clinical > Clinical NLP
Data Science & Analytics > Applications > Disease Surveillance
Machine Learning > Learning Types > Classification
Deep Learning > Models > Transformers