Detect Profane Language in Streaming Services to Protect Young Audiences

Jingxiang Chen; Kai Wei; Xiang Hao

ACL 2021

Abstract

With the rapid growth of online video streaming, recent years have seen increasing concern about profane language in streamed content. Detecting profane language in streaming services is challenging because of the long sentences that appear in a video. While recent research on handling long sentences has focused on deep learning modeling techniques, little work has focused on improving data pipelines. In this work, we develop a data collection pipeline to address long sequences of text and integrate this pipeline with a multi-head self-attention model. With this pipeline, our experiments show that the self-attention model offers a 12.5% relative accuracy improvement over the state-of-the-art DistilBERT model on profane language detection while requiring only 3% of the parameters. This research designs a better system for informing users of profane language in video streaming services.
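The two headline figures in the abstract are ratios, so a short worked example may help in reading them. The sketch below shows how a relative accuracy improvement differs from an absolute one and what a 3% parameter budget could look like. The accuracy values are hypothetical and chosen only to reproduce a 12.5% relative gain; they are not results from the paper. The DistilBERT size of roughly 66 million parameters is the commonly cited figure and is an assumption here, as are the function names.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Gain of `new` over `baseline`, expressed as a fraction of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (new - baseline) / baseline


def parameter_budget(baseline_params: int, fraction: float) -> int:
    """Parameter count corresponding to `fraction` of a baseline model's size."""
    return round(baseline_params * fraction)


# Hypothetical accuracies (NOT from the paper), picked to illustrate the arithmetic.
baseline_accuracy = 0.80
new_accuracy = 0.90

gain = relative_improvement(baseline_accuracy, new_accuracy)
absolute_gain = new_accuracy - baseline_accuracy
print(f"relative gain: {gain:.1%}, absolute gain: {absolute_gain:.1%}")

# Assumed baseline size: about 66 million parameters for DistilBERT.
budget = parameter_budget(66_000_000, 0.03)
print(f"3% parameter budget: {budget:,}")
```

Under these assumptions, a 12.5% relative improvement corresponds to a 10-point absolute gain, and the smaller model would have on the order of two million parameters. The actual accuracies and model sizes should be taken from the paper itself.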

Topics

Machine Learning > Core Methods > Classification
Machine Learning > Application Areas > Efficient Computing
Deep Learning > Architectures > Transformers

Keywords

transformer architecture, self-attention mechanism, text classification, content moderation, deep learning, self-attention model, data pipeline, multi-head self-attention, multi-head attention, long sequence, transformer model, profane language detection, streaming media, streaming service
