Visual Detection with Context for Document Layout Analysis

Carlos Soto; Shinjae Yoo

EMNLP 2019

Abstract

We present 1) a work-in-progress method to visually segment key regions of scientific articles using an object detection technique augmented with contextual features, and 2) a novel dataset of region-labeled articles. A continuing challenge in scientific literature mining is the difficulty of consistently extracting high-quality text from formatted PDFs. To address this, we adapt the object detection technique Faster R-CNN for document layout detection, incorporating contextual information that leverages the inherently localized nature of article contents to improve region detection performance. Because high-quality region labels for scientific articles are scarce, we also contribute a novel dataset of region annotations, the first version of which covers 9 region classes and 822 article pages. Initial experimental results demonstrate a 23.9% absolute improvement in mean average precision over the baseline model when contextual features are incorporated, and a processing speed 14x faster than a text-based technique. Ongoing work on further improvements is also discussed.
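The abstract does not say which contextual features are used or how they enter the detector, so the following is only a minimal sketch of one plausible reading: a region's normalized position and size on the page are appended to its pooled visual feature vector before classification. The names `context_features` and `augment`, the choice of four features, and the example values are all assumptions for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

# (x_min, y_min, x_max, y_max) in pixels
Box = Tuple[float, float, float, float]


def context_features(box: Box, page_width: float, page_height: float) -> List[float]:
    """Hypothetical contextual features: normalized center and size of a region."""
    x_min, y_min, x_max, y_max = box
    return [
        (x_min + x_max) / 2.0 / page_width,   # horizontal center in [0, 1]
        (y_min + y_max) / 2.0 / page_height,  # vertical center in [0, 1]
        (x_max - x_min) / page_width,         # width relative to the page
        (y_max - y_min) / page_height,        # height relative to the page
    ]


def augment(visual: Sequence[float], box: Box,
            page_width: float, page_height: float) -> List[float]:
    """Concatenate visual features with the (assumed) contextual features."""
    return list(visual) + context_features(box, page_width, page_height)


# A title-like region near the top of a 600 x 800 pixel page,
# with a made-up two-element visual feature vector.
features = augment([0.1, 0.7], (100.0, 40.0, 500.0, 80.0), 600.0, 800.0)
print(features)
```

The intuition behind such a design is that article regions are strongly localized: titles tend to sit near the top of the first page and footers near the bottom, so position carries information that appearance alone can miss.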

Authors

Carlos Soto, Shinjae Yoo

Topics

Computer Vision > Analysis > Object Detection
Computer Vision > Analysis > Scene Understanding
Computer Science > Applications > Document Analysis
Computer Vision > Domain-Specific > Document Analysis
Deep Learning > Architectures > Convolutional Neural Networks
Computer Vision > Applications > Document Analysis

Keywords

object detection, document layout analysis, region detection, Faster R-CNN, PDF parsing, contextual feature, context feature, region segmentation
