Visual Detection with Context for Document Layout Analysis

Carlos Soto; Shinjae Yoo

IJCNLP 2019

Abstract

We present 1) a work-in-progress method to visually segment key regions of scientific articles using an object detection technique augmented with contextual features, and 2) a novel dataset of region-labeled articles. A continuing challenge in scientific literature mining is the difficulty of consistently extracting high-quality text from formatted PDFs. To address this, we adapt the object detection technique Faster R-CNN for document layout detection, incorporating contextual information that leverages the inherently localized nature of article contents to improve region detection performance. Due to the limited availability of high-quality region labels for scientific articles, we also contribute a novel dataset of region annotations, the first version of which covers 9 region classes and 822 article pages. Initial experimental results demonstrate a 23.9% absolute improvement in mean average precision over the baseline model by incorporating contextual features, and a processing speed 14x faster than a text-based technique. Ongoing work on further improvements is also discussed.
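As a rough illustration of what "contextual features" could look like for a detected region, the sketch below normalizes a bounding box's position and size by the page dimensions and appends those values to a region's visual feature vector. This is an assumption made for illustration only: the abstract does not specify which contextual features are used or how they are combined with the Faster R-CNN features, and the names `context_features` and `augment_region_features` are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def context_features(box: Box, page_width: float, page_height: float) -> List[float]:
    """Return position/size context for a box, normalized to the page.

    Hypothetical feature set (not taken from the paper): normalized
    center x, center y, width, and height, each in [0, 1] for boxes
    that lie inside the page.
    """
    if page_width <= 0 or page_height <= 0:
        raise ValueError("page dimensions must be positive")
    x_min, y_min, x_max, y_max = box
    center_x = (x_min + x_max) / 2 / page_width
    center_y = (y_min + y_max) / 2 / page_height
    width = (x_max - x_min) / page_width
    height = (y_max - y_min) / page_height
    return [center_x, center_y, width, height]


def augment_region_features(
    visual_features: Sequence[float],
    box: Box,
    page_width: float,
    page_height: float,
) -> List[float]:
    """Concatenate a region's visual features with its page context.

    Concatenation is an assumed fusion strategy, chosen here only
    because it is the simplest one to show.
    """
    return list(visual_features) + context_features(box, page_width, page_height)
```

A region classifier fed the augmented vector can, in principle, learn that a title tends to sit near the top of a page while a footnote sits near the bottom, which is the kind of localized structure the abstract refers to.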

Topics

Computer Vision > Analysis > Object Detection
Computer Science > Applications > Document Analysis
Computer Vision > Domain-Specific > Document Analysis

Keywords

object detection, document layout analysis, visual context, region detection, Faster R-CNN, scientific literature, contextual feature, scientific article
