Document Analysis

278 directly classified papers

Papers

Textron: Weakly Supervised Multilingual Text Detection Through Data Programming WACV 2024

Benchmarking Visually-Situated Translation of Text in Natural Images EMNLP 2024

Modeling Layout Reading Order as Ordering Relations for Visually-rich Document Understanding EMNLP 2024

De-Identification of Sensitive Personal Data in Datasets Derived from IIT-CDIP EMNLP 2024

ArxivDIGESTables: Synthesizing Scientific Literature into Tables using Language Models EMNLP 2024

Post-Correction of Historical Text Transcripts with Large Language Models: An Exploratory Study EACL 2024

VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation EMNLP 2024

SpreadsheetBench: Towards Challenging Real World Spreadsheet Manipulation NeurIPS 2024

Uncovering the Handwritten Text in the Margins: End-to-end Handwritten Text Detection and Recognition EACL 2024

A One-Shot Learning Approach To Document Layout Segmentation of Ancient Arabic Manuscripts WACV 2024

Muharaf: Manuscripts of Handwritten Arabic Dataset for Cursive Text Recognition NeurIPS 2024

TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy NeurIPS 2024

TinyChart: Efficient Chart Understanding with Program-of-Thoughts Learning and Visual Token Merging EMNLP 2024

SEMv3: A Fast and Robust Approach to Table Separation Line Detection IJCAI 2024

Reading between the Lines: Image-Based Order Detection in OCR for Chinese Historical Documents AAAI 2024

DocFormerv2: Local Features for Document Understanding AAAI 2024

TextNeRF: A Novel Scene-Text Image Synthesis Method based on Neural Radiance Fields CVPR 2024

Choose What You Need: Disentangled Representation Learning for Scene Text Recognition, Removal and Editing CVPR 2024

Bridging the Gap Between End-to-End and Two-Step Text Spotting CVPR 2024

Layout-Agnostic Scene Text Image Synthesis with Diffusion Models CVPR 2024

Enhancing Vision-Language Pre-training with Rich Supervisions CVPR 2024

Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On CVPR 2024

Language, OCR, Form Independent (LOFI) pipeline for Industrial Document Information Extraction EMNLP 2024

The Manga Whisperer: Automatically Generating Transcriptions for Comics CVPR 2024

PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion ACL 2024