Document Analysis

278 directly classified papers

Papers

CommonForms: A Large, Diverse Dataset for Form Field Detection WACV 2026

Exploring the Boundaries of Diffusion Models for Offline Writer Identification with Sparse and Intra-Variable Data WACV 2026

ChartGaze: Enhancing Chart Understanding in LVLMs with Eye-Tracking Guided Attention Refinement EMNLP 2025

AID-Agent: An LLM-Agent for Advanced Extraction and Integration of Documents ACL 2025

SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design EMNLP 2025

M-LongDoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework EMNLP 2025

Towards Storage-Efficient Visual Document Retrieval: An Empirical Study on Reducing Patch-Level Embeddings ACL 2025

SCITAT: A Question Answering Benchmark for Scientific Tables and Text Covering Diverse Reasoning Types ACL 2025

FS-DAG: Few Shot Domain Adapting Graph Networks for Visually Rich Document Understanding COLING 2025

LATTE: Improving Latex Recognition for Tables and Formulae with Iterative Refinement AAAI 2025

CalligraphicOCR for Chinese Calligraphy Recognition EMNLP 2025

LAW: Legal Agentic Workflows for Custody and Fund Services Contracts COLING 2025

PreP-OCR: A Complete Pipeline for Document Image Restoration and Enhanced OCR Accuracy ACL 2025

Any Information Is Just Worth One Single Screenshot: Unifying Search With Visualized Information Retrieval ACL 2025

NusaAksara: A Multimodal and Multilingual Benchmark for Preserving Indonesian Indigenous Scripts ACL 2025

P²Net: Parallel Pointer-based Network for Key Information Extraction with Complex Layouts ACL 2025

Where is this coming from? Making groundedness count in the evaluation of Document VQA models NAACL 2025

MEH: A Multi-Style Dataset and Toolkit for Advancing Egyptian Hieroglyph Recognition ICCV 2025

CISOL: An Open and Extensible Dataset for Table Structure Recognition in the Construction Industry WACV 2025

TabComp: A Dataset for Visual Table Reading Comprehension NAACL 2025

Page Stream Segmentation with LLMs: Challenges and Applications in Insurance Document Automation COLING 2025

Bringing Suzhou Numerals into the Digital Age: A Dataset and Recognition Study on Ancient Chinese Trade Records NAACL 2025

Towards Comprehensive Lecture Slides Understanding: Large-scale Dataset and Effective Method ICCV 2025

Automating the Expansion of Instrument Typicals in Piping and Instrumentation Diagrams (P&IDs) AAAI 2025

READoc: A Unified Benchmark for Realistic Document Structured Extraction ACL 2025