Research Explorer
Document Analysis
278 directly classified papers
Papers per year
2005: 1
2007: 1
2009: 1
2011: 1
2013: 2
2014: 1
2015: 1
2016: 1
2017: 3
2018: 7
2019: 10
2020: 19
2021: 16
2022: 31
2023: 44
2024: 43
2025: 94
2026: 2
Papers
CommonForms: A Large, Diverse Dataset for Form Field Detection
WACV 2026
Exploring the Boundaries of Diffusion Models for Offline Writer Identification with Sparse and Intra-Variable Data
WACV 2026
ChartGaze: Enhancing Chart Understanding in LVLMs with Eye-Tracking Guided Attention Refinement
EMNLP 2025
AID-Agent: An LLM-Agent for Advanced Extraction and Integration of Documents
ACL 2025
SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design
EMNLP 2025
M-LongDoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework
EMNLP 2025
Towards Storage-Efficient Visual Document Retrieval: An Empirical Study on Reducing Patch-Level Embeddings
ACL 2025
SCITAT: A Question Answering Benchmark for Scientific Tables and Text Covering Diverse Reasoning Types
ACL 2025
FS-DAG: Few Shot Domain Adapting Graph Networks for Visually Rich Document Understanding
COLING 2025
LATTE: Improving Latex Recognition for Tables and Formulae with Iterative Refinement
AAAI 2025
CalligraphicOCR for Chinese Calligraphy Recognition
EMNLP 2025
LAW: Legal Agentic Workflows for Custody and Fund Services Contracts
COLING 2025
PreP-OCR: A Complete Pipeline for Document Image Restoration and Enhanced OCR Accuracy
ACL 2025
Any Information Is Just Worth One Single Screenshot: Unifying Search With Visualized Information Retrieval
ACL 2025
NusaAksara: A Multimodal and Multilingual Benchmark for Preserving Indonesian Indigenous Scripts
ACL 2025
P²Net: Parallel Pointer-based Network for Key Information Extraction with Complex Layouts
ACL 2025
Where is this coming from? Making groundedness count in the evaluation of Document VQA models
NAACL 2025
MEH: A Multi-Style Dataset and Toolkit for Advancing Egyptian Hieroglyph Recognition
ICCV 2025
CISOL: An Open and Extensible Dataset for Table Structure Recognition in the Construction Industry
WACV 2025
TabComp: A Dataset for Visual Table Reading Comprehension
NAACL 2025
Page Stream Segmentation with LLMs: Challenges and Applications in Insurance Document Automation
COLING 2025
Bringing Suzhou Numerals into the Digital Age: A Dataset and Recognition Study on Ancient Chinese Trade Records
NAACL 2025
Towards Comprehensive Lecture Slides Understanding: Large-scale Dataset and Effective Method
ICCV 2025
Automating the Expansion of Instrument Typicals in Piping and Instrumentation Diagrams (P&IDs)
AAAI 2025
READoc: A Unified Benchmark for Realistic Document Structured Extraction
ACL 2025