Research Explorer
Document Analysis
278 directly classified papers
Papers per year
2005: 1
2007: 1
2009: 1
2011: 1
2013: 2
2014: 1
2015: 1
2016: 1
2017: 3
2018: 7
2019: 10
2020: 19
2021: 16
2022: 31
2023: 44
2024: 43
2025: 94
2026: 2
Papers
LAW: Legal Agentic Workflows for Custody and Fund Services Contracts
COLING 2025
SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design
EMNLP 2025
CalligraphicOCR for Chinese Calligraphy Recognition
EMNLP 2025
Structural Patent Classification Using Label Hierarchy Optimization
EMNLP 2025
Automating the Expansion of Instrument Typicals in Piping and Instrumentation Diagrams (P&IDs)
AAAI 2025
SERVAL: Surprisingly Effective Zero-Shot Visual Document Retrieval Powered by Large Vision and Language Models
EMNLP 2025
MMDocIR: Benchmarking Multimodal Retrieval for Long Documents
EMNLP 2025
WildDoc: How Far Are We from Achieving Comprehensive and Robust Document Understanding in the Wild?
EMNLP 2025
VisFinEval: A Scenario-Driven Chinese Multimodal Benchmark for Holistic Financial Understanding
EMNLP 2025
PreP-OCR: A Complete Pipeline for Document Image Restoration and Enhanced OCR Accuracy
ACL 2025
Any Information Is Just Worth One Single Screenshot: Unifying Search With Visualized Information Retrieval
ACL 2025
NusaAksara: A Multimodal and Multilingual Benchmark for Preserving Indonesian Indigenous Scripts
ACL 2025
SCITAT: A Question Answering Benchmark for Scientific Tables and Text Covering Diverse Reasoning Types
ACL 2025
P²Net: Parallel Pointer-based Network for Key Information Extraction with Complex Layouts
ACL 2025
Towards Storage-Efficient Visual Document Retrieval: An Empirical Study on Reducing Patch-Level Embeddings
ACL 2025
READoc: A Unified Benchmark for Realistic Document Structured Extraction
ACL 2025
AID-Agent: An LLM-Agent for Advanced Extraction and Integration of Documents
ACL 2025
Hidden Forms: A Dataset to Fill Masked Interfaces from Language Commands
ACL 2025
Page Stream Segmentation with LLMs: Challenges and Applications in Insurance Document Automation
COLING 2025
Bringing Suzhou Numerals into the Digital Age: A Dataset and Recognition Study on Ancient Chinese Trade Records
NAACL 2025
PRIM: Towards Practical In-Image Multilingual Machine Translation
EMNLP 2025
Towards Comprehensive Lecture Slides Understanding: Large-scale Dataset and Effective Method
ICCV 2025
M-LongDoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework
EMNLP 2025
DocSAM: Unified Document Image Segmentation via Query Decomposition and Heterogeneous Mixed Learning
CVPR 2025
FlexDoc: Parameterized Sampling for Diverse Multilingual Synthetic Documents for Training Document Understanding Models
EMNLP 2025