Document Analysis
278 directly classified papers
Papers per year
2005: 1
2007: 1
2009: 1
2011: 1
2013: 2
2014: 1
2015: 1
2016: 1
2017: 3
2018: 7
2019: 10
2020: 19
2021: 16
2022: 31
2023: 44
2024: 43
2025: 94
2026: 2
Papers
LATTE: Improving Latex Recognition for Tables and Formulae with Iterative Refinement
AAAI 2025
Continuous Fingerspelling Dataset for Indian Sign Language
IJCNLP 2025
LAW: Legal Agentic Workflows for Custody and Fund Services Contracts
COLING 2025
CalligraphicOCR for Chinese Calligraphy Recognition
EMNLP 2025
SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design
EMNLP 2025
M-LongDoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework
EMNLP 2025
ChartGaze: Enhancing Chart Understanding in LVLMs with Eye-Tracking Guided Attention Refinement
EMNLP 2025
PRIM: Towards Practical In-Image Multilingual Machine Translation
EMNLP 2025
TVQACML: Benchmarking Text-Centric Visual Question Answering in Multilingual Chinese Minority Languages
EMNLP 2025
SheetDesigner: MLLM-Powered Spreadsheet Layout Generation with Rule-Based and Vision-Based Reflection
EMNLP 2025
MultiDocFusion: Hierarchical and Multimodal Chunking Pipeline for Enhanced RAG on Long Industrial Documents
EMNLP 2025
WildDoc: How Far Are We from Achieving Comprehensive and Robust Document Understanding in the Wild?
EMNLP 2025
VisFinEval: A Scenario-Driven Chinese Multimodal Benchmark for Holistic Financial Understanding
EMNLP 2025
SimpleDoc: Multi‐Modal Document Understanding with Dual‐Cue Page Retrieval and Iterative Refinement
EMNLP 2025
SERVAL: Surprisingly Effective Zero-Shot Visual Document Retrieval Powered by Large Vision and Language Models
EMNLP 2025
Is Cognition Consistent with Perception? Assessing and Mitigating Multimodal Knowledge Conflicts in Document Understanding
EMNLP 2025
MMDocIR: Benchmarking Multimodal Retrieval for Long Documents
EMNLP 2025
CHURRO: Making History Readable with an Open-Weight Large Vision-Language Model for High-Accuracy, Low-Cost Historical Text Recognition
EMNLP 2025
GraDeT-HTR: A Resource-Efficient Bengali Handwritten Text Recognition System utilizing Grapheme-based Tokenizer and Decoder-only Transformer
EMNLP 2025
PDFMathTranslate: Scientific Document Translation Preserving Layouts
EMNLP 2025
FlexDoc: Parameterized Sampling for Diverse Multilingual Synthetic Documents for Training Document Understanding Models
EMNLP 2025
Structural Patent Classification Using Label Hierarchy Optimization
EMNLP 2025
CourtNav: Voice-Guided, Anchor-Accurate Navigation of Long Legal Documents in Courtrooms
EMNLP 2025
Any Information Is Just Worth One Single Screenshot: Unifying Search With Visualized Information Retrieval
ACL 2025
Arctic-TILT. Business Document Understanding at Sub-Billion Scale
ACL 2025