Document Analysis
74 directly classified papers
Papers per year
2008: 1
2013: 1
2015: 1
2016: 2
2017: 2
2018: 1
2019: 2
2020: 6
2021: 10
2022: 10
2023: 6
2024: 14
2025: 18
Papers
DongbaMIE: A Multimodal Information Extraction Dataset for Evaluating Semantic Understanding of Dongba Pictograms
EMNLP 2025
RFL: Simplifying Chemical Structure Recognition with Ring-Free Language
AAAI 2025
PDFMathTranslate: Scientific Document Translation Preserving Layouts
EMNLP 2025
Adaptive Markup Language Generation for Contextually-Grounded Visual Document Understanding
CVPR 2025
PatentLMM: Large Multimodal Model for Generating Descriptions for Patent Figures
AAAI 2025
Out of Length Text Recognition with Sub-String Matching
AAAI 2025
From Charts to Fair Narratives: Uncovering and Mitigating Geo-Economic Biases in Chart-to-Text
EMNLP 2025
MMDocIR: Benchmarking Multimodal Retrieval for Long Documents
EMNLP 2025
Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation
ACL 2025
Zero-Shot Styled Text Image Generation, but Make It Autoregressive
CVPR 2025
Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition
CVPR 2025
SSAN: A Symbol Spatial-Aware Network for Handwritten Mathematical Expression Recognition
AAAI 2025
TAMER: Tree-Aware Transformer for Handwritten Mathematical Expression Recognition
AAAI 2025
InstructOCR: Instruction Boosting Scene Text Spotting
AAAI 2025
Finding Needles in Images: Can Multi-modal LLMs Locate Fine Details?
ACL 2025
DocSAM: Unified Document Image Segmentation via Query Decomposition and Heterogeneous Mixed Learning
CVPR 2025
LongDocURL: a Comprehensive Multimodal Long Document Benchmark Integrating Understanding, Reasoning, and Locating
ACL 2025
SERVAL: Surprisingly Effective Zero-Shot Visual Document Retrieval Powered by Large Vision and Language Models
EMNLP 2025
LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation Network
AAAI 2024
Grab What You Need: Rethinking Complex Table Structure Recognition with Flexible Components Deliberation
AAAI 2024
Modeling Layout Reading Order as Ordering Relations for Visually-rich Document Understanding
EMNLP 2024
Towards Automated Chinese Ancient Character Restoration: A Diffusion-Based Method with a New Dataset
AAAI 2024
Bridging the Gap Between End-to-End and Two-Step Text Spotting
CVPR 2024
Effective Synthetic Data and Test-Time Adaptation for OCR Correction
EMNLP 2024
Enhanced Optical Character Recognition by Optical Sensor Combined with BERT and Cosine Similarity Scoring (Student Abstract)
AAAI 2024