Research Explorer
Document Analysis
94 directly classified papers
Papers per year
2017: 3
2018: 2
2019: 4
2020: 10
2021: 16
2022: 11
2023: 20
2024: 12
2025: 16
Papers
DocLayLLM: An Efficient Multi-modal Extension of Large Language Models for Text-rich Document Understanding
CVPR 2025
LAW: Legal Agentic Workflows for Custody and Fund Services Contracts
COLING 2025
Arctic-TILT. Business Document Understanding at Sub-Billion Scale
ACL 2025
Beyond Human Labels: A Multi-Linguistic Auto-Generated Benchmark for Evaluating Large Language Models on Resume Parsing
EMNLP 2025
REVISE: A Framework for Revising OCRed text in Practical Information Systems with Data Contamination Strategy
ACL 2025
DocMamba: Efficient Document Pre-training with State Space Model
AAAI 2025
RelationalCoder: Rethinking Complex Tables via Programmatic Relational Transformation
ACL 2025
A Survey on Patent Analysis: From NLP to Multimodal AI
ACL 2025
Understanding the Gap: an Analysis of Research Collaborations in NLP and Language Documentation
ACL 2025
ComicScene154: A Scene Dataset for Comic Analysis
EMNLP 2025
TableCoder: Table Extraction from Text via Reliable Code Generation
ACL 2025
Docopilot: Improving Multimodal Models for Document-Level Understanding
CVPR 2025
A Simple yet Effective Layout Token in Large Language Models for Document Understanding
CVPR 2025
M-LongDoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework
EMNLP 2025
Page Stream Segmentation with LLMs: Challenges and Applications in Insurance Document Automation
COLING 2025
LATTE: Improving Latex Recognition for Tables and Formulae with Iterative Refinement
AAAI 2025
Seg2Act: Global Context-aware Action Generation for Document Logical Structuring
EMNLP 2024
M2Doc: A Multi-Modal Fusion Approach for Document Layout Analysis
AAAI 2024
Patentformer: A Novel Method to Automate the Generation of Patent Applications
EMNLP 2024
Leveraging Collection-Wide Similarities for Unsupervised Document Structure Extraction
ACL 2024
ArxivDIGESTables: Synthesizing Scientific Literature into Tables using Language Models
EMNLP 2024
“What is the value of templates?” Rethinking Document Information Extraction Datasets for LLMs
EMNLP 2024
UDA: A Benchmark Suite for Retrieval Augmented Generation in Real-World Document Analysis
NeurIPS 2024
Overview of the Fourth Workshop on Scholarly Document Processing
ACL 2024
Modeling Layout Reading Order as Ordering Relations for Visually-rich Document Understanding
EMNLP 2024