Image Captioning
781 directly classified papers
Papers per year
2003: 1
2008: 1
2011: 1
2012: 1
2013: 5
2014: 2
2015: 21
2016: 17
2017: 36
2018: 47
2019: 92
2020: 73
2021: 96
2022: 91
2023: 107
2024: 86
2025: 96
2026: 8
Papers
DIUSum: Dynamic Image Utilization for Multimodal Summarization
AAAI 2024
Cycle-Consistency Learning for Captioning and Grounding
AAAI 2024
MeaCap: Memory-Augmented Zero-shot Image Captioning
CVPR 2024
Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation
CVPR 2024
AAST-NLP at Multimodal Hate Speech Event Detection 2024: A Multimodal Approach for Classification of Text-Embedded Images Based on CLIP and BERT-Based Models
EACL 2024
LVD-2M: A Long-take Video Dataset with Temporally Dense Captions
NIPS 2024
Multilingual Synopses of Movie Narratives: A Dataset for Vision-Language Story Understanding
EMNLP 2024
Fine-tuning CLIP Text Encoders with Two-step Paraphrasing
EACL 2024
Towards Better Vision-Inspired Vision-Language Models
CVPR 2024
Mining Fine-Grained Image-Text Alignment for Zero-Shot Captioning via Text-Only Training
AAAI 2024
Divide and Conquer Radiology Report Generation via Observation Level Fine-grained Pretraining and Prompt Tuning
EMNLP 2024
No Culture Left Behind: ArtELingo-28, a Benchmark of WikiArt with Captions in 28 Languages
EMNLP 2024
Masking Latent Gender Knowledge for Debiasing Image Captioning
NAACL 2024
LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation?
NAACL 2024
KALE: An Artwork Image Captioning System Augmented with Heterogeneous Graph
IJCAI 2024
Visual Enhanced Entity-Level Interaction Network for Multimodal Summarization
NAACL 2024
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
NIPS 2024
Context-aware Difference Distilling for Multi-change Captioning
ACL 2024
Complex Organ Mask Guided Radiology Report Generation
WACV 2024
Enhancing Argument Summarization: Prioritizing Exhaustiveness in Key Point Generation and Introducing an Automatic Coverage Evaluation Metric
NAACL 2024
FIRE: Food Image to REcipe Generation
WACV 2024
MIVC: Multiple Instance Visual Component for Visual-Language Models
WACV 2024
Semantic Map-based Generation of Navigation Instructions
COLING 2024
Noise-Aware Image Captioning with Progressively Exploring Mismatched Words
AAAI 2024
ImageCaptioner2: Image Captioner for Image Captioning Bias Amplification Assessment
AAAI 2024