Research Explorer
Large Language Models
6405 directly classified papers
Papers per year
2007: 3
2017: 2
2018: 3
2019: 10
2020: 49
2021: 53
2022: 188
2023: 558
2024: 1910
2025: 3619
2026: 10
Papers
Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval
CVPR 2025
Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering
CVPR 2025
PerLA: Perceptive 3D Language Assistant
CVPR 2025
CoMMIT: Coordinated Multimodal Instruction Tuning
EMNLP 2025
FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression
CVPR 2025
ImageGen-CoT: Enhancing Text-to-Image In-context Learning with Chain-of-Thought Reasoning
ICCV 2025
HSI-GPT: A General-Purpose Large Scene-Motion-Language Model for Human Scene Interaction
CVPR 2025
Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs
ICCV 2025
Docopilot: Improving Multimodal Models for Document-Level Understanding
CVPR 2025
Seq2Time: Sequential Knowledge Transfer for Video LLM Temporal Grounding
CVPR 2025
ICP: Immediate Compensation Pruning for Mid-to-high Sparsity
CVPR 2025
Antidote: A Unified Framework for Mitigating LVLM Hallucinations in Counterfactual Presupposition and Object Perception
CVPR 2025
ERFSL: An Efficient Reward Function Searcher via Large Language Models for Custom-Environment Multi-Objective Reinforcement Learning (Student Abstract)
AAAI 2025
EEE-Bench: A Comprehensive Multimodal Electrical And Electronics Engineering Benchmark
CVPR 2025
Oasis: One Image is All You Need for Multimodal Instruction Data Synthesis
ICCV 2025
Beyond Label Semantics: Language-Guided Action Anatomy for Few-shot Action Recognition
ICCV 2025
CheckManual: A New Challenge and Benchmark for Manual-based Appliance Manipulation
CVPR 2025
MiDSummer: Multi-Guidance Diffusion for Controllable Zero-Shot Immersive Gaussian Splatting Scene Generation
ICCV 2025
MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research
CVPR 2025
Causality-guided Prompt Learning for Vision-language Models via Visual Granulation
ICCV 2025
3D-LLaVA: Towards Generalist 3D LMMs with Omni Superpoint Transformer
CVPR 2025
Breaking the Encoder Barrier for Seamless Video-Language Understanding
ICCV 2025
DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery
CVPR 2025
TTD-SQL: Tree-Guided Token Decoding for Efficient and Schema-Aware SQL Generation
EMNLP 2025
Vision-Language Model IP Protection via Prompt-based Learning
CVPR 2025