Multi-Modal Learning
1457 directly classified papers
Papers per year
2011: 1
2013: 4
2014: 3
2015: 3
2016: 9
2017: 11
2018: 27
2019: 61
2020: 109
2021: 87
2022: 153
2023: 213
2024: 391
2025: 384
2026: 1
Papers
Decomposing and Fusing Intra- and Inter-Sensor Spatio-Temporal Signal for Multi-Sensor Wearable Human Activity Recognition
AAAI 2025
JEN-1 Composer: A Unified Framework for High-Fidelity Multi-Track Music Generation
AAAI 2025
Enhancing Multi-Robot Semantic Navigation Through Multimodal Chain-of-Thought Score Collaboration
AAAI 2025
S3E: Self-Supervised State Estimation for Radar-Inertial System
ICCV 2025
APIRL: Deep Reinforcement Learning for REST API Fuzzing
AAAI 2025
IAA: Inner-Adaptor Architecture Empowers Frozen Large Language Model with Multimodal Capabilities
AAAI 2025
Noisy Correspondence Rectification via Asymmetric Similarity Learning
AAAI 2025
EgoLM: Multi-Modal Language Model of Egocentric Motions
CVPR 2025
CoPEFT: Fast Adaptation Framework for Multi-Agent Collaborative Perception with Parameter-Efficient Fine-Tuning
AAAI 2025
Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation
CVPR 2025
CoMT: A Novel Benchmark for Chain of Multi-modal Thought on Large Vision-Language Models
AAAI 2025
MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models
CVPR 2025
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models
ICCV 2025
Towards Long-Horizon Vision-Language Navigation: Platform, Benchmark and Method
CVPR 2025
LRM-LLaVA: Overcoming the Modality Gap of Multilingual Large Language-Vision Model for Low-Resource Languages
AAAI 2025
Multi-View Empowered Structural Graph Wordification for Language Models
AAAI 2025
Retrieval-Augmented Visual Question Answering via Built-in Autoregressive Search Engines
AAAI 2025
Tensorized Attention for Understanding Multi-Object Relationships
AAAI 2025
GNS: Solving Plane Geometry Problems by Neural-Symbolic Reasoning with Multi-Modal LLMs
AAAI 2025
Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data and Metric Perspectives
ICCV 2025
Multi-View Incremental Learning with Structured Hebbian Plasticity for Enhanced Fusion Efficiency
AAAI 2025
RefDetector: A Simple Yet Effective Matching-based Method for Referring Expression Comprehension
AAAI 2025
UniMuMo: Unified Text, Music, and Motion Generation
AAAI 2025
Muses: 3D-Controllable Image Generation via Multi-Modal Agent Collaboration
AAAI 2025
SongEditor: Adapting Zero-Shot Song Generation Language Model as a Multi-Task Editor
AAAI 2025