Research Explorer
Multi-Modal Learning
1457 directly classified papers
Papers per year
2011: 1
2013: 4
2014: 3
2015: 3
2016: 9
2017: 11
2018: 27
2019: 61
2020: 109
2021: 87
2022: 153
2023: 213
2024: 391
2025: 384
2026: 1
Papers
EyEar: Learning Audio Synchronized Human Gaze Trajectory Based on Physics-Informed Dynamics
AAAI 2025
Multi-to-Single: Reducing Multimodal Dependency in Emotion Recognition Through Contrastive Learning
AAAI 2025
Pose as a Modality: A Psychology-Inspired Network for Personality Recognition with a New Multimodal Dataset
AAAI 2025
BIG-FUSION: Brain-Inspired Global-Local Context Fusion Framework for Multimodal Emotion Recognition in Conversations
AAAI 2025
CustomContrast: A Multilevel Contrastive Perspective for Subject-Driven Text-to-Image Customization
AAAI 2025
Motion Prior Knowledge Learning with Homogeneous Language Descriptions for Moving Infrared Small Target Detection
AAAI 2025
Cross-View Referring Multi-Object Tracking
AAAI 2025
FIRM: Flexible Interactive Reflection ReMoval
AAAI 2025
Graphic Design with Large Multimodal Model
AAAI 2025
PIXELS: Progressive Image Xemplar-based Editing with Latent Surgery
AAAI 2025
Multi-Pair Temporal Sentence Grounding via Multi-Thread Knowledge Transfer Network
AAAI 2025
VQA4CIR: Boosting Composed Image Retrieval with Visual Question Answering
AAAI 2025
PoseLLaVA: Pose Centric Multimodal LLM for Fine-Grained 3D Pose Manipulation
AAAI 2025
AugRefer: Advancing 3D Visual Grounding via Cross-Modal Augmentation and Spatial Relation-based Referring
AAAI 2025
RefDetector: A Simple Yet Effective Matching-based Method for Referring Expression Comprehension
AAAI 2025
Hierarchical Alignment-enhanced Adaptive Grounding Network for Generalized Referring Expression Comprehension
AAAI 2025
Enhancing Fine-Grained Vision-Language Pretraining with Negative Augmented Samples
AAAI 2025
InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation
AAAI 2025
Cross-modulated Attention Transformer for RGBT Tracking
AAAI 2025
Expand VSR Benchmark for VLLM to Expertize in Spatial Rules
AAAI 2025
Few-Shot Incremental Learning via Foreground Aggregation and Knowledge Transfer for Audio-Visual Semantic Segmentation
AAAI 2025
ShotVL: Human-Centric Highlight Frame Retrieval via Language Queries
AAAI 2025
CLIP-MSM: A Multi-Semantic Mapping Brain Representation for Human High-Level Visual Cortex
AAAI 2025
End-to-End Autonomous Driving Through V2X Cooperation
AAAI 2025
Double Entendre: Robust Audio-Based AI-Generated Lyrics Detection via Multi-View Fusion
ACL 2025