Multi-Modal Learning

1213 directly classified papers

Papers

Solar Multimodal Transformer: Intraday Solar Irradiance Predictor using Public Cameras and Time Series WACV 2025

Unleashing Potentials of Vision-Language Models for Zero-Shot HOI Detection WACV 2025

Trio Innovators @ DravidianLangTech 2025: Multimodal Hate Speech Detection in Dravidian Languages NAACL 2025

ReFu: Recursive Fusion for Exemplar-Free 3D Class-Incremental Learning WACV 2025

RGB-D Video Mirror Detection WACV 2025

Optimizing Vision-Language Model for Road Crossing Intention Estimation WACV 2025

Using Multimodal Models for Informative Classification of Ambiguous Tweets in Crisis Response NAACL 2025

Semantically Conditioned Prompts for Visual Recognition under Missing Modality Scenarios WACV 2025

Cross-Domain Multi-Modal Few-Shot Object Detection via Rich Text WACV 2025

OccFlowNet: Occupancy Estimation via Differentiable Rendering and Occupancy Flow WACV 2025

Endogenous Recovery via Within-modality Prototypes for Incomplete Multimodal Hashing IJCAI 2025

PureForest: A Large-Scale Aerial Lidar and Aerial Imagery Dataset for Tree Species Classification in Monospecific Forests WACV 2025

Representation Learning with Mutual Influence of Modalities for Node Classification in Multi-Modal Heterogeneous Networks IJCAI 2025

Going Beyond Consistency: Target-oriented Multi-view Graph Neural Network IJCAI 2025

Consensus-Guided Incomplete Multi-view Clustering via Cross-view Affinities Learning IJCAI 2025

Beyond Base Predictors: Using LLMs to Resolve Ambiguities in Akkadian Lemmatization NAACL 2025

Exploring Multimodal Foundation AI and Expert-in-the-Loop for Sustainable Management of Wild Salmon Fisheries in Indigenous Rivers IJCAI 2025

Harnessing Vision Models for Time Series Analysis: A Survey IJCAI 2025

Mind the Gap: Aligning Vision Foundation Models to Image Feature Matching ICCV 2025

MagicID: Hybrid Preference Optimization for ID-Consistent and Dynamic-Preserved Video Customization ICCV 2025

Beyond Label Semantics: Language-Guided Action Anatomy for Few-shot Action Recognition ICCV 2025

Multi-modal Multi-platform Person Re-Identification: Benchmark and Method ICCV 2025

HRScene: How Far Are VLMs from Effective High-Resolution Image Understanding? ICCV 2025

EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception ICCV 2025

Beyond RGB: Adaptive Parallel Processing for RAW Object Detection ICCV 2025