Multi-Modal Learning
1213 directly classified papers
Papers per year
2007: 2
2008: 1
2009: 1
2011: 2
2012: 5
2013: 5
2014: 1
2015: 5
2016: 8
2017: 21
2018: 42
2019: 42
2020: 69
2021: 72
2022: 149
2023: 143
2024: 258
2025: 370
2026: 17
Papers
Fusion of Embeddings Networks for Robust Combination of Text Dependent and Independent Speaker Recognition
INTERSPEECH 2021
Three-Module Modeling For End-to-End Spoken Language Understanding Using Pre-Trained DNN-HMM-Based Acoustic-Phonetic Model
INTERSPEECH 2021
Speak or Chat with Me: End-to-End Spoken Language Understanding System with Flexible Inputs
INTERSPEECH 2021
Embracing Domain Differences in Fake News: Cross-domain Fake News Detection using Multi-modal Data
AAAI 2021
How to leverage the multimodal EHR data for better medical prediction?
EMNLP 2021
Enhanced Audio Tagging via Multi- to Single-Modal Teacher-Student Mutual Learning
AAAI 2021
Uncertainty-Aware Multi-View Representation Learning
AAAI 2021
Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning
AAAI 2021
PR-Net: Preference Reasoning for Personalized Video Highlight Detection
ICCV 2021
LAReQA: Language-Agnostic Answer Retrieval from a Multilingual Pool
EMNLP 2020
Unsupervised vs. Transfer Learning for Multimodal One-Shot Matching of Speech and Images
INTERSPEECH 2020
Transliteration Based Data Augmentation for Training Multilingual ASR Acoustic Models in Low Resource Settings
INTERSPEECH 2020
Multilingual Speech Recognition with Self-Attention Structured Parameterization
INTERSPEECH 2020
Massively Multilingual ASR: 50 Languages, 1 Model, 1 Billion Parameters
INTERSPEECH 2020
Style Variation as a Vantage Point for Code-Switching
INTERSPEECH 2020
Cross-lingual Spoken Language Understanding with Regularized Representation Alignment
EMNLP 2020
Program Enhanced Fact Verification with Verbalization and Graph Attention Network
EMNLP 2020
HybridQA: A Dataset of Multi-Hop Question Answering over Tabular and Textual Data
EMNLP 2020
An Element-wise Visual-enhanced BiLSTM-CRF Model for Location Name Recognition
EMNLP 2020
Retouchdown: Releasing Touchdown on StreetLearn as a Public Resource for Language Grounding Tasks in Street View
EMNLP 2020
Answer Generation through Unified Memories over Multiple Passages
IJCAI 2020
CopyNext: Explicit Span Copying and Alignment in Sequence to Sequence Models
EMNLP 2020
Facebook AI’s WMT20 News Translation Task Submission
EMNLP 2020
IMRAM: Iterative Matching With Recurrent Attention Memory for Cross-Modal Image-Text Retrieval
CVPR 2020
Telling Left From Right: Learning Spatial Correspondence of Sight and Sound
CVPR 2020