Multi-Modal Learning

1213 directly classified papers

Papers

Fusion of Embeddings Networks for Robust Combination of Text Dependent and Independent Speaker Recognition INTERSPEECH 2021

Three-Module Modeling For End-to-End Spoken Language Understanding Using Pre-Trained DNN-HMM-Based Acoustic-Phonetic Model INTERSPEECH 2021

Speak or Chat with Me: End-to-End Spoken Language Understanding System with Flexible Inputs INTERSPEECH 2021

Embracing Domain Differences in Fake News: Cross-domain Fake News Detection using Multi-modal Data AAAI 2021

How to leverage the multimodal EHR data for better medical prediction? EMNLP 2021

Enhanced Audio Tagging via Multi- to Single-Modal Teacher-Student Mutual Learning AAAI 2021

Uncertainty-Aware Multi-View Representation Learning AAAI 2021

Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning AAAI 2021

PR-Net: Preference Reasoning for Personalized Video Highlight Detection ICCV 2021

LAReQA: Language-Agnostic Answer Retrieval from a Multilingual Pool EMNLP 2020

Unsupervised vs. Transfer Learning for Multimodal One-Shot Matching of Speech and Images INTERSPEECH 2020

Transliteration Based Data Augmentation for Training Multilingual ASR Acoustic Models in Low Resource Settings INTERSPEECH 2020

Multilingual Speech Recognition with Self-Attention Structured Parameterization INTERSPEECH 2020

Massively Multilingual ASR: 50 Languages, 1 Model, 1 Billion Parameters INTERSPEECH 2020

Style Variation as a Vantage Point for Code-Switching INTERSPEECH 2020

Cross-lingual Spoken Language Understanding with Regularized Representation Alignment EMNLP 2020

Program Enhanced Fact Verification with Verbalization and Graph Attention Network EMNLP 2020

HybridQA: A Dataset of Multi-Hop Question Answering over Tabular and Textual Data EMNLP 2020

An Element-wise Visual-enhanced BiLSTM-CRF Model for Location Name Recognition EMNLP 2020

Retouchdown: Releasing Touchdown on StreetLearn as a Public Resource for Language Grounding Tasks in Street View EMNLP 2020

Answer Generation through Unified Memories over Multiple Passages IJCAI 2020

CopyNext: Explicit Span Copying and Alignment in Sequence to Sequence Models EMNLP 2020

Facebook AI’s WMT20 News Translation Task Submission EMNLP 2020

IMRAM: Iterative Matching With Recurrent Attention Memory for Cross-Modal Image-Text Retrieval CVPR 2020

Telling Left From Right: Learning Spatial Correspondence of Sight and Sound CVPR 2020