Papers
Learning Generative Models with Visual Attention
Charlie Tang, Nitish Srivastava, Ruslan Salakhutdinov
Weakly-supervised Discovery of Visual Pattern Configurations
Hyun Oh Song, Yong Jae Lee, Stefanie Jegelka et al.
Visalogy: Answering Visual Analogy Questions
Fereshteh Sadeghi, C. Lawrence Zitnick, Ali Farhadi
Learning visual biases from human imagination
Carl Vondrick, Hamed Pirsiavash, Aude Oliva et al.
Deep Visual Analogy-Making
Scott E Reed, Yi Zhang, Yuting Zhang et al.
Supervised Word Mover's Distance
Gao Huang, Chuan Guo, Matt J Kusner et al.
Learnable Visual Markers
Oleg Grinchuk, Vadim Lebedev, Victor Lempitsky
Unsupervised Learning of Spoken Language with Visual Context
David Harwath, Antonio Torralba, James Glass
Multimodal Residual Learning for Visual QA
Jin-Hwa Kim, Sang-Woo Lee, Donghyun Kwak et al.
Hierarchical Question-Image Co-Attention for Visual Question Answering
Jiasen Lu, Jianwei Yang, Dhruv Batra et al.
Learned Region Sparsity and Diversity Also Predicts Visual Attention
Zijun Wei, Hossein Adeli, Minh Hoai Nguyen et al.
Visual Question Answering with Question Representation Update (QRU)
Ruiyu Li, Jiaya Jia
High-Order Attention Models for Visual Question Answering
Idan Schwartz, Alexander Schwing, Tamir Hazan
Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model
Jiasen Lu, Anitha Kannan, Jianwei Yang et al.
Variational Laws of Visual Attention for Dynamic Scenes
Dario Zanca, Marco Gori
Learned in Translation: Contextualized Word Vectors
Bryan McCann, James Bradbury, Caiming Xiong et al.
InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations
Yunzhu Li, Jiaming Song, Stefano Ermon
Learning to See Physics via Visual De-animation
Jiajun Wu, Erika Lu, Pushmeet Kohli et al.
Visual Reference Resolution using Attention Memory for Visual Dialog
Paul Hongsuck Seo, Andreas Lehrmann, Bohyung Han et al.
Modulating early visual processing by language
Harm de Vries, Florian Strub, Jeremie Mary et al.
Visual Interaction Networks: Learning a Physics Simulator from Video
Nicholas Watters, Daniel Zoran, Theophane Weber et al.
Learning multiple visual domains with residual adapters
Sylvestre-Alvise Rebuffi, Hakan Bilen, Andrea Vedaldi
Multimodal Learning and Reasoning for Visual Question Answering
Ilija Ilievski, Jiashi Feng
Dual Path Networks
Yunpeng Chen, Jianan Li, Huaxin Xiao et al.
Learning to Specialize with Knowledge Distillation for Visual Question Answering
Jonghwan Mun, Kimin Lee, Jinwoo Shin et al.