Papers
M3DETR: Multi-Representation, Multi-Scale, Mutual-Relation 3D Object Detection With Transformers
Tianrui Guan, Jun Wang, Shiyi Lan et al.
Relaxing Contrastiveness in Multimodal Representation Learning
Zudi Lin, Erhan Bas, Kunwar Yashraj Singh et al.
VirtualHome Action Genome: A Simulated Spatio-Temporal Scene Graph Dataset With Consistent Relationship Labels
Yue Qiu, Yoshiki Nagasaki, Kensho Hara et al.
Image-Text Pre-Training for Logo Recognition
Mark Hubenthal, Suren Kumar
Dissecting Deep Metric Learning Losses for Image-Text Retrieval
Hong Xuan, Xi (Stephen) Chen
Learning Latent Structural Relations With Message Passing Prior
Shaogang Ren, Hongliang Fei, Dingcheng Li et al.
Full Contextual Attention for Multi-Resolution Transformers in Semantic Segmentation
Loic Themyr, Clément Rambour, Nicolas Thome et al.
Mutual Learning for Long-Tailed Recognition
Changhwa Park, Junho Yim, Eunji Jun
Textual Alchemy: CoFormer for Scene Text Understanding
Gayatri Deshmukh, Onkar Susladkar, Dhruv Makwana et al.
TriCoLo: Trimodal Contrastive Loss for Text To Shape Retrieval
Yue Ruan, Han-Hung Lee, Yiming Zhang et al.
Text-to-Image Editing by Image Information Removal
Zhongping Zhang, Jian Zheng, Zhiyuan Fang et al.
SCoRD: Subject-Conditional Relation Detection With Text-Augmented Data
Ziyan Yang, Kushal Kafle, Zhe Lin et al.
Contextual Affinity Distillation for Image Anomaly Detection
Jie Zhang, Masanori Suganuma, Takayuki Okatani
Assessing Visually-Continuous Corruption Robustness of Neural Networks Relative to Human Performance
Huakun Shen, Boyue Hu, Krzysztof Czarnecki et al.
DocMatcher: Document Image Dewarping via Structural and Textual Line Matching
Felix Hertlein, Alexander Naumann, York Sure-Vetter
Partial Texture VAE: Color and Texture Encoder for Rock Particle Images
Tetsushi Yamada, Simone Di Santo
Localized Gaussian Splatting Editing with Contextual Awareness
Hanyuan Xiao, Yingshu Chen, Huajian Huang et al.
TRH2TQA: Table Recognition with Hierarchical Relationships to Table Question-Answering on Business Table Images
Pongsakorn Jirachanchaisiri, Nam Tuan Ly, Atsuhiro Takasu
NCAP: Scene Text Image Super-Resolution with Non-CAtegorical Prior
Dongwoo Park, Suk Pil Ko
Improving Deep Detector Robustness via Detection-Related Discriminant Maximization and Reorganization
Jung Im Choi, Qizhen Lan, Qing Tian
Multi-Class Textual-Inversion Secretly Yields a Semantic-Agnostic Classifier
Kai Wang, Fei Yang, Bogdan Raducanu et al.
STRinGS: Selective Text Refinement in Gaussian Splatting
Abhinav Raundhal, Gaurav Behera, P. J. Narayanan et al.
START: Spatial and Textual Learning for Chart Understanding
Zhuoming Liu, Xiaofeng Gao, Feiyang Niu et al.
Conditional Text-to-Image Generation with Reference Guidance
Taewook Kim, Ze Wang, Zhengyuan Yang et al.
Guided Texture Segmentation via Coordinate-Aware Class-Ratio Mapping
Bishal Ranjan Swain, Kyung Joo Cheoi, Jaepil Ko