Papers
8,506 papers found
AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation
Moayed Haji-Ali, Willi Menapace, Aliaksandr Siarohin et al.
AVTrustBench: Assessing and Enhancing Reliability and Robustness in Audio-Visual LLMs
Sanjoy Chowdhury, Sayan Nag, Subhrajyoti Dasgupta et al.
Axis-level Symmetry Detection with Group-Equivariant Representation
Wongyun Yu, Ahyun Seo, Minsu Cho
BabyVLM: Data-Efficient Pretraining of VLMs Inspired by Infant Learning
Shengao Wang, Arjun Chandra, Aoming Liu et al.
Backdoor Attacks on Neural Networks via One-Bit Flip
Xiang Li, Lannan Luo, Qiang Zeng
Backdoor Defense via Enhanced Splitting and Trap Isolation
Hongrui Yu, Lu Qi, Wanyu Lin et al.
Backdooring Self-Supervised Contrastive Learning by Noisy Alignment
Tuo Chen, Jie Gui, Minjing Dong et al.
Backdoor Mitigation by Distance-Driven Detoxification
Shaokui Wei, Jiayin Liu, Hongyuan Zha
Background Invariance Testing According to Semantic Proximity
Zukang Liao, Min Chen
Back on Track: Bundle Adjustment for Dynamic Scene Reconstruction
Weirong Chen, Ganlin Zhang, Felix Wimbauer et al.
BadVideo: Stealthy Backdoor Attack against Text-to-Video Generation
Ruotong Wang, Mingli Zhu, Jiarong Ou et al.
Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation and Reconstruction
Yuanhao Cai, He Zhang, Kai Zhang et al.
Balanced Image Stylization with Style Matching Score
Yuxin Jiang, Liming Jiang, Shuai Yang et al.
Balanced Sharpness-Aware Minimization for Imbalanced Regression
Yahao Liu, Qin Wang, Lixin Duan et al.
Balancing Conservatism and Aggressiveness: Prototype-Affinity Hybrid Network for Few-Shot Segmentation
Tianyu Zou, Shengwu Xiong, Ruilin Yao et al.
Balancing Task-invariant Interaction and Task-specific Adaptation for Unified Image Fusion
Xingyu Hu, Junjun Jiang, Chenyang Wang et al.
BANet: Bilateral Aggregation Network for Mobile Stereo Matching
Gangwei Xu, Jiaxin Liu, Xianqi Wang et al.
BASIC: Boosting Visual Alignment with Intrinsic Refined Embeddings in Multimodal Large Language Models
Jianting Tang, Yubo Wang, Haoyu Cao et al.
BATCLIP: Bimodal Online Test-Time Adaptation for CLIP
Sarthak Maharana, Baoming Zhang, Leonid Karlinsky et al.
Bayesian-Inspired Space-Time Superpixels
Kent Gauen, Stanley Chan
Benchmarking and Learning Multi-Dimensional Quality Evaluator for Text-to-3D Generation
Yujie Zhang, Bingyang Cui, Qi Yang et al.
Benchmarking Burst Super-Resolution for Polarization Images: Noise Dataset and Analysis
Inseung Hwang, Kiseok Choi, Hyunho Ha et al.
Benchmarking Egocentric Visual-Inertial SLAM at City Scale
Anusha Krishnan, Shaohui Liu, Paul-Edouard Sarlin et al.
Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program
Minghe Gao, Xuqi Liu, Zhongqi Yue et al.
Benchmarking Multimodal Large Language Models Against Image Corruptions
Xinkuan Qiu, Meina Kan, Yongbin Zhou et al.