Papers
36,276 papers found
Abstractive Summarization of Bengali Academic Videos Based on Audio Subtitles
Lamisa Bintee Mizan Deya, Farhatun Shama, Abdul Aziz et al.
Common Sense or Ableism? Rethinking Commonsense Reasoning Through the Lens of Disability
Karina H Halevy, Kimi Wenzel, Seyun Kim et al.
Dialogue is Better Than Monologue: Instructing Medical LLMs via Strategic Conversations
Zijie Liu, Xinyu Zhao, Jie Peng et al.
GRAFF: GRaph-Augmented Fine-grained Fusion for Large Language Models
Himanshu Chaudhary, Ruida Wang, Gowtham Ramesh et al.
KNN-SSD: Enabling Dynamic Self-Speculative Decoding via Nearest Neighbor Layer Set Optimization
Mingbo Song, Heming Xia, Jun Zhang et al.
Learning to Ask: Multi-Decoder Fine-Tuning for Multi-Hop Visual Question Generation with External Knowledge
Arpan Phukan, Manish Gupta, Asif Ekbal
Persona Jailbreaking in Large Language Models
Jivnesh Sandhan, Fei Cheng, Tushar Sandhan et al.
SCATR: Mitigating New Instance Suppression in LiDAR-based Tracking-by-Attention via Second Chance Assignment and Track Query Dropout
Brian Cheong, Letian Wang, Sandro Papais et al.
2026
WACV
SD-E2: Semantic Exploration for Reasoning Under Token Budgets
Kshitij Mishra, Nils Lukas, Salem Lahlou
$∞$-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation
Saul Santos, António Farinhas, Daniel C Mcnamee et al.
$Door(s)$: Junction State Estimation for Efficient Exploration in Reinforcement Learning
Benjamin Fele, Jan Babic
$f$-PO: Generalizing Preference Optimization with $f$-divergence Minimization
Jiaqi Han, Mingjian Jiang, Yuxuan Song et al.
$K^2$VAE: A Koopman-Kalman Enhanced Variational AutoEncoder for Probabilistic Time Series Forecasting
Xingjian Wu, Xiangfei Qiu, Hongfan Gao et al.
$\mathcal{V}ista\mathcal{DPO}$: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models
Haojian Huang, Haodong Chen, Shengqiong Wu et al.
$\mathrm{μ}$nit Scaling: Simple and Scalable FP8 LLM Training
Saaketh Narayan, Abhay Gupta, Mansheej Paul et al.
$\pi_{0.5}$: a Vision-Language-Action Model with Open-World Generalization
Kevin Black, Noah Brown, James Darpinian et al.
$S^2$FGL: Spatial Spectral Federated Graph Learning
Zihan Tan, Suyuan Huang, Guancheng Wan et al.
$\texttt{I}^2\texttt{MoE}$: Interpretable Multimodal Interaction-aware Mixture-of-Experts
Jiayi Xin, Sukwon Yun, Jie Peng et al.
$\texttt{SPIN}$: distilling $\texttt{Skill-RRT}$ for long-horizon prehensile and non-prehensile manipulation
Haewon Jung, Donguk Lee, Haecheol Park et al.
$α$-RACER: Real-Time Algorithm for Game-Theoretic Motion Planning and Control in Autonomous Racing using Near-Potential Function
Dvij Kalaria, Chinmay Maheshwari, Shankar Sastry
$β$-th order Acyclicity Derivatives for DAG Learning
Madhumitha Shridharan, Garud Iyengar
$σ$-Maximal Ancestral Graphs
Binghua Yao, Joris Marten Mooij
3D Acetabular Surface Reconstruction from 2D Pre-operative X-ray Images using SRVF Elastic Registration and Deformation Graph
Shuai Zhang, Jinliang Wang, Xu Wang et al.
3D Dynamic Prediction of Missing Teeth in Diverse Patterns via Centroid-prompted Diffusion Model
Zongrui Ji, Na Li, Peng Xue et al.