Papers
DiRe: Diversity-promoting Regularization for Dataset Condensation
Saumyaranjan Mohanty, Aravind Reddy, Konda Reddy Mopuri
Beyond the Highlights: Video Retrieval with Salient and Surrounding Contexts
Jaehun Bang, Moon Ye-Bin, Tae-Hyun Oh et al.
LogicCBMs: Logic-Enhanced Concept-Based Learning
Deepika SN Vemuri, Gautham Bellamkonda, Aditya Pola et al.
3DSceneEditor: Controllable 3D Scene Editing with Gaussian Splatting
Ziyang Yan, Yihua Shao, Minwen Liao et al.
SCORE: Soft Label Compression-Centric Dataset Condensation via Coding Rate Optimization
Bowen Yuan, Yuxia Fu, Zijian Wang et al.
SceneProp: Combining Neural Network and Markov Random Field for Scene-Graph Grounding
Keita Otani, Tatsuya Harada
PointSt3R: Point Tracking through 3D Ground Correspondence
Rhodri Guerrier, Adam W. Harley, Dima Damen
FreeCond: Free Lunch in the Input Conditions of Text-Guided Inpainting
Teng-Fang Hsiao, Bo-Kai Ruan, Sung-Lin Tsai et al.
Extending Audio Context for Long-Form Understanding in Large Audio-Language Models
Yuatyong Chaichana, Pittawat Taveekitworachai, Warit Sirichotedumrong et al.
Multimodal Conversation Structure Understanding
Kent K. Chang, Mackenzie Hanh Cramer, Anna Ho et al.
Complexity-aware fine-tuning
Andrey Goncharov, Daniil Vyazhev, Petr Sychev et al.
RECAP: REwriting Conversations for Intent Understanding in Agentic Planning
Kushan Mitra, Dan Zhang, Hannah Kim et al.
Beyond Sampling: Self-Sorting for Long-Context Ranking
Juseon Do, Sungwoo Han, Jingun Kwon et al.
Too Long, Didn’t Model: Decomposing LLM Long Context Understanding With Novels
Sil Hamilton, Rebecca Hicke, Mia Ferrante et al.
Human-Centric Video Generation via Collaborative Multi-Modal Conditioning
Liyang Chen, Tianxiang Ma, Jiawei Liu et al.
Decoupling Continual Semantic Segmentation
Yifu Guo, Yuquan Lu, Wentao Zhang et al.
Understanding Dynamic Scenes in Ego Centric 4D Point Clouds
Junsheng Huang, Shengyu Hao, Bo-Cheng Hu et al.
BokehFlow: Depth-Free Controllable Bokeh Rendering via Flow Matching
Yachuan Huang, Xianrui Luo, Qiwen Wang et al.
Multiple Human Motion Understanding
Lei Li, Sen Jia, Jenq-Neng Hwang
Connecting the Dots: Training-Free Visual Grounding via Agentic Reasoning
Liqin Luo, Guangyao Chen, Xiawu Zheng et al.
Edge Consistency for 4D Gaussian Splatting in Dynamic Scene Rendering
Boya Shi, Thomas N Guan, Yi Xiaodong
FRBAT: Conditionally-Visible Physical Backdoor Attack via Fluorescence
Yalun Wu, Liu Liu, Endong Tong et al.
OmniVDiff: Omni Controllable Video Diffusion for Generation and Understanding
Dianbing Xi, Jiepeng Wang, Yuanzhi Liang et al.
CAG-GS: Consistent Anchor Guided Gaussian Splatting for Large-scale Scene Rendering
Shijie Xu, Qiulei Dong