Papers
Sequential Transformer for End-to-End Video Text Detection
Jun-Bo Zhang, Meng-Biao Zhao, Fei Yin et al.
Dynamic Token-Pass Transformers for Semantic Segmentation
Yuang Liu, Qiang Zhou, Jing Wang et al.
Disentangled Pre-Training for Image Matting
Yanda Li, Zilong Huang, Gang Yu et al.
Torque Based Structured Pruning for Deep Neural Network
Arshita Gupta, Tien Bau, Joonsoo Kim et al.
GRIT: GAN Residuals for Paired Image-to-Image Translation
Saksham Suri, Moustafa Meshry, Larry S. Davis et al.
Learning To Generate Training Datasets for Robust Semantic Segmentation
Marwane Hariat, Olivier Laurent, Rémi Kazmierczak et al.
BroadTrack: Broadcast Camera Tracking for Soccer
Floriane Magera, Thomas Hoyoux, Olivier Barnich et al.
Uncertainty-Aware Regularization for Image-to-Image Translation
Anuja Vats, Ivar Farup, Marius Pedersen et al.
SegBuilder: A Semi-Automatic Annotation Tool for Segmentation
Md Alimoor Reza, Eric Manley, Sean Chen et al.
VideoGameBunny: Towards Vision Assistants for Video Games
Mohammad Reza Taesiri, Cor-Paul Bezemer
HeightMapNet: Explicit Height Modeling for End-to-End HD Map Learning
Wenzhao Qiu, Shanmin Pang, Hao Zhang et al.
Patch Ranking: Token Pruning as Ranking Prediction for Efficient CLIP
Cheng-En Wu, Jinhong Lin, Yu Hen Hu et al.
Towards a Training Free Approach for 3D Scene Editing
Vivek Madhavaram, Shivangana Rawat, Chaitanya Devaguptapu et al.
EgoPoints: Advancing Point Tracking for Egocentric Videos
Ahmad Darkhalil, Rhodri Guerrier, Adam W. Harley et al.
Non-Cross Diffusion for Semantic Consistency
Ziyang Zheng, Ruiyuan Gao, Qiang Xu
MLLM-Tool: A Multimodal Large Language Model for Tool Agent Learning
Chenyu Wang, Weixin Luo, Sixun Dong et al.
Semantic Prompting with Image Token for Continual Learning
Jisu Han, Jaemin Na, Wonjun Hwang
MAISI: Medical AI for Synthetic Imaging
Pengfei Guo, Can Zhao, Dong Yang et al.
ATM: Enhanced Alignment for Text-to-Motion Generation
Ke Han, Yueming Lyu, Weichen Yu et al.
Lorentz Entailment Cone for Semantic Segmentation
Zahid Hasan, Masud Ahmed, Nirmalya Roy
S2O: Static to Openable Enhancement for Articulated 3D Objects
Denys Iliash, Hanxiao Jiang, Yiming Zhang et al.
Diffusion Noise Optimization for Synthetic VLM Training
Ren Ohkubo, Rintaro Yanagi, Hirokatsu Kataoka et al.
TDFlow: Agentic Workflows for Test Driven Development
Kevin Han, Siddharth Maddikayala, Tim Knappe et al.
Live API-Bench: 2500+ Live APIs for Testing Multi-Step Tool Calling
Benjamin Elder, Anupama Murthi, Jungkoo Kang et al.