Papers
Efficient LLM-Jailbreaking via Multimodal-LLM Jailbreak
Haoxuan Ji, Zheng Lin, Zhenxing Niu et al.
The Other Mind: How Language Models Exhibit Human Temporal Cognition
Lingyu Li, Yang Yao, Yixu Wang et al.
GeoShield: Safeguarding Geolocation Privacy from Vision-Language Models via Adversarial Perturbations
Xinwei Liu, Xiaojun Jia, Yuan Xun et al.
BLM-Guard: Explainable Multimodal Ad Moderation with Chain-of-Thought and Policy-Aligned Rewards
Yiran Yang, Zhaowei Liu, Yuan Yuan et al.
SafeR-CLIP: Mitigating NSFW Content in Vision-Language Models While Preserving Pre-Trained Knowledge
Adeel Yousaf, Joseph Fioresi, James Beetham et al.
First-Order Representation Languages for Goal-Conditioned RL
Simon Ståhlberg, Hector Geffner
Targeting in Multi-Criteria Decision Making
Nicolas Schwind, Patricia Everaere, Sébastien Konieczny et al.
MegaCoin: Enhancing Medium-Grained Color Perception for Vision-Language Models
Ming-Chang Chiu, Shicheng Wen, Pin-Yu Chen et al.
AMaPO: Adaptive Margin-attached Preference Optimization for Language Model Alignment
Ruibo Deng, Duanyu Feng, Wenqiang Lei
LieCraft: A Multi-Agent Framework for Evaluating Deceptive Capabilities in Language Models
Matthew Lyle Olson, Neale Ratzlaff, Musashi Hinck et al.
Backdoor Attacks on Open Vocabulary Object Detectors via Multi-Modal Prompt Tuning
Ankita Raj, Chetan Arora
Chain-of-Thought Driven Adversarial Scenario Extrapolation for Robust Language Models
Md Rafi Ur Rashid, Vishnu Asutosh Dasu, Ye Wang et al.
Polarity-Aware Probing for Quantifying Latent Alignment in Language Models
Sabrina Sadiekh, Elena Ericheva, Chirag Agarwal
EASE: Practical and Efficient Safety Alignment for Small Language Models
Haonan Shi, Guoli Wang, Tu Ouyang et al.
Beyond Verdicts: Evaluating Language Model Moral Competence
Aaron J Snoswell, Daniel Kilov, Seth Lazar
Benchmarking Trustworthiness in Multimodal LLMs for Video Understanding
Youze Wang, Zijun Chen, Ruoyu Chen et al.
Safe Multi-agent Reinforcement Learning with Natural Language Constraints
Ziyan Wang, Meng Fang, Tristan Tomilin et al.
MCA-Bench: A Multimodal Benchmark for Evaluating CAPTCHA Robustness Against VLM-based Attacks
Zonglin Wu, Yule Xue, Yaoyao Feng et al.
Multi-Faceted Attack: Exposing Cross-Model Vulnerabilities in Defense-Equipped Vision-Language Models
Yijun Yang, Lichao Wang, Jianping Zhang et al.
Can LLMs Detect Their Confabulations? Estimating Reliability in Uncertainty-Aware Language Models
Tianyi Zhou, Johanne Medina, Sanjay Chawla
On the Feasibility of Using MultiModal LLMs to Execute AR Social Engineering Attacks
Ting Bi, Chenghang Ye, Zheyu Yang et al.
SatSolarCast: A Flexible Framework for Multimodal Solar Irradiance Forecasting via Memory-Alignment Learning
Kuai Dai, Hui Su, Chengxing Zhai et al.
Crossing Borders: A Multimodal Challenge for Indian Poetry Translation and Image Generation
Sofia Jamil, Kotla Sai Charan, Sriparna Saha et al.
Language Models and Logic Programs for Trustworthy Tax Reasoning
William Jurayj, Nils Holzenberger, Benjamin Van Durme
TRACE: Textual Relevance Augmentation and Contextual Encoding for Multimodal Hate Detection
Girish A. Koushik, Helen Treharne, Aditya Joshi et al.