Papers
Mitigating Self-Preference by Authorship Obfuscation
Taslim Mahbub, Shi Feng
SharedRep-RLHF: A Shared Representation Approach to RLHF with Diverse Preferences
Arpan Mukherjee, Marcello Bullo, Deniz Gündüz
CTPD: Cross Tokenizer Preference Distillation
Truong Nguyen, Phi Van Dat, Ngan Nguyen et al.
Realist and Pluralist Conceptions of Intelligence and Their Implications on AI Research
Ninell Oldenburg, Ruchira Dhar, Anders Søgaard
SMPRO: Self-Supervised Visual Preference Alignment via Differentiable Multi-Preference Multi-Group Ranking
Sirnam Swetha, Rui Meng, Shwetha Ram et al.
When Human Preferences Flip: An Instance-Dependent Robust Loss for RLHF
Yifan Xu, Xichen Ye, Yifan Chen et al.
GEM: Generative Entropy-Guided Preference Modeling for Few-Shot Alignment of LLMs
Yiyang Zhao, Huiyu Bai, Xuejiao Zhao
Mapping on a Budget: Optimizing Spatial Data Collection for ML
Livia Betti, Farooq Sanni, Gnouyaro Z. Sogoyou et al.
Measuring Model Performance in the Presence of an Intervention
Winston Chen, Michael W. Sjoding, Jenna Wiens
Leveraging Sparse Observations to Predict Species Abundance Across Space and Time
Md Zahidul Islam, Cameron S. Fletcher, Ke Sun et al.
Optimizing Ride-Pooling Operations with Extended Pickup and Drop-Off Flexibility
Hao Jiang, Yixin Xu, Pradeep Varakantham
Preference Robustness for DPO with Applications to Public Health
Cheol Woo Kim, Shresth Verma, Mauricio Tec et al.
CyPortQA: Benchmarking Multimodal Large Language Models for Cyclone Preparedness in Port Operation
Chenchen Kuai, Chenhao Wu, Yang Zhou et al.
AlignSurvey: A Comprehensive Benchmark for Human Preferences Alignment in Social Surveys
Chenxi Lin, Weikang Yuan, Zhuoren Jiang et al.
Optimizing Urban Service Allocation with Time-Constrained Restless Bandits
Yi Mao, Andrew Perrault
Aligning Generative Music AI with Human Preferences: Methods and Challenges
Dorien Herremans, Abhinaba Roy
All-Purpose Mean Estimation over R
Jasper C.H. Lee
A Simple Proof-Theoretic Characterization of Stable Models: Reduction to Difference Logic and Experiments (Abstract Reprint)
Martin Gebser, Enrico Giunchiglia, Marco Maratea et al.
HyperTensioN and Total-Order Forward Decomposition Optimizations (Abstract Reprint)
Maurício Cecílio Magnaguagno, Felipe Meneguzzi, Lavindra de Silva
SICNav: Safe and Interactive Crowd Navigation Using Model Predictive Control and Bilevel Optimization (Abstract Reprint)
Sepehr Samavi, James R. Han, Florian Shkurti et al.
Optimizing Preferential Rate in Retail Lending with Causal Inference and Domain Adaptation
Jimyung Choi, Yujin Lee, Hyeryeong Oh et al.
PRECISE: Reducing the Bias of LLM Evaluations Using Prediction-Powered Ranking Estimation
Abhishek Divekar, Anirban Majumder