PRIORITY2REWARD: Incorporating Healthworker Preferences for Resource Allocation Planning

Shresth Verma; Alayna Nguyen; Niclas Boehmer; Lingkai Kong; Milind Tambe

AAAI 2025

Abstract

In this paper, we present PRIORITY2REWARD, a Large Language Model (LLM)-based application that incorporates health worker preferences into resource allocation planning for public health programs. LLMs are increasingly used to design reward functions from human preferences in Reinforcement Learning problems. We focus on LLM-designed rewards for Restless Multi-Armed Bandits, a framework for allocating limited resources among agents. In the context of public health, our approach empowers grassroots health workers to tailor automated allocation decisions to community needs. We showcase a simulated application of PRIORITY2REWARD for a large-scale mobile health program in India. The tool allows health workers to enter preferences in natural language and leverages LLMs to search for reward functions aligned with those preferences. It then dynamically shows how the LLM-generated reward function changes policy outcomes across demographic groups in the population. This can help inform policy implementation at a community level.
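
To make the idea of a preference-shaped reward concrete, the following is a minimal sketch and not the PRIORITY2REWARD implementation. It assumes a hypothetical `Arm` record with made-up engagement probabilities, a hand-written candidate reward standing in for what an LLM might propose for the preference "prioritize older beneficiaries", and a myopic greedy allocation rule in place of a full Restless Multi-Armed Bandit planner. All names, thresholds, and numbers are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Arm:
    """Hypothetical beneficiary; fields are illustrative, not from the paper."""
    age: int
    p_no_call: float  # assumed chance of being engaged next step without a call
    p_call: float     # assumed chance of being engaged next step with a call


# A reward function maps (arm, engaged?) to a scalar.
Reward = Callable[[Arm, bool], float]


def default_reward(arm: Arm, engaged: bool) -> float:
    """Baseline: every engaged beneficiary counts equally."""
    return 1.0 if engaged else 0.0


def older_priority_reward(arm: Arm, engaged: bool) -> float:
    """Hand-written stand-in for an LLM-proposed reward (assumed threshold: 30)."""
    weight = 2.0 if arm.age >= 30 else 1.0
    return weight if engaged else 0.0


def allocate(arms: List[Arm], reward: Reward, budget: int) -> List[int]:
    """Myopic greedy heuristic: call the arms with the largest one-step
    expected reward gain. This is a simplification, not a Whittle index policy."""
    def gain(i: int) -> float:
        arm = arms[i]
        lift = arm.p_call - arm.p_no_call
        return lift * (reward(arm, True) - reward(arm, False))

    ranked = sorted(range(len(arms)), key=gain, reverse=True)
    return sorted(ranked[:budget])


def share_by_group(arms: List[Arm], selected: List[int]) -> Dict[str, int]:
    """Count how many selected arms fall in each (assumed) age group."""
    counts = {"under_30": 0, "30_plus": 0}
    for i in selected:
        key = "30_plus" if arms[i].age >= 30 else "under_30"
        counts[key] += 1
    return counts


arms = [
    Arm(age=22, p_no_call=0.30, p_call=0.80),
    Arm(age=25, p_no_call=0.35, p_call=0.80),
    Arm(age=35, p_no_call=0.40, p_call=0.80),
    Arm(age=40, p_no_call=0.60, p_call=0.90),
]

default_pick = allocate(arms, default_reward, budget=2)
older_pick = allocate(arms, older_priority_reward, budget=2)

print("default reward:", default_pick, share_by_group(arms, default_pick))
print("older-priority reward:", older_pick, share_by_group(arms, older_pick))
```

In this toy setup, swapping the reward function shifts the two available calls from the younger pair to the older pair, which is the kind of group-level change in outcomes the abstract says the tool surfaces to health workers.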

Topics

Artificial Intelligence > Core AI > Multi-Agent Systems
Artificial Intelligence > Learning Paradigms > Transfer Learning
Machine Learning > Core Methods > Classification
Artificial Intelligence > Core AI > Large Language Models
Machine Learning > Learning Types > Multi-Armed Bandits
Healthcare & Medicine > Clinical > Medical AI

Keywords

resource allocation, reward function, preference modeling, multi-armed bandit, restless multi-armed bandit, public health, large language model
