Align-Pro: A Principled Approach to Prompt Optimization for LLM Alignment

Prashant Trivedi; Souradip Chakraborty; Avinash Reddy; Vaneet Aggarwal; Amrit Singh Bedi; George K. Atia

AAAI 2025

Abstract

The alignment of large language models (LLMs) with human values is critical as these models become increasingly integrated into societal and decision-making processes. Traditional methods, such as reinforcement learning from human feedback (RLHF), achieve alignment by fine-tuning model parameters, but these approaches are often computationally expensive and impractical when models are frozen or inaccessible for parameter modification. Prompt optimization, in contrast, is a viable alternative to RLHF for LLM alignment. While the existing literature has shown the empirical promise of prompt optimization, its theoretical underpinnings remain under-explored. We address this gap by formulating prompt optimization as an optimization problem and providing theoretical insights into the optimality of this framework. To analyze the performance of prompt optimization, we study theoretical suboptimality bounds and show how that performance depends on the given prompter and target model. We also provide empirical validation through experiments on various datasets, demonstrating that prompt optimization can effectively align LLMs even when parameter fine-tuning is not feasible.

Topics

Artificial Intelligence > Core AI > AI Safety
Machine Learning > Optimization & Theory > Theory
Natural Language Processing > Generation > Language Modeling
Artificial Intelligence > Core AI > Large Language Models
Machine Learning > Learning Types > Prompt Engineering

Keywords

reinforcement learning from human feedback; theoretical bound; prompt optimization; large language model alignment; llm alignment; parameter-efficient alignment; theoretical suboptimality bound
