AISTATS 2025

Bilevel Reinforcement Learning via the Development of Hyper-gradient without Lower-Level Convexity