ICML 2024

Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles