UAI 2025

Order-Optimal Global Convergence for Actor-Critic with General Policy and Neural Critic Parametrization