COLT 2025

Instance-Dependent Regret Bounds for Learning Two-Player Zero-Sum Games with Bandit Feedback