ICML 2025

Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning