Warm-Starting Nested Rollout Policy Adaptation with Optimal Stopping

Chen Dang; Cristina Bazgan; Tristan Cazenave; Morgan Chopin; Pierre-Henri Wuillemin

AAAI 2023

Abstract

Nested Rollout Policy Adaptation (NRPA) is an approach that uses online learning policies in a nested structure. It has achieved strong results on a variety of difficult combinatorial optimization problems. In this paper, we propose Meta-NRPA, which combines optimal stopping theory with NRPA for warm-starting and significantly improves the performance of NRPA. We also present several exploratory techniques that enable NRPA to explore better. We establish this on three notoriously difficult problems from telecommunications, transportation, and coding theory, namely Minimum Congestion Shortest Path Routing, the Traveling Salesman Problem with Time Windows, and Snake-in-the-Box. We also improve the lower bounds of the Snake-in-the-Box problem for multiple dimensions.
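To make the "nested structure" of online policy learning concrete, here is a minimal sketch of the standard NRPA recursion on a toy sequence-matching problem. It is not the Meta-NRPA method proposed in the paper, and it includes no warm-starting or optimal stopping. The toy objective, the `(step, move)` policy keys, the learning rate `alpha`, and all function names are assumptions chosen for illustration.

```python
import math
import random

def playout(policy, n_steps, n_moves, rng):
    """Level 0: sample one full sequence with softmax probabilities over policy weights."""
    seq = []
    for step in range(n_steps):
        weights = [math.exp(policy.get((step, m), 0.0)) for m in range(n_moves)]
        seq.append(rng.choices(range(n_moves), weights=weights)[0])
    return seq

def adapt(policy, seq, n_moves, alpha=1.0):
    """Return a copy of the policy shifted toward the moves of `seq` (gradient step)."""
    new = dict(policy)
    for step, chosen in enumerate(seq):
        weights = [math.exp(policy.get((step, m), 0.0)) for m in range(n_moves)]
        total = sum(weights)
        for m in range(n_moves):
            # Decrease every move in proportion to its current probability...
            new[(step, m)] = new.get((step, m), 0.0) - alpha * weights[m] / total
        # ...and increase the move actually played in the best sequence.
        new[(step, chosen)] += alpha
    return new

def nrpa(level, policy, score, n_steps, n_moves, iterations, rng):
    """Each level runs `iterations` searches one level down, adapting its own policy copy."""
    if level == 0:
        seq = playout(policy, n_steps, n_moves, rng)
        return score(seq), seq
    best_score, best_seq = -math.inf, None
    for _ in range(iterations):
        s, seq = nrpa(level - 1, policy, score, n_steps, n_moves, iterations, rng)
        if s >= best_score:
            best_score, best_seq = s, seq
        # Rebinding keeps the caller's policy untouched, so each level learns locally.
        policy = adapt(policy, best_seq, n_moves)
    return best_score, best_seq

# Hypothetical toy objective: count positions that match a hidden target sequence.
TARGET = [2, 0, 1, 1, 0, 2, 1, 0]

def toy_score(seq):
    return sum(1 for a, b in zip(seq, TARGET) if a == b)

best, best_seq = nrpa(2, {}, toy_score, len(TARGET), 3, 20, random.Random(0))
```

In this sketch the policy is keyed by step index only; applications such as the routing and tour problems named above would typically key the weights by a problem-specific encoding of the move and its context.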

Topics

Artificial Intelligence > Core AI > Agent Systems
Reinforcement Learning > Methods > Policy Learning
Mathematics & Optimization > Optimization > Combinatorial Optimization

Keywords

combinatorial optimization; online learning; traveling salesman problem; optimal stopping; nested rollout policy adaptation; online learning policy
