Generative Flow Networks as Entropy-Regularized RL

Daniil Tiapkin; Nikita Morozov; Alexey Naumov; Dmitry P Vetrov

AISTATS 2024

Abstract

The recently proposed generative flow networks (GFlowNets) are a method of training a policy to sample compositional discrete objects with probabilities proportional to a given reward via a sequence of actions. GFlowNets exploit the sequential nature of the problem, drawing parallels with reinforcement learning (RL). Our work extends the connection between RL and GFlowNets to a general case. We demonstrate how the task of learning a generative flow network can be efficiently redefined as an entropy-regularized RL problem with a specific reward and regularizer structure. Furthermore, we illustrate the practical efficiency of this reformulation by applying standard soft RL algorithms to GFlowNet training across several probabilistic modeling tasks. Contrary to previously reported results, we show that entropic RL approaches can be competitive against established GFlowNet training methods. This perspective opens a direct path for integrating RL principles into the realm of generative flow networks.
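The abstract does not spell out the "specific reward and regularizer structure". The following is a minimal sketch of one way the reformulation can be made concrete, under stated assumptions rather than as the authors' implementation: each intermediate transition s -> s' is rewarded with log P_B(s | s') for a fixed backward policy P_B (uniform over parents here), reaching a terminal object x is rewarded with log R(x), and the entropy coefficient is 1. The toy graph, the state names, and the helper functions `log_pb`, `soft_value`, `soft_policy`, and `terminal_distribution` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy DAG: s0 is the initial state, x1..x3 are terminal objects.
# x2 is reachable through both a and b, so it has two parents.
children = {"s0": ["a", "b"], "a": ["x1", "x2"], "b": ["x2", "x3"]}
reward = {"x1": 1.0, "x2": 2.0, "x3": 3.0}  # unnormalized GFlowNet reward R(x)
order = ["s0", "a", "b"]  # non-terminal states in topological order

# Invert the edge list to find the parents of every state.
parents = {}
for state, kids in children.items():
    for kid in kids:
        parents.setdefault(kid, []).append(state)


def log_pb(parent, child):
    """Assumed backward policy: uniform over the parents of `child`."""
    return -math.log(len(parents[child]))


def soft_value(state, cache):
    """Soft (log-sum-exp) Bellman backup with entropy coefficient 1."""
    if state in reward:  # terminal object: the only remaining reward is log R(x)
        return math.log(reward[state])
    if state not in cache:
        cache[state] = math.log(
            sum(math.exp(log_pb(state, c) + soft_value(c, cache))
                for c in children[state])
        )
    return cache[state]


def soft_policy(state, cache):
    """Softmax policy that is optimal for the entropy-regularized problem."""
    v = soft_value(state, cache)
    return {c: math.exp(log_pb(state, c) + soft_value(c, cache) - v)
            for c in children[state]}


def terminal_distribution():
    """Push probability mass from s0 through the soft-optimal policy."""
    cache = {}
    reach = {"s0": 1.0}
    for state in order:
        for child, prob in soft_policy(state, cache).items():
            reach[child] = reach.get(child, 0.0) + reach[state] * prob
    return {x: reach[x] for x in reward}


if __name__ == "__main__":
    print(terminal_distribution())
```

On this toy graph the soft-optimal policy reaches x1, x2, and x3 with probabilities of approximately 1/6, 1/3, and 1/2, that is, proportional to the rewards 1, 2, and 3, and the soft value at s0 equals log 6, the log of the normalizing constant. Exact dynamic programming is used only because the graph is tiny; the soft RL algorithms mentioned in the abstract would approximate the same values with learned function approximators.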

Topics

Artificial Intelligence > Bayesian & Probabilistic > Probabilistic Modeling
Machine Learning > Learning Types > Self-Supervised Learning
Deep Learning > Models > Generative Models
Reinforcement Learning > Methods > Policy Learning
Machine Learning > Learning Types > Reinforcement Learning
Deep Learning > Learning Types > Reinforcement Learning

Keywords

probabilistic modeling; policy learning; reward function; entropy regularization; generative flow network; soft reinforcement learning; compositional sampling; reward structure; entropy-regularized reinforcement learning
