2024
AAAI
AAAI 2024
A PAC Learning Algorithm for LTL and Omega-Regular Objectives in MDPs
Abstract
Abstract Linear temporal logic (LTL) and omega-regular objectives---a superset of LTL---have seen recent use as a way to express non-Markovian objectives in reinforcement learning. We introduce a model-based probably approximately correct (PAC) learning algorithm for omega-regular objectives in Markov decision processes (MDPs). As part of the development of our algorithm, we introduce the epsilon-recurrence time: a measure of the speed at which a policy converges to the satisfaction of the omega-regular objective in the limit. We prove that our algorithm only requires a polynomial number of samples in the relevant parameters, and perform experiments which confirm our theory.
🌉
Interdisciplinary Bridge
— Artificial Intelligence and Knowledge & Reasoning and Machine Learning and Mathematics & Optimization and Reinforcement Learning
🧭
Keyword Pioneer
— epsilon-recurrence time
🐝
Cross-Pollinator
— Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy
Authors
Topics
Machine Learning > Optimization & Theory > Learning Theory
Reinforcement Learning > Methods > Policy Learning
Knowledge & Reasoning > Reasoning > Automated Planning
Machine Learning > Learning Types > Reinforcement Learning
Artificial Intelligence > Core AI > Reasoning
Mathematics & Optimization > Optimization > Game Theory