Research Explorer
Exploration-Exploitation
154 directly classified papers
Papers per year
2006: 1
2008: 2
2009: 1
2010: 3
2011: 3
2012: 4
2013: 3
2014: 8
2015: 5
2016: 4
2017: 11
2018: 11
2019: 21
2020: 12
2021: 23
2022: 18
2023: 11
2024: 6
2025: 7
Papers
Trading Off Quality and Uncertainty Through Multi-Objective Optimisation in Batch Bayesian Optimisation
AAAI 2025
Non-stochastic Budgeted Online Pricing with Semi-Bandit Feedback
AAAI 2025
p-Mean Regret for Stochastic Bandits
AAAI 2025
𝜙-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation
ACL 2025
Batch Ensemble for Variance Dependent Regret in Stochastic Bandits
AAAI 2025
Bayesian Optimization for Unknown Cost-Varying Variable Subsets with No-Regret Costs
AAAI 2025
User Feedback Alignment for LLM-powered Exploration in Large-scale Recommendation Systems
ACL 2025
Convergence of No-Swap-Regret Dynamics in Self-Play
NIPS 2024
A Nearly Optimal and Low-Switching Algorithm for Reinforcement Learning with General Function Approximation
NIPS 2024
Improving Health Information Access in the World’s Largest Maternal Mobile Health Program via Bandit Algorithms
AAAI 2024
Using Adaptive Bandit Experiments to Increase and Investigate Engagement in Mental Health
AAAI 2024
Unveiling User Satisfaction and Creator Productivity Trade-Offs in Recommendation Platforms
NIPS 2024
On the Minimax Regret for Contextual Linear Bandits and Multi-Armed Bandits with Expert Advice
NIPS 2024
CEM: Constrained Entropy Maximization for Task-Agnostic Safe Exploration
AAAI 2023
On the Sublinear Regret of GP-UCB
NIPS 2023
Exploring the Sensitivity of LLMs’ Decision-Making Capabilities: Insights from Prompt Variations and Hyperparameters
EMNLP 2023
Stochastic Multi-armed Bandits: Optimal Trade-off among Optimality, Consistency, and Tail Risk
NIPS 2023
Efficient Explorative Key-Term Selection Strategies for Conversational Contextual Bandits
AAAI 2023
Optimistic Whittle Index Policy: Online Learning for Restless Bandits
AAAI 2023
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration
NIPS 2023
Fully Dynamic Online Selection through Online Contention Resolution Schemes
AAAI 2023
Regret Minimization via Saddle Point Optimization
NIPS 2023
Multi-Fidelity Multi-Armed Bandits Revisited
NIPS 2023
High-dimensional Contextual Bandit Problem without Sparsity
NIPS 2023
Stochastic Goal Recognition Design Problems with Suboptimal Agents
AAAI 2022