Exploration-Exploitation

154 directly classified papers

Papers

Trading Off Quality and Uncertainty Through Multi-Objective Optimisation in Batch Bayesian Optimisation AAAI 2025

Non-stochastic Budgeted Online Pricing with Semi-Bandit Feedback AAAI 2025

p-Mean Regret for Stochastic Bandits AAAI 2025

𝜙-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation ACL 2025

Batch Ensemble for Variance Dependent Regret in Stochastic Bandits AAAI 2025

Bayesian Optimization for Unknown Cost-Varying Variable Subsets with No-Regret Costs AAAI 2025

User Feedback Alignment for LLM-powered Exploration in Large-scale Recommendation Systems ACL 2025

Convergence of No-Swap-Regret Dynamics in Self-Play NIPS 2024

A Nearly Optimal and Low-Switching Algorithm for Reinforcement Learning with General Function Approximation NIPS 2024

Improving Health Information Access in the World’s Largest Maternal Mobile Health Program via Bandit Algorithms AAAI 2024

Using Adaptive Bandit Experiments to Increase and Investigate Engagement in Mental Health AAAI 2024

Unveiling User Satisfaction and Creator Productivity Trade-Offs in Recommendation Platforms NIPS 2024

On the Minimax Regret for Contextual Linear Bandits and Multi-Armed Bandits with Expert Advice NIPS 2024

CEM: Constrained Entropy Maximization for Task-Agnostic Safe Exploration AAAI 2023

On the Sublinear Regret of GP-UCB NIPS 2023

Exploring the Sensitivity of LLMs’ Decision-Making Capabilities: Insights from Prompt Variations and Hyperparameters EMNLP 2023

Stochastic Multi-armed Bandits: Optimal Trade-off among Optimality, Consistency, and Tail Risk NIPS 2023

Efficient Explorative Key-Term Selection Strategies for Conversational Contextual Bandits AAAI 2023

Optimistic Whittle Index Policy: Online Learning for Restless Bandits AAAI 2023

Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration NIPS 2023

Fully Dynamic Online Selection through Online Contention Resolution Schemes AAAI 2023

Regret Minimization via Saddle Point Optimization NIPS 2023

Multi-Fidelity Multi-Armed Bandits Revisited NIPS 2023

High-dimensional Contextual Bandit Problem without Sparsity NIPS 2023

Stochastic Goal Recognition Design Problems with Suboptimal Agents AAAI 2022