2017 IJCAI IJCAI 2017

Crowd Learning: Improving Online Decision Making Using Crowdsourced Data

Abstract

We analyze an online learning problem that arises in crowdsourcing systems for users facing crowdsourced data: a user at each discrete time step t can choose K out of a total of N options (bandits), and receives randomly generated rewards dependent on user-specific and option-specific statistics unknown to the user. Each user aims to maximize her expected total rewards over a certain time horizon through a sequence of exploration and exploitation steps. Different from the typical regret/bandit learning setting, in this case a user may also exploit crowdsourced information to augment her learning process, i.e., other users' choices or rewards from using these options. We consider two scenarios, one in which only their choices are shared, and the other in which users share full information including their choices and subsequent rewards. In both cases we derive bounds on the weak regret, the difference between the user's expected total reward and the reward from a user-specific best single-action policy; and show how they improve over their individual-learning counterpart. We also evaluate the performance of our algorithms using simulated data as well as the real-world movie ratings dataset MovieLens.

🧭 Keyword Pioneer — weak regret
🐣 Hot Topic Early Bird — multi-armed bandit
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio
🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization
📈 Trend Setter — Federated Learning

Authors