Department of Computer Science and Automation
Speaker : Mr. Karthik Abinav Sankararaman
Department of Computer Science
University of Maryland
Title : Adversarial Bandits with Knapsacks
Date : Friday, June 07, 2019
Time : 4:00 PM
Venue : CSA Seminar Hall (Room No. 254, First Floor)
“Bandits with Knapsacks” (BwK) is a general model for multi-armed bandits under supply/budget constraints, with motivating examples ranging from dynamic pricing to repeated auctions to dynamic ad allocation to network routing and scheduling. While the prior work on BwK focused on the stochastic version, we pioneer the other extreme in which the outcomes can be chosen adversarially. The objective is to minimize the “competitive ratio”: the ratio of the benchmark reward to algorithm’s reward, as regret minimization is no longer feasible.
We design an algorithm with competitive ratio O(log T) relative to the best fixed distribution over actions, where T is the time horizon; we also prove a matching lower bound. The key conceptual contribution is a new algorithm for the stochastic version of the problem, which builds on the framework of regret minimization in repeated games and admits a substantially simpler analysis compared to prior work. We then analyze this algorithm for the adversarial version, and use it as a subroutine to solve the latter. This approach also leads to several extensions, both for stochastic and adversarial versions such as contextual BwK and Constrained Bandit Convex Optimization.
Based on this (https://arxiv.org/pdf/1811.11881.pdf) joint work with Nicole Immorlica, Rob Schapire and Alex Slivkins.
Host Faculty : Dr. Anand Louis