Bandit Algorithms for Prophet Inequality and Pandora's Box

ACO Student Seminar
Friday, September 2, 2022 - 1:00pm for 1 hour (actually 50 minutes)
Skiles 368
Yifan Wang – Georgia Tech CS – ywang3782@gatech.edu
Abhishek Dhawan

The Prophet Inequality and Pandora's Box problems are fundamental stochastic problems. A usual assumption for both problems is that the probability distributions of the n underlying random variables are given as input to the algorithm. In this talk, we assume the distributions are unknown, and study them in the Multi-Armed Bandits model: We interact with the unknown distributions over T rounds. In each round we play a policy and receive only bandit feedback. The goal is to minimize the regret, which is the difference in the total value of the optimal algorithm that knows the distributions vs. the total value of our algorithm that learns the distributions from the bandit feedback. Our main results give near-optimal  O(poly(n)sqrt{T}) total regret algorithms for both Prophet Inequality and Pandora's Box.