Bandit Algorithms for Prophet Inequality and Pandora's Box

Series
ACO Student Seminar
Time
Friday, September 2, 2022 - 1:00pm for 1 hour (actually 50 minutes)
Location
Skiles 368
Speaker
Yifan Wang – Georgia Tech CS – ywang3782@gatech.eduhttps://litcwyf.github.io/
Organizer
Abhishek Dhawan

The Prophet Inequality and Pandora's Box problems are fundamental stochastic problems. A usual assumption for both problems is that the probability distributions of the n underlying random variables are given as input to the algorithm. In this talk, we assume the distributions are unknown, and study them in the Multi-Armed Bandits model: We interact with the unknown distributions over T rounds. In each round we play a policy and receive only bandit feedback. The goal is to minimize the regret, which is the difference in the total value of the optimal algorithm that knows the distributions vs. the total value of our algorithm that learns the distributions from the bandit feedback. Our main results give near-optimal  O(poly(n)sqrt{T}) total regret algorithms for both Prophet Inequality and Pandora's Box.