2017-07-17-chen17c.md

File metadata and controls

57 lines (57 loc) · 2.21 KB
---
title: Dueling Bandits with Weak Regret
booktitle: Proceedings of the 34th International Conference on Machine Learning
year: '2017'
volume: '70'
series: Proceedings of Machine Learning Research
address:
month: 0
publisher: PMLR
pdf:
url:
abstract: 'We consider online content recommendation with implicit feedback through
  pairwise comparisons, formalized as the so-called dueling bandit problem. We study
  the dueling bandit problem in the Condorcet winner setting, and consider two notions
  of regret: the more well-studied strong regret, which is 0 only when both arms
  pulled are the Condorcet winner; and the less well-studied weak regret, which is
  0 if either arm pulled is the Condorcet winner. We propose a new algorithm for
  this problem, Winner Stays (WS), with variations for each kind of regret: WS for
  weak regret (WS-W) has expected cumulative weak regret that is $O(N^2)$, and $O(N\log(N))$
  if arms have a total order; WS for strong regret (WS-S) has expected cumulative
  strong regret of $O(N^2 + N \log(T))$, and $O(N\log(N)+N\log(T))$ if arms have
  a total order. WS-W is the first dueling bandit algorithm with weak regret that
  is constant in time. WS is simple to compute, even for problems with many arms,
  and we demonstrate through numerical experiments on simulated and real data that
  WS has significantly smaller regret than existing algorithms in both the weak-
  and strong-regret settings.'
layout: inproceedings
id: chen17c
tex_title: Dueling Bandits with Weak Regret
bibtex_author: Bangrui Chen and Peter I. Frazier
firstpage: 731
lastpage: 739
page: 731-739
order: 731
cycles: false
editor:
- given: Doina
  family: Precup
- given: Yee Whye
  family: Teh
author:
- given: Bangrui
  family: Chen
- given: Peter I.
  family: Frazier
date: 2017-07-17
container-title: Proceedings of the 34th International Conference on Machine Learning
genre: inproceedings
issued:
  date-parts:
  - 2017
  - 7
  - 17
extras:
---