2017-07-17-busa-fekete17a.md
title: 'Multi-objective Bandits: Optimizing the Generalized Gini Index'
booktitle: Proceedings of the 34th International Conference on Machine Learning
year: 2017
volume: 70
series: Proceedings of Machine Learning Research
address:
month: 0
publisher: PMLR
pdf:
url:
abstract: 'We study the multi-armed bandit (MAB) problem where the agent receives a vectorial feedback that encodes many possibly competing objectives to be optimized. The goal of the agent is to find a policy, which can optimize these objectives simultaneously in a fair way. This multi-objective online optimization problem is formalized by using the Generalized Gini Index (GGI) aggregation function. We propose an online gradient descent algorithm which exploits the convexity of the GGI aggregation function, and controls the exploration in a careful way achieving a distribution-free regret $\tilde{O}(T^{-1/2})$ with high probability. We test our algorithm on synthetic data as well as on an electric battery control problem where the goal is to trade off the use of the different cells of a battery in order to balance their respective degradation rates.'
layout: inproceedings
id: busa-fekete17a
tex_title: 'Multi-objective Bandits: Optimizing the Generalized {G}ini Index'
bibtex_author: R{\'o}bert Busa-Fekete and Bal{\'a}zs Sz{\"o}r{\'e}nyi and Paul Weng and Shie Mannor
firstpage: 625
lastpage: 634
page: 625-634
order: 625
cycles: false
editor:
- given: Doina
  family: Precup
- given: Yee Whye
  family: Teh
author:
- given: Róbert
  family: Busa-Fekete
- given: Balázs
  family: Szörényi
- given: Paul
  family: Weng
- given: Shie
  family: Mannor
date: 2017-07-17
container-title: Proceedings of the 34th International Conference on Machine Learning
genre: inproceedings
issued:
  date-parts:
  - 2017
  - 7
  - 17
extras: