We provide the first algorithm for online bandit linear optimization whose regret after T rounds is of order √(Td ln N) on any finite class X ⊆ R^d of N actions, and of order d√T (up to log factors) when X is infinite. These bounds are not improvable in general. The basic idea utilizes tools from convex geometry to construct what is essentially an optimal exploration basis. We also present an application to a model of linear bandits with expert advice. Interestingly, these results show that bandit linear optimization with expert advice in d dimensions is no more difficult (in terms of the achievable regret) than the online d-armed bandit problem with expert advice (where EXP4 is optimal).

1 Introduction and Related Work

The problem of bandit lin...
We present a modification of the algorithm of Dani et al. [8] for the online linear optimization pro...
Abstract. We give an algorithm for the bandit version of a very general online optimization problem ...
We present a modification of the algorithm of Dani et al. [8] for the online linear optimization pro...
In the online linear optimization problem, a learner must choose, in each round, a decision from a s...
In the online linear optimization problem, a learner must choose, in each round, a decision from a s...
We study the attainable regret for online linear optimization problems with bandit feedback, where u...
In the classical stochastic k-armed bandit problem, in each of a sequence of rounds, a decision make...
We introduce an efficient algorithm for the problem of online linear optimization in the bandit sett...
Bandit convex optimization is a special case of online convex optimization with partial information....
We demonstrate a modification of the algorithm of Dani et al. for the online linear optimization prob...
On-line linear optimization on combinatorial action sets (d-dimensional actions) with bandit feedba...
Inspired by advertising markets, we consider large-scale sequential decision making problems in whic...
Inspired by advertising markets, we consider large-scale sequential decision making problems in whic...
Consider the online convex optimization problem, in which a player has to choose actions iterativel...
We address online linear optimization problems when the possible actions of the decision maker are r...