In this paper, we study a sequential decision-making problem. The objective is to maximize the average reward accumulated over time subject to temporal cost constraints. The novelty of our setup is that the rewards and constraints are controlled by an adversarial opponent. To solve the problem in a practical way, we propose an expert algorithm that guarantees both vanishing regret and a sublinear number of violated constraints. The quality of this solution is demonstrated on a real-world power management problem. Our results support the hypothesis that online learning with convex cost constraints can be performed successfully in practice.
In this paper, we study the problem of efficient online reinforcement learning in the infinite horiz...
We study online learning where the objective of the decision maker is to maximize her aver...
In this research we study some online learning algorithms in the online convex optimization framewor...
We study online learning where a decision maker interacts with Nature with the objective of maximiz...
We consider the problem of online optimization, where a learner chooses a decision from a g...
We consider the fundamental problem of prediction with expert advice where the experts are "optimiza...
The framework of online learning with memory naturally captures learning problems with temporal effe...
We study a class of online convex optimization problems with long-term budget ...
This thesis studies two online learning problems in which the efficiency of the proposed strategies ...
This paper proposes a novel algorithm for solving discrete online learning problems under stochasti...
We consider a budgeted variant of the problem of learning from expert advice with N experts. Each qu...
We study the performance of an online learner under a framework in which it receives partial informa...
We consider the decision-making framework of online convex optimization with a very large number of ...
149 pages. This dissertation focuses on risk and safety considerations in the design and analysis of o...