In this paper, we study a sequential decision-making problem. The objective is to maximize the average reward accumulated over time subject to temporal cost constraints. The novelty of our setup is that the rewards and constraints are controlled by an adverse opponent. To solve the problem in a practical way, we propose an expert algorithm that guarantees both a vanishing regret and a sublinear number of violated constraints. The quality of the proposed algorithm is evaluated on a real-world power management problem. Our results clearly demonstrate that online learning with temporal cost constraints can be carried out successfully in practice.
In an online decision problem, one makes a sequence of decisions without knowledge of the fu...
We discuss multi-task online learning when a decision maker has to deal simultaneously with M tasks...
The greedy algorithm has been extensively studied in the field of combinatorial optimization for decades....
This thesis studies two online learning problems in which the efficiency of the proposed strategies ...
We consider the problem of online optimization, where a learner chooses a decision from a g...
We consider the fundamental problem of prediction with expert advice where the experts are "optimiza...
We consider a budgeted variant of the problem of learning from expert advice with N experts. Each qu...
We study online learning where a decision maker interacts with Nature with the objective of maximiz...
This paper proposes a novel algorithm for solving discrete online learning problems under stochasti...
149 pages. This dissertation focuses on risk and safety considerations in the design and analysis of o...
Existing episodic reinforcement algorithms assume that the length of an episode is fixed across tim...
In this paper, we study the problem of efficient online reinforcement learning in the infinite horiz...