This paper studies the problem of expected loss minimization given a data distribution that depends on the decision-maker's action and evolves dynamically in time according to a geometric decay process. Novel algorithms are introduced for both the setting in which the decision-maker has access to a first-order gradient oracle and the setting in which they have access only to a loss function oracle. The algorithms operate on the same underlying principle: the decision-maker deploys a fixed decision repeatedly over the length of an epoch, allowing the dynamically changing environment to sufficiently mix before the decision is updated. The iteration complexity in each setting is shown to match existing rates for first and zero ...
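Since this abstract describes the epoch-based deployment principle, a minimal sketch of that idea may help. It is an illustrative reconstruction, not the paper's algorithm: the quadratic loss, the linear decision-dependent stationary mean a*x + b, the decay factor rho, and the step size and epoch length are all assumptions chosen for the example.

```python
import numpy as np

# Minimal sketch (illustrative, not the paper's algorithm): epoch-based
# gradient descent under a decision-dependent distribution with geometric decay.
# Assumed model: loss l(x, z) = (x - z)^2 / 2, decision-dependent stationary
# mean mu(x) = a * x + b, and environment mean contracting toward mu(x) at
# rate rho after each deployment of the decision x.

rng = np.random.default_rng(0)
a, b = 0.5, 1.0          # stationary mean of the environment: mu(x) = a*x + b
rho = 0.8                # geometric decay (mixing) factor per deployment
step = 0.1               # gradient step size
epoch_len = 20           # deployments per epoch, letting the environment mix
n_epochs = 50

x = 0.0                  # decision variable
env_mean = 5.0           # current environment mean (state of the dynamics)

for _ in range(n_epochs):
    # Deploy the same decision for the whole epoch; the environment mean
    # decays geometrically toward its decision-dependent stationary value.
    for _ in range(epoch_len):
        env_mean = rho * env_mean + (1 - rho) * (a * x + b)

    # Draw a sample from the (nearly mixed) distribution and take one
    # first-order step using the gradient in x of l(x, z) = (x - z)^2 / 2.
    z = env_mean + rng.normal()
    grad = x - z
    x -= step * grad

print("final decision:", x)
```

A zeroth-order variant would replace the gradient step with a finite-difference estimate built from loss evaluations at perturbed decisions; the epoch-then-update structure stays the same.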
The paper investigates the possibility of applying value function based reinforcement learning (RL)...
Markov decision processes (MDPs) and their variants are widely studied in the theory of controls for...
We consider a discrete time, finite state Markov reward process that depends on a set of pa...
Stochastic sequential decision-making problems are generally modeled and solved as Markov decision p...
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer...
A standard assumption of most sequential sampling models is that decision-makers rely on a decision ...
We investigate algorithms for different steps in the decision making process, focusing on systems wh...
We study a sequential variance reduction technique for Monte Carlo estimation of functionals in Mark...
The purpose of this thesis is to study the hedging of financial derivatives, using the so-called loc...
In many sequential decision-making problems we may want to manage risk by minimizing some measure of...
This article describes a formal approach to decision making optimization in commodity futures marke...
We consider the problem of determining a strategy that is efficient in the sense that it minimizes t...
Rapid development of data science technologies has enabled data-driven algorithms for many importan...
This paper presents a study of the risk probability optimality for finite horizon continuous-ti...
Utilizing structure in mathematical modeling is instrumental for better model design, cre...