We study Bayesian optimal control of a general class of smoothly parameterized Markov decision problems (MDPs). We propose a lazy version of the so-called posterior sampling method, a method that goes back to Thompson and Strens, more recently studied by Osband, Russo and van Roy. While Osband et al. derived a bound on the (Bayesian) regret of this method for undiscounted total cost episodic, finite state and action problems, we consider the continuing, average cost setting with no cardinality restrictions on the state or action spaces. While in the episodic setting, it is natural to switch to a new policy at the episode-ends, in the continuing average cost framework we must introduce switching points explicitly and in a principled f...
Optimization problems where the objective and constraint functions take minute...
We propose a general framework for studying optimal impulse control problem in...
In this paper we investigate optimal Bayesian learning and control with lagged dependent variables ...
Abstract. This paper considers Bayesian parameter estimation and an associated adaptive control sche...
In this paper, we present a model and an algorithm for the calculation of the optimal control limit,...
We consider the Bayesian formulation of a number of learning problems, where we focus on sequential ...
This paper considers Bayesian parameter estimation and an associated adaptive control scheme for con...
This work addresses the problem of estimating the optimal value function in a Markov Decision Process...
We consider the problem of chance constrained optimization where it is sought to optimize a function...
Bayesian process control is a statistical process control (SPC) scheme that uses the posterior state...
This work addresses the problem of estimating the optimal value function in a Markov Decision Proces...
In this paper, we seek robust policies for uncertain Markov Decision Processes (MDPs). Most robust o...
We consider the problem of "optimal learning" for Markov decision processes with uncertain...
We introduce an on-line algorithm for finding local maxima of the average reward in a Partially Obse...