The stochastic approximation (SA) algorithm is a widely used probabilistic method for finding a zero or a fixed point of a vector-valued function, when only noisy measurements of the function are available. In the literature to date, one makes a distinction between ``synchronous'' updating, whereby every component of the current guess is updated at each time, and ``asynchronous'' updating, whereby only one component is updated. In this paper, we study an intermediate situation that we call ``batch asynchronous stochastic approximation'' (BASA), in which, at each time instant, \textit{some but not all} components of the current estimated solution are updated. BASA allows the user to trade off memory requirements against time complexity. We de...
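The batch-updating idea described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy contraction $g(x)_i = 0.5\,\mathrm{mean}(x) + b_i$, the Gaussian noise level, the uniformly random choice of batch, and the per-component ("local clock") step size $n_i^{-0.7}$. The function names `noisy_g_component` and `batch_async_sa` are hypothetical.

```python
import random

# Hypothetical offsets defining the toy map g(x)_i = 0.5 * mean(x) + B[i].
# Its unique fixed point is x*_i = mean(B) + B[i] = [3.5, 4.5, 5.5, 6.5].
B = [1.0, 2.0, 3.0, 4.0]


def noisy_g_component(x, i, rng, noise_std=0.1):
    """Return a noisy measurement of component i of the toy map g at x."""
    mean_x = sum(x) / len(x)
    return 0.5 * mean_x + B[i] + rng.gauss(0.0, noise_std)


def batch_async_sa(measure, x0, batch_size, num_steps, seed=0):
    """Sketch of batch asynchronous SA for a fixed-point problem.

    At each step only `batch_size` randomly chosen components are
    measured and updated; all other components are left unchanged.
    """
    rng = random.Random(seed)
    x = list(x0)
    updates = [0] * len(x)  # how often each component has been updated
    for _ in range(num_steps):
        # Assumed selection rule: a uniformly random subset of components.
        batch = rng.sample(range(len(x)), batch_size)
        # Measure every selected component at the same current iterate.
        measurements = {i: measure(x, i, rng) for i in batch}
        for i, y_i in measurements.items():
            updates[i] += 1
            step = updates[i] ** -0.7  # assumed local-clock step size
            x[i] += step * (y_i - x[i])
    return x


estimate = batch_async_sa(noisy_g_component, [0.0] * 4,
                          batch_size=2, num_steps=20000)
print(estimate)
```

Setting `batch_size=1` gives the one-component asynchronous case, and `batch_size=len(x0)` gives the synchronous case. Intermediate values measure fewer components per step and so need more steps, which is one way to read the trade-off mentioned above.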
Finding convergence rates for numerical optimization algorithms is an important task, because it giv...
This thesis is focused on the convergence analysis of some popular stochastic approximation methods ...
The asymptotic behavior of a distributed, asynchronous stochastic approximation scheme is analyzed i...
Using a martingale concentration inequality, concentration bounds `from time $n_0$ on' are derived f...
It is shown here that stability of the stochastic approximation algorithm is implied by the asymptot...
Includes bibliographical references (p. 18-20). Supported by the National Science Foundation. ECS-921...
This paper develops a unified framework to study finite-sample convergence guarantees of a large cl...
Recent developments in the area of reinforcement learning have yielded a number of new algorithms ...
Following the novel paradigm developed by Van Roy and coauthors for reinforcement learning in arbitr...
We are interested in understanding stability (almost sure boundedness) of stochastic approximation a...
It is shown that the stability of the stochastic approximation algorithm is implied by the asymptoti...
Reinforcement learning is a framework for solving sequential decision-making problems without requir...
This paper investigates to what extent one can improve reinforcement learning algorithms. Our study ...
This paper gives the first rigorous convergence analysis of analogues of Watkins's Q-learning algori...