We propose an algorithm that uses linear function approximation (LFA) for the stochastic shortest path (SSP) problem. Under minimal assumptions, it obtains sublinear regret, is computationally efficient, and uses stationary policies. To our knowledge, this is the first such algorithm in the LFA literature (for SSP or other formulations). Our algorithm is a special case of a more general one, which achieves regret that scales with the square root of the number of episodes, given access to a certain computation oracle.
Comment: This version removes most assumptions of the prior on
This paper introduces and addresses a wide class of stochastic bandit problems...
In this paper, we give a new framework for the stochastic shortest path problem in finite state and ...
While the complexity of min-max and min-max regret versions of most classical combinatorial optimiza...
We study the problem of learning in the stochastic shortest path (SSP) setting...
This paper investigates, for the first time in the literature, the approximation of min-max (regret)...
In this invited contribution, we revisit the stochastic shortest path problem, and show how recent r...
We consider recently-derived error bounds that can be used to bound the quality of solutions found b...
Minmax regret optimization aims at finding robust solutions that perform best in the worst-case, com...
This paper investigates, for the first time in the literature, the approximation of min–max (regret)...
This paper considers Stochastic Shortest Path (SSP) problems in probabilistic networks. A variety of ...
This paper considers a stochastic shortest path problem where the arc lengths are independent random...
Minmax regret optimization aims at finding robust solutions that perform best in the worst-...
We study stochastic linear payoff bandit problems and give a simple, computationally efficient alg...
We consider the objective of computing an ε-optimal policy in a stochastic sho...
The stochastic shortest path problem lies at the heart of many questions in the formal verification ...