In the field of sequential decision making and reinforcement learning, it has been observed that good policies for most problems exhibit a significant amount of structure. In practice, this implies that when a learning agent discovers that an action is better than any other in a given state, this action also dominates in a certain neighbourhood around that state. This paper presents new results proving that this notion of locality in action domination can be linked to the smoothness of the environment's underlying stochastic model. Namely, we link the Lipschitz continuity of a Markov Decision Process to the Lipschitz continuity of its policies' value functions and introduce the key concept of influence radius to describe the n...
We provide steps towards a welfare analysis of a two-country endogenous growth model where a relativ...
Linear control theory has been long established and a myriad of techniques are available for designi...
This paper presents a new integrated procedure to tune a control law for overactuated mechanical sys...
This work tackles the problem of robust zero-shot planning in non-stationary stochastic environments...
Time is a crucial variable in planning and often requires special attention since it introduces a sp...
The training of autonomous agents often requires expensive and unsafe trial-and-error interactions w...
In the context of time-dependent problems of planning under uncertainty, most of the problem's compl...
Recent work on Markov Decision Processes (MDPs) covers the use of continuous variables and resources...
Accurate modeling of boundary conditions is crucial in computational physics. The ever increasing ...
In the context of tree-search stochastic planning algorithms where a generative model is available, ...
Bandits are one of the most basic examples of decision-making with uncertainty. A Markovian restless...
In this paper a Markov model for Evolutionary Multi-Agent System is recalled. The model allo...
This paper tackles a problem of UAV safe path planning in an urban environment where the onboard sen...