The exploration-exploitation tradeoff is crucial to reinforcement-learning (RL) agents, and a significant number of sample complexity results have been derived for agents in propositional domains. These results guarantee, with high probability, near-optimal behavior in all but a polynomial number of timesteps in the agent’s lifetime. In this work, we prove similar results for certain relational representations, primarily a class we call “relational action schemas”. These generalized models allow us to specify state transitions in a compact form, for instance describing the effect of picking up a generic block instead of picking up 10 different specific blocks. We present theoretical results on crucial subproblems in action-schema lea...
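The block-pickup example in the abstract above can be made concrete with a small sketch. It assumes a deterministic, STRIPS-style encoding in which a schema holds preconditions, add effects, and delete effects over variables; the abstract does not specify the formalism the authors actually use, and all names here (ActionSchema, substitute, apply_schema, PICKUP) are hypothetical. A single lifted schema covers all 10 blocks, where a propositional model would need one action per block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionSchema:
    """A lifted, deterministic STRIPS-style schema (an illustrative assumption)."""
    name: str
    preconditions: frozenset  # literals that must hold, e.g. ("clear", "?x")
    add_effects: frozenset    # literals made true by the action
    del_effects: frozenset    # literals made false by the action


def substitute(literals, binding):
    """Replace variables such as "?x" with the objects they are bound to."""
    return frozenset(
        (lit[0],) + tuple(binding.get(arg, arg) for arg in lit[1:])
        for lit in literals
    )


def apply_schema(schema, binding, state):
    """Return the successor state, or the unchanged state if preconditions fail."""
    if not substitute(schema.preconditions, binding) <= state:
        return state
    return (state - substitute(schema.del_effects, binding)) | substitute(
        schema.add_effects, binding
    )


# One schema describes picking up any block.
PICKUP = ActionSchema(
    name="pickup",
    preconditions=frozenset({("clear", "?x"), ("ontable", "?x"), ("handempty",)}),
    add_effects=frozenset({("holding", "?x")}),
    del_effects=frozenset({("clear", "?x"), ("ontable", "?x"), ("handempty",)}),
)

# Ten blocks, all clear and on the table, with an empty hand.
blocks = [f"b{i}" for i in range(1, 11)]
state = frozenset(
    {("handempty",)}
    | {("clear", b) for b in blocks}
    | {("ontable", b) for b in blocks}
)

after = apply_schema(PICKUP, {"?x": "b3"}, state)      # picks up b3
blocked = apply_schema(PICKUP, {"?x": "b4"}, after)    # hand is full, no change
```

Binding "?x" to a different block reuses the same schema, which is the compactness the abstract refers to; a learner then has one set of preconditions and effects to estimate rather than ten.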
Reinforcement learning, and Q-learning in particular, encounter two major problems when dealing with...
We present an algorithm that derives actions’ effects and preconditions in partially observable, re...
A fundamental problem in reinforcement learning is balancing exploration and exploitation. We addres...
In this paper we report on using a relational state space in multi-agent reinforcement learning. The...
In recent years, there has been a growing interest in using rich representations such as relational...
In this paper we present a new method for reinforcement learning in relational domains. A logical la...
We present an instance-based, online method for learning action models in unanticipated, relational ...
In reinforcement learning, an agent tries to learn a policy, i.e., how to select an action...
In recent years, there has been a growing interest in using rich representations such as relational ...
Agents (humans, mice, computers) need to constantly make decisions to survive and thrive in their e...
© Springer-Verlag Berlin Heidelberg 1998. Relational reinforcement learning is presented, a learning...
Reinforcement learning (RL) is a machine learning paradigm where an agent learns to interact with an...