In this thesis, neural-fitted temporal difference learning, a form of reinforcement learning, is used to learn to play the game of Connect Four. Seven artificial players are compared, using five different neural networks. While the first network uses only the basic board state as input, the larger ones also use specific features: rows of two, three and four in the different rows of the board state. These features are shown to improve the agent's performance dramatically. Furthermore, two exploration strategies are compared: Boltzmann exploration with a constant temperature and epsilon-greedy. The results show that epsilon-greedy gives the most stable results. Finally, the smallest network was given the same number of hidden nodes ...
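The two exploration strategies named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the `values` list (one estimated value per legal move) and the parameter values are hypothetical, chosen only to show how each strategy turns move values into a move choice.

```python
import math
import random


def epsilon_greedy(values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random move, otherwise the best one."""
    if rng.random() < epsilon:
        return rng.randrange(len(values))
    return max(range(len(values)), key=lambda i: values[i])


def boltzmann_probabilities(values, temperature):
    """Softmax over the move values at a fixed (constant) temperature."""
    top = max(values)  # subtracting the maximum avoids overflow in exp
    weights = [math.exp((v - top) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann(values, temperature, rng=random):
    """Sample a move index in proportion to its Boltzmann probability."""
    probs = boltzmann_probabilities(values, temperature)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Hypothetical value estimates for three legal moves.
    move_values = [0.1, 0.9, 0.3]
    print(epsilon_greedy(move_values, epsilon=0.1))
    print(boltzmann(move_values, temperature=0.5))
```

With epsilon-greedy, the amount of exploration is fixed by `epsilon` regardless of how close the move values are. With Boltzmann exploration, moves with similar values are chosen with similar probability, and a lower temperature concentrates the choice on the highest-valued move.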
The thesis is dedicated to the study and implementation of methods used for learning from the course...
Temporal difference (TD) learning has been used to learn strong evaluation functions in a ...
This paper presents an adaptive 'rock, scissors and paper' artificial player. The artificial player ...
When reinforcement learning is applied to large state spaces, such as those occurring in playing boa...
When reinforcement learning is applied to large state spaces, such as those occurring in playing boa...
Reinforcement learning is applied to computer-based playing of 5x5 Go. We have found that incorporat...
We compare classic scalar temporal difference learning with three new distributional algorithms for ...
The highly addictive stochastic puzzle game 2048 has recently invaded the Internet and mobi...
This paper describes a methodology for quickly learning to play games at a strong level. The methodo...
In this paper, we investigate an integration of individual and social learning, utilising co-evoluti...
NeuroDraughts is a draughts playing program similar in approach to NeuroGammon and NeuroChess [Tesau...
The success of neural networks and temporal difference methods in complex tasks such as in (Tesauro...
A common approach to game playing in Artificial Intelligence involves the use of the Minimax algorit...
We present an experimental methodology and results for a machine learning approach to learning openi...
Over the past two decades, Reinforcement Learning has emerged as a promising Machine Learning techni...