In this paper we present TDLeaf(λ), a variation on the TD(λ) algorithm that enables it to be used in conjunction with minimax search. We present some experiments in which our chess program, "KnightCap," used TDLeaf(λ) to learn its evaluation function while playing on the Free Internet Chess Server (FICS, fics.onenet.net). It improved from a 1650 rating to a 2100 rating in just 308 games and 3 days of play. We discuss some of the reasons for this success and also the relationship between our results and Tesauro's results in backgammon.

1 Introduction

Temporal Difference learning, or TD(λ), first introduced by Sutton [9], is an elegant algorithm for approximating the expected long term future cost (or cost-to-go) of a stochastic ...
Recently, the seminal algorithms AlphaGo and AlphaZero have started a new era in game learning and d...
PhD. This thesis adapts and improves on the methods of TD(λ) (Sutton 1988) that were successfully use...
NeuroDraughts is a draughts playing program similar in approach to NeuroGammon and NeuroChess [Tesau...
In this paper we present TDLeaf(λ), a variation on the TD(λ) algorithm that enables it to be used in c...
Computers have developed to the point where searching through a large set of data to find an optimum...
This paper describes the application of temporal difference (TD) learning to minimax searche...
Research in computer game playing has relied primarily on brute force searching approaches rather th...
This paper introduces a new paradigm for minimax game-tree search algorithms. MT is a me...
A chess program usually consists of three main parts, that is, a move generator to generate ...
Temporal-difference (TD) learning is one of the most successful and broadly applied solutions to the...
This paper presents a study of several dedicated Temporal-Difference (TD) reinforcement learning alg...
Game playing has always been considered an intellectual activity requiring a good lev...
Reinforcement learning is applied to computer-based playing of 5x5 Go. We have found that incorporat...