In this thesis, we study the problem of buying or selling a given volume of a financial asset within a given time horizon at the best possible price, a problem formally known as optimized trade execution. Our approach is empirical: we use historical data to simulate the process of placing artificial orders in a market. This simulation enables us to model the problem as a Markov decision process (MDP). Given this MDP, we train and evaluate a set of reinforcement learning (RL) algorithms, all with the objective of minimizing the transaction cost on unseen test data. We train and evaluate these algorithms for various instruments and problem settings, such as different trading horizons. Our first model was developed with the goal of validating results a...