This work investigates Monte-Carlo planning for agents in stochastic environments with multiple objectives. We propose the Convex Hull Monte-Carlo Tree-Search (CHMCTS) framework, which builds upon Trial-Based Heuristic Tree Search and Convex Hull Value Iteration (CHVI), as a solution to multi-objective planning in large environments. Moreover, we consider how to pose the problem of approximating multi-objective planning solutions as a contextual multi-armed bandit problem, giving a principled motivation for how to select actions from the view of contextual regret. This leads us to the use of Contextual Zooming for action selection, yielding Zooming CHMCTS. We evaluate our algorithm using the Generalised Deep Sea Treasure environment, demo...
Concerned with multi-objective reinforcement learning (MORL), this paper presents MO-MCTS, an extens...
Many important applications, including robotics, data-center management, and process control, requir...
Online solvers for partially observable Markov decision processes have difficulty scaling to problem...
Monte Carlo Tree Search is a recent algorithm that achieves more and more succ...
Monte Carlo tree search (MCTS) is a sampling and simulation based technique for searching in large s...
Monte-Carlo planning and Reinforcement Learning (RL) are essential to sequential decision making. Th...
The MASH project is a collaborative platform with the aim to experiment differ...