\u3cp\u3eThe multi-objective multi-armed bandit (MOMAB) problem is a variant of the multi-armed bandit problem with stochastic reward vectors. In this paper, we propose three MOMAB algorithms. The first algorithm uses a fixed set of linear scalarization functions to identify the Pareto front. Two topological approaches identify the Pareto front using linearly weighted combinations of reward vectors. The weight hyper-rectangle decomposition algorithm explores a convex Pareto front by grouping scalarization functions that optimise the same arm into weight hyper-rectangles. It is generally acknowledged that linear scalarization cannot identify the entire Pareto front when its shape is non-convex. The hierarchical PAC algorithm iteratively decomposes the Pareto front in...