Nowadays, companies in most sectors are accelerating their digitization and offering new services to their users. In recent years, many of these services have relied on machine learning techniques. For combinatorial multi-armed bandit algorithms, which are widely used for recommendation, user feedback plays a crucial role in online learning. However, strategies for incorporating this feedback essentially rely on observing a full reward vector, which can be hard to acquire when users must be solicited directly and too frequently. Herein, we propose a novel approach which overcomes these limitations, while providing a level of global accuracy similar to that obtained by...