We study the problem of identifying user clusters in contextual multi-armed bandits (MAB). Contextual MAB is an effective tool for many real applications, such as content recommendation and online advertisement. In practice, user dependency plays an essential role in a user's actions, and thus in the observed rewards. Clustering similar users can improve the quality of reward estimation, which in turn leads to more effective content recommendation and targeted advertising. Unlike traditional clustering settings, we cluster users based on the unknown bandit parameters, which are estimated incrementally. In particular, we define the problem of cluster detection in contextual MAB, and propose a bandit algorithm, LOCB, embedded with local clustering procedu...
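As a rough illustration of clustering users by their estimated bandit parameters, the sketch below keeps a LinUCB-style ridge-regression estimate per user and greedily groups users whose estimates fall within each other's confidence radii. The class names, the radius heuristic, and the grouping rule are illustrative assumptions, not the paper's actual LOCB procedure.

import numpy as np

class UserBanditEstimate:
    # Ridge-regression estimate of one user's unknown bandit parameter (LinUCB-style).
    def __init__(self, dim, lam=1.0):
        self.A = lam * np.eye(dim)   # regularized Gram matrix of observed contexts
        self.b = np.zeros(dim)       # accumulated reward-weighted contexts
        self.t = 0                   # number of observations for this user

    def update(self, context, reward):
        self.A += np.outer(context, context)
        self.b += reward * context
        self.t += 1

    @property
    def theta_hat(self):
        return np.linalg.solve(self.A, self.b)

    def radius(self, alpha=1.0):
        # Shrinking confidence radius; a simple heuristic, not a formal bound.
        return alpha / np.sqrt(max(self.t, 1))

def detect_clusters(estimates, alpha=1.0):
    # Greedy grouping around unassigned seed users: v joins u's cluster when their
    # parameter estimates are closer than the sum of their confidence radii.
    users, clusters, assigned = list(estimates), [], set()
    for u in users:
        if u in assigned:
            continue
        cluster = {u}
        for v in users:
            if v in assigned or v == u:
                continue
            gap = np.linalg.norm(estimates[u].theta_hat - estimates[v].theta_hat)
            if gap <= estimates[u].radius(alpha) + estimates[v].radius(alpha):
                cluster.add(v)
        assigned |= cluster
        clusters.append(cluster)
    return clusters

Once groups are detected, observations from any member of a group can be pooled to sharpen a shared parameter estimate, which is the usual payoff of clustering users in bandit settings.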
In predictive machine learning, unsupervised learning is applied when the labels of the da...
We consider running multiple instances of multi-armed bandit (MAB) problems in parallel. A main moti...
Contextual multi-armed bandit (MAB) algorithms have been shown promising for maximizing cumulative r...
In this work, we study recommendation systems modelled as contextual multi-armed bandit (MAB) proble...
Multi-armed bandit problems are receiving a great deal of attention because they adequately formaliz...
Classical collaborative filtering and content-based filtering methods try to learn a static recomme...
We propose algorithms based on a multi-level Thompson sampling scheme, for the stochastic multi-arme...
Stochastic bandit algorithms are increasingly being used in the domain of recommender systems, when ...
We introduce a novel algorithmic approach to content recommendation based on adaptive clustering of ...
Multi-armed bandits (MAB) provide a principled online learning approach to attain the balance betwee...
Multi-armed bandit problems formalize the exploration-exploitation trade-offs arising in several ind...
Multi-armed bandit problems formalize the exploration-exploitation trade-offs arising in several ind...
We consider a new setting of online clustering of contextual cascading bandits, an online learning p...
Master of Science, Department of Computer Science, William H. Hsu. This work compares two methods, the mul...
The multi-armed bandit (MAB) framework is a widely used sequential decision-making framework in which a ...