The multi-armed bandit (MAB) framework has been successfully applied in many web applications. However, many complex real-world applications that involve multiple content recommendations do not fit the traditional MAB setting. To address this issue, we consider an ordered combinatorial semi-bandit problem in which the learner recommends S actions from a base set of K actions and displays the results in S (out of M) different positions. The aim is to maximize the cumulative reward with respect to the best possible subset and positions in hindsight. By adapting a minimum-cost maximum-flow network, we derive a practical algorithm based on Thompson sampling for the (contextual) combinatorial problem, thus resolving the problem of compu...
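The setting described in the abstract can be illustrated with a minimal, non-contextual sketch. It is not the paper's algorithm: it assumes Bernoulli rewards with an independent Beta(1, 1) prior for every (action, position) pair, and it replaces the minimum-cost maximum-flow step with an exhaustive search that is only feasible for very small K, M, and S. The names `sample_scores`, `best_assignment`, and `run` are hypothetical helpers introduced for illustration.

```python
import itertools
import random


def sample_scores(alpha, beta, rng):
    """Thompson step: draw one Beta posterior sample per (action, position) pair."""
    return [[rng.betavariate(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(alpha, beta)]


def best_assignment(scores, S):
    """Pick S distinct actions and S distinct positions maximizing the summed score.

    Assumption: brute force stands in for the min-cost max-flow solver
    mentioned in the abstract; it enumerates every valid assignment once.
    """
    K, M = len(scores), len(scores[0])
    best_value, best_pairs = float("-inf"), None
    for positions in itertools.combinations(range(M), S):
        for actions in itertools.permutations(range(K), S):
            value = sum(scores[a][p] for a, p in zip(actions, positions))
            if value > best_value:
                best_value, best_pairs = value, list(zip(actions, positions))
    return best_pairs, best_value


def run(true_probs, S, rounds, seed=0):
    """Simulate the learner against hypothetical Bernoulli reward probabilities."""
    rng = random.Random(seed)
    K, M = len(true_probs), len(true_probs[0])
    alpha = [[1.0] * M for _ in range(K)]  # Beta(1, 1) priors (assumed)
    beta = [[1.0] * M for _ in range(K)]
    total = 0
    for _ in range(rounds):
        pairs, _ = best_assignment(sample_scores(alpha, beta, rng), S)
        for a, p in pairs:
            # Semi-bandit feedback: one reward is observed per displayed pair.
            reward = 1 if rng.random() < true_probs[a][p] else 0
            alpha[a][p] += reward
            beta[a][p] += 1 - reward
            total += reward
    return total, alpha, beta
```

Calling `run(true_probs, S=2, rounds=200)` with a small K-by-M table of made-up probabilities returns the cumulative reward together with the updated posterior counts. For realistic problem sizes, the exhaustive search would have to be replaced by a flow or assignment solver, which is the role the abstract gives to the minimum-cost maximum-flow network.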
The multi-armed bandit (MAB) framework is a widely used sequential decision-making framework in which a ...
Machine learning algorithms play an active role in modern-day business activities and have been put ...
Smooth functions on graphs have wide applications in manifold and semi-supervised learning. In this...
The probabilistic ranking principle advocates ranking documents in order of decreasing probability ...
Contextual multi-armed bandit (MAB) algorithms have shown promise for maximizing cumulative r...
We consider the query recommendation problem in closed loop interactive learning settings like onli...
Multi-armed bandit problems formalize the exploration-exploitation trade-offs arising in several ind...
We tackle the online learning to rank problem of assigning L items to K predefined positions on a we...
Multiple-play bandits aim at displaying relevant items at relevant positions o...
The multi-armed bandit (MAB) problem is derived from slot machines in the casino. It is about how a gamb...
The aim of the research presented in this dissertation is to construct a model for personalised item...
In this paper, we consider efficient learning in large-scale combinatorial semi-bandits with linear ...
Cataloged from PDF version of article. Thesis (M.S.): Bilkent University, Department of Industrial En...
This work presents an extension of Thompson Sampling bandit policy for orchestrating the collection ...
The cold-start problem has attracted extensive attention among various online services that provide ...