The data explosion and the development of artificial intelligence (AI) have fueled the demand for recommendation systems, information retrieval, and personalization, among other applications. Consequently, the need for solutions that optimize these systems “on the fly” has also grown rapidly. The contextual bandit is a machine learning framework designed to tackle complex situations in an online manner: the agent selects actions (i.e., arms) based on the available context information. Based on the feedback, the agent learns the relation between context information and rewards for each arm, which improves future arm selection. In practice, however, the learning environment may be far from perfect. For example, the available context informati...
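The interaction loop described in this abstract (observe a context, select an arm, receive feedback, update the estimate for that arm) can be illustrated with a minimal sketch. It is not the method of any paper listed here: the epsilon-greedy rule, the per-arm linear model trained by stochastic gradient steps, the toy environment, and all names (`EpsilonGreedyLinearBandit`, `run_simulation`, `epsilon`, `lr`) are assumptions chosen only for illustration.

```python
import random


class EpsilonGreedyLinearBandit:
    """Hypothetical epsilon-greedy contextual bandit with one linear model per arm."""

    def __init__(self, n_arms, n_features, epsilon=0.1, lr=0.05, seed=0):
        self.epsilon = epsilon  # probability of exploring a random arm
        self.lr = lr            # step size for the reward-model update
        self.rng = random.Random(seed)
        # One weight vector per arm, all starting at zero.
        self.weights = [[0.0] * n_features for _ in range(n_arms)]

    def predict(self, arm, context):
        """Estimated reward of `arm` for the given context (a dot product)."""
        return sum(w * x for w, x in zip(self.weights[arm], context))

    def select_arm(self, context):
        """Explore with probability epsilon, otherwise pick the best estimate."""
        n_arms = len(self.weights)
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(n_arms)
        estimates = [self.predict(a, context) for a in range(n_arms)]
        return max(range(n_arms), key=estimates.__getitem__)

    def update(self, arm, context, reward):
        """One gradient step on the squared error of the chosen arm only."""
        error = reward - self.predict(arm, context)
        self.weights[arm] = [
            w + self.lr * error * x for w, x in zip(self.weights[arm], context)
        ]


def run_simulation(n_rounds=3000, seed=0):
    """Toy environment (an assumption): arm i pays 1 when feature i is active."""
    rng = random.Random(seed)
    agent = EpsilonGreedyLinearBandit(n_arms=2, n_features=2, seed=seed)
    rewards = []
    for _ in range(n_rounds):
        best_arm = rng.randrange(2)
        context = [1.0, 0.0] if best_arm == 0 else [0.0, 1.0]
        arm = agent.select_arm(context)             # act on the context
        reward = 1.0 if arm == best_arm else 0.0    # bandit feedback
        agent.update(arm, context, reward)          # learn from feedback
        rewards.append(reward)
    return agent, rewards


if __name__ == "__main__":
    agent, rewards = run_simulation()
    print("mean reward, last 500 rounds:", sum(rewards[-500:]) / 500)
```

The sketch assumes the context is observed exactly. The imperfect-environment settings the abstract goes on to mention would require changing how `context` is generated or observed, which is left out here.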
AI systems that learn through reward feedback about the actions they take are increasingly deployed ...
© 2019 Neural Information Processing Systems Foundation. All rights reserved. In the classical conte...
Artificially intelligent assistive agents are playing an increased role in our work and homes. In co...
A standard assumption in contextual multi-arm bandit is that the true context is perfectly known bef...
We consider a novel variant of the contextual bandit problem (i.e., the multi-armed bandit with side...
Contextual bandit algorithms are essential for solving many real-world interac...
The bandit problem models a sequential decision process between a player and an environment. In the ...
We propose a new sequential decision-making setting, combining key aspects of two established online...
Contextual bandits are canonical models for sequential decision-making under uncertainty in environm...
We consider a remote contextual multi-armed bandit (CMAB) problem, in which the decision-maker obser...
Contextual bandit algorithms are sensitive to the estimation method of the outcome model as well as ...
We present a new algorithm for the contextual bandit learning problem, where the learner repeatedly ...
Learning action policy for autonomous agents in a decentralized multi-agent environment has remained...