Datasets used for the offline evaluation of recommender systems are collected through user interactions with an already deployed recommender system. However, such datasets can be subject to different types of biases, including a system’s popularity bias. In this paper, we focus on assessing the influence of popularity on the offline evaluation of recommendation systems. Our deeper analysis, based on popularity-stratified sampling, reveals that current offline evaluations of recommendation systems are sensitive to popular items, raising questions about conclusions drawn from the offline comparison of recommendation models.
Offline evaluations of recommender systems attempt to estimate users’ satisfaction with recommendati...
Most recommender systems are evaluated on how they accurately predict user ratings. However, individ...
Recommender system evaluation usually focuses on the overall effectiveness of the algorithms, either...
Recommendation systems are often evaluated based on users’ interactions that were collected from an ...
Recommendation systems have been integrated into the majority of large online ...
Popularity is often included in experimental evaluation to provide a reference performance for a rec...
In response to the quantity of information available on the Internet, many online service providers ...
Recommender systems help people find relevant content in a personalized way. One main promise of suc...
Recommender systems learn from historical users’ feedback that is often non-uniformly distributed ac...
In academic studies, the evaluation of recommender system (RS) algorithms is often limited...
Recently, a few papers report counter-intuitive observations made from experiments on recommender sy...
In this paper, we present the results of an empirical evaluation investigating how recommendation a...
Offline evaluation of recommender systems mostly relies on historical data, which is often biased by...