Frequent itemset mining is one of the fundamental building blocks of data mining. However, despite crucial recent advances in the data mining literature, few solutions, whether standard or improved, scale. This is particularly the case when i) the quantity of data is very large and/or ii) the minimum support is very low. In this paper, we address the problem of parallel frequent itemset mining (PFIM) in very large databases, and study the impact and effectiveness of using specific data placement strategies in a massively distributed environment. By offering a clever data placement and an optimal organization of the extraction algorithms, we show that the arrangement of both the data an...
Traditional methods for data mining typically make the assumption that data is centralized ...
Mining big datasets poses a number of challenges which are not easily addresse...
Several classes of scientific and commercial applications require the execution of a large number of...
Frequent itemset mining (FIM) is one of the fundamental cornerstones in data m...
Despite crucial recent advances, the problem of frequent itemset mining is sti...
Itemset mining is a well-known exploratory technique used to discover interesting correlations hidde...
In recent years, knowledge discovery in databases has provided a powerful capability to discover meaning...
Data analytics in general, and data mining primitives in particular, are a ma...
Frequent Itemsets Mining (FIM) is a fundamental mining model and plays an important role in Data Min...
Itemset mining is a well-known exploratory data mining technique used to discover interesting correl...
Traditional methods for frequent itemset mining typically assume that data is centralized and static...
Recently, several algorithms based on the MapReduce framework have been proposed for frequent patter...
Frequent itemset mining is an important building block in many data mining applications like market ...
When computationally feasible, mining huge databases produces tremendously large numbers o...
Frequent pattern mining is an essential data mining task, with a goal of discovering knowledge in th...