Itemset mining is a well-known exploratory technique used to discover interesting correlations hidden in a data collection. Since ever-increasing amounts of data are being collected and stored (e.g., business transactions, medical and biological data, context-aware applications), scalable and efficient approaches are needed to analyze these large data collections. This paper proposes a parallel disk-based approach that efficiently supports frequent itemset mining on a multi-core processor. Our parallel strategy is presented in the context of the VLDB-Mine persistent data structure. Different techniques have been proposed to optimize both data- and compute-intensive aspects of the mining algorithm. Preliminary experiments, performed on bot...
Frequent-itemset mining is an essential part of the association rule mining process, which has many ...
Recently, several algorithms based on the MapReduce framework have been proposed for frequent patter...
Data mining is proving itself to be a very important field as the data available is increasing expo...
Itemset mining looks for correlations among data items in large transactional datasets. Traditional...
Frequent itemset mining is an important building block in many data mining applications like market ...
Frequent itemset mining presents one of the fundamental building blocks in dat...
Frequent itemset mining (FIM) is one of the fundamental cornerstones in data m...
In this paper, we propose an algorithm to partition both the search space and the database for the p...
The main focus of this report is on frequent intra- and inter-transaction itemset mining, specifical...
Traditional methods for data mining typically make the assumption that data is centralized ...
This thesis addresses the issue of enhancing the scalability of data mining techniques, with specifi...
Despite crucial recent advances, the problem of frequent itemset mining is sti...
Frequent-itemset mining is an important part of data mining. It is a computational and memory intens...
We present a survey of the most important algorithms that have been proposed in the context of the...
Itemset mining is a well-known exploratory data mining technique used to discover interesting correl...