We present a message-passing based parallel version of the Space Saving algorithm designed to solve the $k$-majority problem. The algorithm determines frequent items in parallel, i.e., those whose frequency is greater than a given threshold, and is therefore useful for iceberg queries and in many other contexts. We apply our algorithm to the detection of frequent items in both real and synthetic datasets whose probability distribution functions are a Hurwitz and a Zipf distribution, respectively. We also compare its parallel performance and accuracy against a parallel algorithm recently proposed for merging summaries derived by the Space Saving or Frequent algorithms.
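For orientation, the sequential Space Saving building block referred to in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the parallel message-passing algorithm of the paper: the function names `space_saving` and `frequent_candidates` are hypothetical, the summary is assumed to hold at most $k$ counters, and the threshold is assumed to be $n/k$ for a stream of $n$ items.

```python
def space_saving(stream, k):
    """Summarize a stream with at most k counters (sequential Space Saving sketch).

    Returns (counters, n): counters maps each monitored item to its
    estimated count, and n is the number of items processed.
    """
    counters = {}
    n = 0
    for item in stream:
        n += 1
        if item in counters:
            # Item is already monitored: increment its counter.
            counters[item] += 1
        elif len(counters) < k:
            # A free counter is available: start monitoring the item.
            counters[item] = 1
        else:
            # Summary is full: evict an item with the minimum count and
            # let the new item inherit that count plus one.
            victim = min(counters, key=counters.get)
            counters[item] = counters.pop(victim) + 1
    return counters, n


def frequent_candidates(counters, n, k):
    """Return monitored items whose estimated count exceeds n / k.

    Estimated counts never underestimate true counts, so every true
    k-majority item is reported; false positives are possible.
    """
    threshold = n / k
    return {item: c for item, c in counters.items() if c > threshold}


if __name__ == "__main__":
    # Hypothetical toy stream, used only to exercise the sketch.
    data = ["a"] * 5 + ["b"] * 3 + ["c", "d"]
    summary, total = space_saving(data, k=3)
    print(frequent_candidates(summary, total, k=3))
```

Because the counters always sum to $n$, the minimum counter is at most $n/k$, so any item occurring more than $n/k$ times is guaranteed to remain in the summary. A second pass over the data would be needed to discard false positives.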
We introduce a transaction database distribution scheme that divides the frequent item set mining ta...
Frequent itemset mining is an important building block in many data mining applications like market ...
In big data analysis, frequent itemsets mining plays a key role in mining associations, correlations...
Given an array A of n elements and a value 2≤k≤n, a frequent item or k-majority element is an elemen...
Recently, several algorithms based on the MapReduce framework have been proposed for frequent patter...
We present a deterministic parallel algorithm for the k-majority problem, that can be used to find i...
In this paper we present PFDCMSS (Parallel Forward Decay Count-Min Space Saving) which, to the best ...
Data mining is an emerging research area, whose goal is to discover potentially useful information e...
The problem of closed frequent itemset discovery is a fundamental problem of d...
In today’s world, large volumes of data are being continuously generated by many scientific applicat...
Frequent-itemset mining is an important part of data mining. It is a computational and memory intens...
Cataloged from PDF version of article. We introduce a transaction database distribution scheme that d...
We present scalable parallel algorithms with sublinear per-processor communication volume and low la...
Ankara : The Department of Computer Engineering and the Graduate School of Engineering and Science o...