Compression-based pattern mining has been successfully applied to many data mining tasks. We propose an approach based on the minimum description length principle to extract sequential patterns that compress a database of sequences well. We show that mining compressing patterns is NP-hard and belongs to the class of inapproximable problems. We propose two heuristic algorithms for mining compressing patterns. The first uses a two-phase approach similar to Krimp for itemset data. To overcome performance issues with the required candidate generation, we propose GoKrimp, an effective greedy algorithm that directly mines compressing patterns. We conduct an empirical study on six real-life datasets to compare the proposed algorithms by run time, compressibi...
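The idea of picking patterns because they shorten the description of the data can be illustrated with a small sketch. This is not Krimp or GoKrimp: it is a toy greedy loop written under simplifying assumptions that are not in the abstract. Patterns are matched only as contiguous runs, the description length is a plain symbol count (data symbols plus, for each dictionary pattern, its symbols and one code word), and the names `encode`, `total_length`, `candidates`, and `greedy_compress` are hypothetical.

```python
from collections import Counter


def encode(seq, pattern, code):
    """Replace non-overlapping contiguous occurrences of `pattern` by `code`, left to right."""
    out, i, n = [], 0, len(pattern)
    while i < len(seq):
        if tuple(seq[i:i + n]) == pattern:
            out.append(code)
            i += n
        else:
            out.append(seq[i])
            i += 1
    return out


def total_length(db, dictionary):
    """Toy description length: symbols left in the data plus the cost of the dictionary.

    Assumption: every symbol costs 1, and each pattern costs its length plus one code word.
    """
    data_cost = sum(len(s) for s in db)
    model_cost = sum(len(p) + 1 for p in dictionary)
    return data_cost + model_cost


def candidates(db, max_len=3):
    """Contiguous n-grams (2 <= n <= max_len) that occur at least twice in the database."""
    counts = Counter()
    for s in db:
        for n in range(2, max_len + 1):
            for i in range(len(s) - n + 1):
                counts[tuple(s[i:i + n])] += 1
    return [p for p, c in counts.items() if c >= 2]


def greedy_compress(db, max_len=3):
    """Repeatedly add the pattern that lowers the total length most; stop when none does."""
    db = [list(s) for s in db]
    dictionary = []
    while True:
        base = total_length(db, dictionary)
        best = None
        for p in candidates(db, max_len):
            code = ("P", len(dictionary))  # fresh code symbol for this pattern
            new_db = [encode(s, p, code) for s in db]
            cost = total_length(new_db, dictionary + [p])
            if cost < base and (best is None or cost < best[0]):
                best = (cost, p, new_db)
        if best is None:
            return dictionary, db
        dictionary.append(best[1])
        db = best[2]


if __name__ == "__main__":
    sequences = ["abcabcabc", "xabcx", "abcy"]
    patterns, compressed = greedy_compress(sequences)
    print(patterns)
    print(total_length([list(s) for s in sequences], []), "->", total_length(compressed, patterns))
```

On the three toy sequences in the usage example, the loop selects the single pattern `('a', 'b', 'c')` and the toy description length drops from 18 to 12 symbols. A realistic encoding would allow gaps between pattern events and use code lengths derived from usage frequencies, neither of which this sketch attempts.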
Abstract. Sequential pattern mining is an important data mining task with wide applications. However...
Constraint-based pattern discovery is at the core of numerous data mining task...
Data mining is an interactive and iterative process. It is very likely that a user will execute a s...
Pattern mining based on data compression has been successfully applied in many data mining tasks. Fo...
Current sequential pattern mining algorithms often produce a large number of patterns. It is difficu...
Pattern mining is one of the best-known concepts in Data Mining. A big problem in pattern mining is ...
We propose a streaming algorithm, based on the minimal description length (MDL) principle, for extra...
Distinguishing sequential patterns are useful in characterizing a given sequence class and contrasti...
Sequential pattern mining, first proposed by Agrawal and Srikant, has received intensive research due ...
Most pattern mining methods yield a large number of frequent patterns, and isolating a small relevan...
The main advantage of Constraint Programming (CP) approaches for sequential pattern mining (SPM) is ...
We study how to use SP-Feature to compress sequential patterns. SP-Feature is a novel structure for concisely representing sequential patterns. A new similarity measure is used to cluster SP-Features, and a method for merging SP-Features is also given. ...
Data mining is a set of methods used in the process of KDD (Knowledge Discovery in Data) in order t...
Abstract. This paper proposes a novel algorithm for mining closed frequent sequences, a scalable, co...