We propose a streaming algorithm, based on the minimum description length (MDL) principle, for extracting non-redundant sequential patterns. For static databases, the MDL-based approach, which selects patterns by their capacity to compress data rather than by their frequency, has been shown to be remarkably effective at extracting meaningful patterns and resolving the redundancy issue in frequent itemset and sequence mining. The existing MDL-based algorithms, however, either start from a seed set of frequent patterns or require multiple passes through the data. As such, they scale poorly and are unsuitable for large datasets. Therefore, our main contribution is the proposal of a new, streaming algorithm, called Zips, that doe...
Current sequential pattern mining algorithms often produce a large number of patterns. It is difficu...
Sequential pattern mining is an interesting data mining problem with many real-world applic...
In recent years the emergence of new real-world applications such as network traffic monitoring, int...
Pattern mining based on data compression has been successfully applied in many data mining tasks. Fo...
Compression based pattern mining has been successfully applied to many data mining tasks. We propose...
In recent years, emerging applications introduced new constraints for data min...
Pattern mining is one of the best-known concepts in Data Mining. A big problem in pattern mining is ...
Sequential pattern mining in a data stream environment is an interesting data mining problem. The pro...
Most pattern mining methods yield a large number of frequent patterns, and isolating a small relevan...
Discovering frequent patterns over event sequences is an important data mining problem. Ex...
Discovering the key structure of a database is one of the main goals of data mining. In pattern set ...