Abstract. We consider the problem of mining subsequences with surprising event counts. When mining patterns, we often test a very large number of potentially present patterns, which makes spurious results highly likely. Typically, this problem grows as the size of the data increases. Existing methods for statistical testing are not usable for mining patterns in big data, because they either are computationally too demanding or fail to take into account the dependency structure between patterns, so that true findings go unnoticed. We propose a new method to compute the significance of event frequencies in subsequences of a long data sequence. The method is based on analyzing the joint distribution of the patterns, omit...
Abstract. We consider the sequence comparison problem, also known as the “hidden” pattern problem, wher...
In this thesis, we study scalable and general purpose methods for mining frequent sequences that sat...
We study a problem of mining frequently occurring periodic patterns with a gap requirement from sequ...
1 Introduction Detecting subsequence patterns in event sequences is important in many applications, ...
We propose a new frequent substring pattern mining method that can enumerate all substrings with statistical...
Abstract — There is a huge wealth of sequence data available, for example, customer purchase histori...
We study here so-called subsequence pattern matching, also known as hidden pattern matching, in wh...
Hypothesis testing using constrained null models can be used to compute the significance of data min...
In order to find patterns in data, it is often necessary to aggregate or summarise data at a higher ...
Discovering interesting patterns in event sequences is a popular task in the fi...
Many types of data, e.g., natural language texts, biological sequences, or time series of sensor dat...