We propose a new frequent substring pattern mining method that enumerates all substrings whose locally optimal occurrences have statistically significant frequencies in a given single sequence. Our target application is genome sequences, about half of which is said to be covered by interspersed and consecutive (tandem) repeats; detecting these repeats is an important task in molecular life sciences. We evaluate the statistical significance of frequent substrings using a string generation model with a memoryless stationary information source. We combine this idea with an existing algorithm, ESFLOO-0G.C (Nakamura et al. 2016), to enumerate all statistically significant substrings with locally optimal occurrences. We further develop a para...
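To make the idea of testing a substring's frequency against a memoryless stationary source concrete, the following is a minimal sketch and not the method of the paper. It assumes that symbol probabilities are estimated from the sequence itself, that every start position is an independent trial (which ignores the dependence between overlapping positions), and that all overlapping occurrences are counted rather than the locally optimal occurrences used by ESFLOO-0G.C. All function names are hypothetical, and the direct binomial sum is only suitable for short sequences of a few hundred symbols.

```python
from collections import Counter
from math import comb, prod


def symbol_probs(seq):
    """Estimate i.i.d. symbol probabilities from the sequence (an assumption)."""
    counts = Counter(seq)
    return {s: c / len(seq) for s, c in counts.items()}


def count_occurrences(seq, pat):
    """Count all occurrences of pat in seq, overlapping ones included."""
    m = len(pat)
    return sum(1 for i in range(len(seq) - m + 1) if seq[i:i + m] == pat)


def binomial_upper_tail(k, n, p):
    """P[X >= k] for X ~ Binomial(n, p), by direct summation."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def pattern_p_value(seq, pat):
    """Return (observed count, expected count, upper-tail p-value).

    Under the memoryless model, pat starts at a given position with
    probability equal to the product of its symbol probabilities.
    Treating start positions as independent trials is a simplification.
    """
    probs = symbol_probs(seq)
    p = prod(probs.get(ch, 0.0) for ch in pat)
    positions = len(seq) - len(pat) + 1
    k = count_occurrences(seq, pat)
    return k, positions * p, binomial_upper_tail(k, positions, p)


if __name__ == "__main__":
    observed, expected, p_val = pattern_p_value("ACGTACGTACGT", "ACGT")
    print(observed, expected, p_val)
```

A substring would be reported as significant when the returned p-value falls below a chosen threshold; since many candidate substrings are tested, a multiple-testing correction would normally be applied to that threshold.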
De-identifying textual data is an important task for publishing and sharing the data among researche...
The problem of discovering frequent arrangements of regions of high occurrence of one or more items ...
The enormous growth of biomolecular databases makes it increasingly important to have fast and autom...
We consider a data mining problem in a large collection of unstructured texts based on association r...
We study a problem of mining frequently occurring periodic patterns with a gap requirement from sequ...
Bio-data analysis deals with the vital discovery problems of similarity search and finding rel...
In this paper, we study the following problem: given a set of genome sequences, find common subseque...
Frequent graph mining has received considerable attention from researchers. Existing algorithms for ...
The problem of characterizing and detecting recurrent sequence patterns such as substrings or motifs...
Since the early stages of bioinformatics, substrings have played a crucial role in the search and discovery o...
We consider the problem of mining subsequences with surprising event counts. When mining ...