In this paper, we study the problem of finding common subsequences in a given set of genome sequences. We assume that the sequences are generated by some fixed but unknown pattern. The authors developed a method, called “substring amplification,” to find the template part of a pattern from semi-structured documents, such as HTML files, generated by that pattern. Substring amplification exploits the disparity in frequency distributions between the template and background parts, and so requires only positive data. In HTML files, many different characters are used and each contiguous part of a template is sufficiently long, in contrast to genome sequences. In this paper, we examine the applicability of the method to genome sequences in which a consta...
Abstract: Most existing sequence mining algorithms focus on mining for subsequences. A large cla...
Bio-data analysis deals with the vital discovery problem of similarity search and finding rel...
The emergence of automated high-throughput sequencing technologies has resulted in a huge increase o...
In this paper, we consider the problem of finding a set of substrings common to given strings. We define this pr...
We propose a new frequent substring pattern mining method that can enumerate all substrings with statistical...
The enormous growth of biomolecular databases makes it increasingly important to have fast and autom...
In recent years, we have seen a rapid increase in the available DNA and protein data coming from var...
Genomics, with the large amount of heterogeneous data that it is generating, is opening many interest...