This thesis describes an approach to data-driven discovery of decision trees or rules for assigning protein sequences to functional families using sequence motifs. The method captures regularities that can be described in terms of the presence or absence of arbitrary combinations of motifs. A training set of peptidase sequences labeled with the corresponding MEROPS functional families or clans is used to automatically construct decision trees that capture regularities sufficient to assign the sequences to their respective functional families. The performance of the resulting decision tree classifiers is then evaluated on an independent test set. Results of experiments show that the proposed approach matches or outperforms protein fun...
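The idea of a decision tree over motif presence or absence can be illustrated with a minimal sketch. This is not the thesis's implementation: the motif patterns, the toy sequences, the single-letter family labels, and the function names (`featurize`, `build_tree`, `classify`) are all assumptions made for illustration, and the tree is grown with a plain information-gain (ID3-style) heuristic rather than whatever learner the author used.

```python
import math
import re
from collections import Counter

# Hypothetical motifs (as regular expressions); not taken from MEROPS or the thesis.
MOTIFS = ["HE..H", "G.SG", "C.{2}C"]

def featurize(seq, motifs=MOTIFS):
    """Map a sequence to a tuple of 0/1 motif-presence indicators."""
    return tuple(int(re.search(m, seq) is not None) for m in motifs)

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def build_tree(rows, labels, features):
    """Greedily grow a tree over binary features using information gain."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority  # leaf: pure node, or no motifs left to test

    def gain(f):
        g = entropy(labels)
        for v in (0, 1):
            sub = [lab for row, lab in zip(rows, labels) if row[f] == v]
            if sub:
                g -= len(sub) / len(labels) * entropy(sub)
        return g

    best = max(features, key=gain)
    rest = [f for f in features if f != best]
    node = {"motif": best}
    for v in (0, 1):
        idx = [i for i, row in enumerate(rows) if row[best] == v]
        if idx:
            node[v] = build_tree([rows[i] for i in idx], [labels[i] for i in idx], rest)
        else:
            node[v] = majority  # empty branch falls back to the parent's majority
    return node

def classify(tree, seq):
    """Follow motif present/absent branches until a leaf label is reached."""
    row = featurize(seq)
    while isinstance(tree, dict):
        tree = tree[row[tree["motif"]]]
    return tree

# Toy training set with made-up sequences and family labels.
TRAIN = [
    ("AAHELLHAA", "M"), ("KKHEAAHGG", "M"),
    ("TTGDSGPP", "S"), ("LLGASGWW", "S"),
    ("PPCAACKK", "C"), ("VVCLLCTT", "C"),
]
rows = [featurize(seq) for seq, _ in TRAIN]
labels = [lab for _, lab in TRAIN]
tree = build_tree(rows, labels, list(range(len(MOTIFS))))

print(classify(tree, "QQHEMMHQQ"))
```

Each internal node tests one motif, so a root-to-leaf path reads as a rule of the form "motif A present and motif B absent implies family X". A real experiment would replace the toy data with labeled training sequences and evaluate the tree on a held-out test set, as the abstract describes.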
Algorithms for extracting motifs from a family or subfamily remain an active research topic in bioinformatics....
Biology has become a data‐intensive research field. Coping with the flood of data from the new genom...
Given a functionally heterogeneous group of proteins, such as a large superfamily, or an entire data...
www.cs.iastate.edu/~honavar/aigroup.html This paper describes an approach to data-driven discovery o...
We use methods from Data Mining and Knowledge Discovery to design an algorithm for detecting motifs ...
Predicting the function of a protein from its sequence is typically addressed using sequence-similar...
Summary. Protein function prediction, i.e. classification of protein sequences according to their bi...
We describe a method for discovering active motifs in a set of related protein sequences. The method...
Discrete motifs that discriminate functional classes of proteins are useful for classifying new sequ...
Abstract. 32 consensus patterns for a set of functional regions and structural motifs in protein seque...
We introduce an unsupervised method for extracting meaningful motifs from biological sequence data. ...
Protein sequence motifs are gathering more and more attention in the field of sequence analysis. Th...
To classify proteins into functional families based on their primary sequences, existing classificat...
We use methods from data mining and knowledge discovery to design an algorithm for detecting motifs ...
Part 1: ANN-Classification and Pattern Recognition. In this study protein sequen...