Protein sequences vary in length and are not readily amenable to conventional data mining techniques, which require a mapping into a fixed-dimensional space. Thus, the majority of current methods for protein sequence classification are based on alignment of the query sequence with either a single sequence or a profile of the sequence family. We present a method for mapping protein sequences into a fixed-dimensional descriptor space. Descriptors such as amino acid content and amino acid pair association rules were used along with routinely available classification methods. An experiment on one hundred Pfam families showed a classification accuracy of 98% with a support vector machine classifier. Information gain based feature selection helped simpli...
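As a rough illustration of what mapping a variable-length sequence into a fixed-dimensional descriptor space can look like, the sketch below computes amino acid content (20 features) plus adjacent-pair frequencies (400 features). This is not the authors' implementation: the function name `describe` is hypothetical, and plain adjacent-pair frequencies are used here as a simplified stand-in for the pair association rules mentioned in the abstract.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
# All 400 ordered residue pairs, in a fixed order (AA, AC, AD, ...).
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def describe(sequence):
    """Map a protein sequence of any length to a 420-dimensional vector.

    Hypothetical helper: 20 residue fractions followed by 400 adjacent-pair
    fractions. Non-standard residue codes are ignored.
    """
    seq = sequence.upper()
    n = len(seq)

    # Amino acid content: fraction of the sequence made up by each residue.
    content = [seq.count(a) / n if n else 0.0 for a in AMINO_ACIDS]

    # Adjacent-pair frequencies (an assumed simplification, not the
    # association rules used in the paper).
    pair_counts = dict.fromkeys(PAIRS, 0)
    for i in range(n - 1):
        pair = seq[i:i + 2]
        if pair in pair_counts:
            pair_counts[pair] += 1
    total = max(n - 1, 1)
    pair_freqs = [pair_counts[p] / total for p in PAIRS]

    return content + pair_freqs


if __name__ == "__main__":
    short = describe("MKV")
    longer = describe("MKVLAAGIWRTEQLLKDSPY")
    # Both vectors have the same length regardless of sequence length.
    print(len(short), len(longer))
```

Because every sequence yields a vector of the same length, the output could be passed to any routinely available classifier, such as a support vector machine, without aligning the sequences first.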
Comparing protein sequences is an essential procedure that has many applications in the field of bio...
For the vast majority of proteins no experimental information about the three-dimensional structure ...
Motivation: Many proteins with vastly dissimilar sequences are found to share a common fold, as evid...
Predicting protein structure and function from amino acid sequences is a central aim of bioinformati...
To classify proteins into functional families based on their primary sequences, existing classificat...
Protein classification is an important problem in automated protein functional and structural annota...
Computational methods of predicting protein functions rely on detecting similarities among p...
The classification of protein sequences provides valuable insights into bioinformatics. Most existin...
In this paper, we propose a new protein map which incorporates various properties of amino acid...
Establishing functional relationships between multi-domain protein sequences is a non-trivial task. ...
Modern sequencing initiatives have uncovered a large amount of protein sequence data. The exponentia...
The need for quick gene categorization tools is growing as more genomes are sequenced. To evaluate a...
This capstone project examines the performance of existing embedding-based alignment-free methods f...
In this paper, we have proposed a novel alignment-free method for comparing the similarity of protei...