This thesis explores the utility of representation learning for bioinformatics applications. It proposes approaches for generating low dimensional embeddings for molecular sequences, which proved effective for downstream bioinformatics tasks such as protein classification and protein-protein interaction prediction. One specific theme of this thesis is to develop a scalable and computationally effective solution for large scale sequence comparisons and two successful approaches – one based on a hierarchy of models, the other on a hybrid of two methods – are presented. The representation learning approaches proposed in this thesis are generic and can be adapted for similar problems within bioinformatics and other domains
In this thesis we are concerned with constructing algorithms that address problems of biological rel...
Abstract. The classification of biological sequences is one of the significant challenges in bioinfo...
While many excellent induction algorithms are known for making predictions from databases in well-st...
A representation method is an algorithm that calculates numerical feature vectors for samples in a d...
Sequences are a crucial concept in the bioinformatics field, as they can represent different kinds o...
Motivation: Machine-learning models trained on protein sequences and their measured functions can in...
We introduce a new representation and feature extraction method for biological sequences. Named bio-...
Protein-Protein Interactions (PPIs) are a crucial mechanism underpinning the function of the cell. S...
Todays we have the opportunity to analyze huge sets of genomics and proteomics data. In my bachaleor...
In recent years we have witnessed an exponential increase in the amount of biological information, e...
An important open problem in molecular biology is how to use computational methods to understand the...
A number of protein sequences are found and added to the database but its functional properties are ...
DNA sequences are the basic data type that is processed to perform a generic study of biological dat...
An important open problem in molecular biology is how to use computational methods to under-stand th...
Abstract Background Machine learning techniques have been widely applied to biological sequences, e....
In this thesis we are concerned with constructing algorithms that address problems of biological rel...
Abstract. The classification of biological sequences is one of the significant challenges in bioinfo...
While many excellent induction algorithms are known for making predictions from databases in well-st...
A representation method is an algorithm that calculates numerical feature vectors for samples in a d...
Sequences are a crucial concept in the bioinformatics field, as they can represent different kinds o...
Motivation: Machine-learning models trained on protein sequences and their measured functions can in...
We introduce a new representation and feature extraction method for biological sequences. Named bio-...
Protein-Protein Interactions (PPIs) are a crucial mechanism underpinning the function of the cell. S...
Todays we have the opportunity to analyze huge sets of genomics and proteomics data. In my bachaleor...
In recent years we have witnessed an exponential increase in the amount of biological information, e...
An important open problem in molecular biology is how to use computational methods to understand the...
A number of protein sequences are found and added to the database but its functional properties are ...
DNA sequences are the basic data type that is processed to perform a generic study of biological dat...
An important open problem in molecular biology is how to use computational methods to under-stand th...
Abstract Background Machine learning techniques have been widely applied to biological sequences, e....
In this thesis we are concerned with constructing algorithms that address problems of biological rel...
Abstract. The classification of biological sequences is one of the significant challenges in bioinfo...
While many excellent induction algorithms are known for making predictions from databases in well-st...