Advances in high-throughput sequencing have increased the availability of microbiome sequencing data that can be exploited to characterize microbiome community structure in situ. We explore using word and sentence embedding approaches for nucleotide sequences since they may be a suitable numerical representation for downstream machine learning applications (especially deep learning). This work involves first encoding (“embedding”) each sequence into a dense, low-dimensional, numeric vector space. Here, we use Skip-Gram word2vec to embed k-mers, obtained from 16S rRNA amplicon surveys, and then leverage an existing sentence embedding technique to embed all sequences belonging to specific body sites or samples. We demonstrate that these repre...
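The first step described above, turning reads into k-mer "words" and forming Skip-Gram training pairs, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `kmers` and `skipgram_pairs`, the toy read, and the values `k=4` and `window=2` are assumptions chosen for readability. Training the embedding itself and the sentence-embedding step are not shown.

```python
from typing import List, Tuple


def kmers(sequence: str, k: int) -> List[str]:
    """Return all overlapping k-mers of a nucleotide sequence (stride 1)."""
    sequence = sequence.upper()
    # A sequence shorter than k yields an empty list.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def skipgram_pairs(tokens: List[str], window: int) -> List[Tuple[str, str]]:
    """Pair each center token with every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a token is never its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical toy read; real input would be 16S rRNA amplicon reads.
read = "ACGTACGGTC"
tokens = kmers(read, k=4)
pairs = skipgram_pairs(tokens, window=2)
```

Each `(center, context)` pair is one training example for a Skip-Gram model, which learns a dense vector per k-mer by predicting the context k-mer from the center k-mer.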
To analyze complex biodiversity in microbial communities, 16S rRNA marker gene sequences are often a...
The human microbiome, the ensemble of microorganisms found in and on the human body, plays a key rol...
Motivation: Self-Organizing Maps (SOMs) are readily-available bioinformatics methods for clustering ...
The increasing availability of microbiome survey data has led to the use of complex machine learning...
Embedding results were generated using 256 dimensional embeddings of 10-mers that were denoised. A: ...
Surveys of microbial populations in environmental niches of interest often utilize sequence variatio...
The 16S ribosomal RNA gene commonly serves as a molecular marker for investigating microbial communi...
Motivation: Combining a 16S rRNA (16S) gene database with metagenomic shotgun sequences promises unbia...
Background: Most of our knowledge about the remarkable microbial diversity on Earth comes from sequenc...
Microbial communities play an essential role in Earth’s ecosystems. The goal of this study was to in...
Background: Most of our knowledge about the remarkable microbial diversity on Earth comes fr...
Massively parallel high throughput sequencing technologies allow us to interrogate the micro...