Gene content has been shown to contain a strong phylogenetic signal, yet its use for phylogenetic questions is hampered by horizontal gene transfer and parallel gene loss, and until now it has required completely sequenced genomes. Here, we introduce an approach that applies the phylogenetic signal in gene content to any set of sequences by using signature genes for phylogenetic classification. The hundreds of publicly available genomes allow us to identify signature genes at various taxonomic depths, and we show how the presence of signature genes in an unspecified sample can be used to characterize its taxonomic composition. We identify 8,362 signature genes specific to 112 prokaryotic taxa. We show that these signature genes can be...
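As a rough illustration of the idea in the abstract, the sketch below treats a signature gene as a gene family that is common within a clade and absent from every genome outside it, and then counts signature-gene hits in a sample. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the function names, the 0.9 coverage threshold, the set-based presence/absence representation, and the toy genomes are all hypothetical, and real data would first require assigning sequences to gene families.

```python
from collections import Counter


def find_signature_genes(genome_genes, clade_members, min_coverage=0.9):
    """Return gene families common inside a clade and absent outside it.

    genome_genes:  dict mapping genome name -> set of gene family names
    clade_members: set of genome names that belong to the clade
    min_coverage:  assumed threshold; fraction of clade genomes that
                   must carry the gene for it to count as "common"
    """
    inside = [g for g in genome_genes if g in clade_members]
    outside = [g for g in genome_genes if g not in clade_members]
    if not inside:
        return set()

    # Count in how many clade genomes each gene family occurs.
    counts = Counter(gene for g in inside for gene in genome_genes[g])

    # Any gene family seen outside the clade is not unique to it.
    seen_outside = set().union(*(genome_genes[g] for g in outside))

    return {
        gene
        for gene, n in counts.items()
        if n / len(inside) >= min_coverage and gene not in seen_outside
    }


def profile_sample(sample_genes, signatures):
    """Count how many signature genes of each taxon occur in a sample."""
    return {taxon: len(sig & sample_genes) for taxon, sig in signatures.items()}


# Toy presence/absence data (hypothetical names, for illustration only).
genome_genes = {
    "genome_1": {"geneA", "geneB", "geneX"},
    "genome_2": {"geneA", "geneB", "geneY"},
    "genome_3": {"geneC", "geneX"},
    "genome_4": {"geneC", "geneY"},
}
clades = {
    "clade_A": {"genome_1", "genome_2"},
    "clade_B": {"genome_3", "genome_4"},
}

signatures = {
    taxon: find_signature_genes(genome_genes, members)
    for taxon, members in clades.items()
}

# A sample of unknown composition, given as the gene families detected in it.
sample = {"geneA", "geneB", "geneX"}
profile = profile_sample(sample, signatures)
print(signatures)
print(profile)
```

In this toy example, `geneX` and `geneY` occur in both clades and so are excluded, which mirrors how genes spread by horizontal transfer would fail the uniqueness requirement. Because the sample is only a set of detected gene families, no complete genome is needed to score it.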
Background: Previous methods of detecting the taxonomic origins of arbitrary sequence collec...
With the astonishing rate that genomic and metagenomic sequence data sets are accumulating, there ar...
Phylogenomics heavily relies on well-curated sequence data sets that comprise, for each gene, exclus...
Gene content has been shown to contain a strong phylogenetic signal, yet its usage for phylogenetic ...
Signature genes are genes that are unique to a taxonomic clade and are common within it. They contai...
BACKGROUND: Molecular phylogenetic methods are based on alignments of nucleic or peptidic sequences....
Background: Molecular phylogenetic methods are based on alignments of nucleic or peptidic se...
Organisms are unique physical entities in which information is stored and continuously processed. Th...
BACKGROUND: With the increased availability of sequenced genomes there have been several initiatives ...
Species phylogenies derived from comparisons of single genes are rarel...
Mathematical characterizations of biological sequences form one of the main elements of bioinformati...
For a variety of kinds of studies it is useful to have a collection of so-called “phylogenetic ma...
Phylogenetic research is often stymied by selection of a marker that leads to poor phylogenetic reso...