Three files containing sequences extracted from 1,049,210 bacterial genomes available from GenBank (release 252). Protein coding sequences were annotated with IDTAXA (PMID: 34541527) using taxon-specific KEGG groups (Bacteria_Protein_subset.fas.gz). These annotations were transferred to their corresponding (nucleotide) coding sequences (Bacteria_Nucleotide_subset.fas.gz). Intergenic regions were extracted from each genome and annotated by FindNonCoding (PMID: 34636849) for their overlap with any of 25 common bacterial non-coding RNAs in Rfam (v14). Intergenic regions were required to be at least 100 nucleotides long and contain no ambiguities (Bacteria_Intergenic_subset.fas.gz). Each subset contains only distinct sequences randomly ordered....
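The intergenic filtering criteria described above (minimum length of 100 nucleotides, no ambiguity codes, distinct sequences, random order) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `filter_intergenic`, the case-insensitive comparison, and the `seed` parameter are assumptions made for illustration only.

```python
import random

MIN_LENGTH = 100            # minimum intergenic length stated in the description
UNAMBIGUOUS = set("ACGT")   # assumption: any other symbol counts as an ambiguity


def filter_intergenic(seqs, min_length=MIN_LENGTH, seed=None):
    """Keep distinct, unambiguous sequences of sufficient length, in random order.

    Hypothetical helper: sequences are upper-cased before comparison, which is
    an assumption about how duplicates would be detected.
    """
    seen = set()
    kept = []
    for seq in seqs:
        s = seq.upper()
        if len(s) < min_length:
            continue  # too short
        if not set(s) <= UNAMBIGUOUS:
            continue  # contains an ambiguity code such as N
        if s in seen:
            continue  # duplicate of a sequence already kept
        seen.add(s)
        kept.append(s)
    # Shuffle with a local generator so the order is random but reproducible
    # when a seed is supplied.
    random.Random(seed).shuffle(kept)
    return kept
```

Passing a fixed `seed` makes the shuffled order reproducible; omitting it gives a different order on each run.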
BG7 is a new system for de novo bacterial, archaeal and viral genome annotation based on a new appro...
Genome annotation is subdivided into 2 phases: syntactical annotation, i.e. pr...
Motivation: Novel sequencing techniques can give access to organisms that are difficult to cultivate...
Genome sequences are annotated by computational prediction of coding sequences, followed by similari...
In this report we address the problem of accurate statistical modeling of DNA sequences, either codi...
Over the last years a great number of bacterial genomes were sequenced. Now one of the most importan...
Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, p...
Accurate automatic assignment of protein functions remains a challenge for genome annotation. We ha...
Motivation: The number of bacterial genomes being sequenced is increasing very rapidly and hence, it...
Bacterial genome annotations are accumulating rapidly in the GenBank database and the use of automat...
Background: Analysis of non-coding sequences in several bacterial genomes brought to the identifica...
The annotation of genomes from next-generation sequencing platforms needs to ...
FASTA files containing all bacterial proteins used to build databases in this study. "NCBI_Bacteria"...
The availability of next-generation sequences of transcripts from prokaryotic organisms offers the o...
The increasing availability of bacterial genome sequences and genome-wide laboratory analyses has op...