Background: Accurate structural annotation depends on well-trained gene prediction programs. Training data for gene prediction programs are often chosen randomly from a subset of high-quality genes that ideally represent the variation found within a genome. One aspect of gene variation is GC content, which differs across species and is bimodal in grass genomes. When gene prediction programs are trained on a subset of grass genes with random GC content, they are effectively trained on two classes of genes at once, which can be expected to reduce accuracy when genes are predicted in new genome sequences. Results: We find that gene prediction programs trained on grass genes with random GC content do not completely predict all ...
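The idea of treating high and low GC genes as separate training classes can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the example sequences, and the 0.60 cutoff are all hypothetical, chosen only for illustration, and a real cutoff would have to be read off the observed bimodal distribution.

```python
from typing import Dict, Tuple


def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous bases A, C, G, T."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Sequences with no unambiguous bases get 0.0 rather than raising an error.
    return gc / total if total else 0.0


def split_by_gc(
    genes: Dict[str, str], threshold: float = 0.60
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Partition genes into (high_gc, low_gc) training sets.

    The default threshold is an assumption for illustration only.
    """
    high: Dict[str, str] = {}
    low: Dict[str, str] = {}
    for gene_id, seq in genes.items():
        target = high if gc_content(seq) >= threshold else low
        target[gene_id] = seq
    return high, low


if __name__ == "__main__":
    # Hypothetical coding sequences, not real rice genes.
    example = {
        "geneA": "ATGGCCGCGGCGCTCGCCTGA",
        "geneB": "ATGAAATTTGATATTAAATAA",
    }
    high_gc, low_gc = split_by_gc(example)
    print("high GC:", sorted(high_gc))
    print("low GC:", sorted(low_gc))
```

Each resulting set could then be used to train its own gene model, so that no single model has to fit both GC classes at once.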
With the increasing number of plant genomes being sequenced, a major challenge is to accurately tran...
The African cultivated rice (Oryza glaberrima) was domesticated in West Africa 3000 years ago. Altho...
In recent years, a series of methods for genomic prediction (GP) have been established, and the ad...
Distribution of GC content, MAKER six HMMs gene predictions and novel genes predicted by the high an...
List of MSU-RGAP genes that are not in the MAKER six HMMs annotation, functional descriptions and th...
OrthoMCL orthogroups containing novel high and low GC gene predictions. OrthoMCL output listing the ...
AED curves from various MAKER annotation methods. Figure S1. AED curves of MAKER annotations of Oryz...
Venn diagram depicting the overlap between the rice GC-specific sixHMM annotation and IGRSP v7 annot...
Automated evidence-based gene building is a rapid and cost-effective way to provide reliable gene an...
RNA-sequencing data used to assess tissue and treatment specificity of the novel high and low GC gen...
The use of draft genomes of different species and re-sequencing of accessions and populations are no...
Nucleotide landscapes, that is, the way base composition is distributed along a genome, strongly var...
The large size and relative complexity of many plant genomes make creation, quality control, and di...