The most widely appreciated role of DNA is to encode protein, yet the exact portion of the human genome that is translated remains to be ascertained. We previously developed PhyloCSF, a widely used tool to identify evolutionary signatures of protein-coding regions using multispecies genome alignments. Here, we present the first whole-genome PhyloCSF prediction tracks for human, mouse, chicken, fly, worm, and mosquito. We develop a workflow that uses machine learning to predict novel conserved protein-coding regions and efficiently guide their manual curation. We analyze more than 1000 high-scoring human PhyloCSF regions and confidently add 144 conserved protein-coding genes to the GENCODE gene set, as well as additional coding regions withi...
The comparison of related genomes has emerged as a powerful lens for genome interpretation. Here we ...
In the last ten years, numerous complete and almost complete genome sequences have been made availab...
Large-scale reference data sets of human genetic variation are critical for the medical and function...
Motivation: As high-throughput transcriptome sequencing provides evidence for novel transcripts in m...
A complete and accurate set of human protein-coding gene annotations is perhaps the single most impo...
We present here a novel methodology for the identification of genome regions potentially spanning on...
Human gene catalogs are fundamental to the study of human biology and medicine. But they are all bas...
(n.b. figshare bug prevents me adding OGS Team as first authors. You could cite with Bentham doi ...
A primary motivation for sequencing the mouse genome was to accelerate the discovery of mammalian ge...