Concept recognition tools rely on the availability of textual corpora to assess their performance and enable the identification of areas for improvement. Typically, corpora are developed for specific purposes, such as gene name recognition. Gene and protein name identification are longstanding goals of biomedical text mining, and therefore a number of different corpora exist. However, phenotypes only recently became an entity of interest for specialized concept recognition systems, and hardly any annotated text is available for performance testing and training. Here, we present a unique corpus, capturing text spans from 228 abstracts manually annotated with Human Phenotype Ontology (HPO) concepts and harmonized by three curators, which can ...
Natural language descriptions of organismal phenotypes, a principal object of study in biology, are ...
Extracting biomedical named entities is one of the major challenges in automatic processing of biome...
This dataset documents the results achieved by applying four concept recognition systems to text ...
MOTIVATION: Automatic phenotype concept recognition from unstructured text remains a challenging tas...
Named-Entity Recognition is commonly used to identify biological entities such as proteins, genes, a...
MOTIVATION: Significant effort has been spent by curators to create coding systems for phenotypes su...
The Human Phenotype Ontology (HPO) is widely used in the rare disease community for differential dia...
The Human Phenotype Ontology (HPO) project, available at http://www.human-phenotype-ontology.org, pr...
Electronic health records and scientific articles possess differing linguistic characteristics that ...
Background: Named entity recognition is critical for biomedical text mining, where it is not...
A standardized, controlled vocabulary allows phenotypic information to be described in an unambiguou...