MOTIVATION: The identification of protein and gene names (PGNs) from the scientific literature requires semantic resources: terminological and lexical resources deliver the term candidates into PGN tagging solutions, and the gold standard corpora (GSC) train them to identify term parameters and contextual features. Ideally, all three resources, i.e. corpora, lexica and taggers, cover the same domain knowledge, and thus support identification of the same types of PGNs and cover all of them. Unfortunately, none of the three serves as a predominant standard, and for this reason it is worth exploring how these three resources comply with each other. We systematically compare different PGN taggers against publicly available corpora and analyze the...
The recognition and normalization of gene mentions in biomedical literature are crucial steps in bio...
Abstract: The immense volume of data which is now available from experiments in molecular biology has ...
We studied contrast and variability in a corpus of gene names to identify potential heuristics for u...
Motivation: The identification of protein and gene names (PGNs) from the scientific literature require...
BACKGROUND: Previously, gene normalization (GN) systems have mostly focused on disambiguation using c...
Background: Identification of gene and protein names in biomedical text is a challenging task as the...
Motivation: The recognition and normalization of textual mentions of gene and protein names is both ...
The automatic recognition of gene names and their corresponding database identifiers in biomedical t...
Background: Frequently, several alternative names are in use for biological objects such as genes an...
MOTIVATION: Biomedical entities, their identifiers and names, are essential in the representation of...
Abstract: Named entity (NE) recognition is a fundamental task in biological relationship mining. This ...
In this paper we discuss five different corpora annotated for protein names. We present several with...
Researchers tend to use their own or favourite gene names in scientific literature, even though ther...
Motivation: Biomedical entities, their identifiers and names, are essential in the representation of...