The Groningen Meaning Bank (GMB) project develops a corpus with rich syntactic and semantic annotations. Annotations in GMB are generated semi-automatically and stem from two sources: (i) initial annotations from a set of standard NLP tools, and (ii) corrections and refinements by human annotators. For example, at the part-of-speech level of annotation there are currently 18,000 such corrections, so-called Bits of Wisdom (BOWs). To apply this information to improve NLP processing, we experimented with using the BOWs to retrain the part-of-speech tagger and found that it can be improved to correct up to 70% of identified errors within held-out data. Moreover, an improved tagger helps to raise the performance of the parser. Preferring senten...
The coverage of a parser depends mostly on the quality of the underlying gramm...
The use of a corpus as a language resource is enhanced when it is part-of-speech (POS) tagged. There...
In natural language processing (NLP) annotation projects, we use inter-annotator agreement measures...
© 2005 Andrew MacKinlay. In natural language processing (NLP), a crucial subsystem in a wide range of ...
Corpus linguistic and language technological research needs empirical corpus data with nearly correc...
Part-of-speech tagging represents an important first step for most medical natural language processi...
This paper reports on one of the first steps in building a very large annotated database of American...
This article details a series of carefully designed experiments aiming at evaluating the influence ...
The creation of a gold standard corpus (GSC) is a very laborious and costly process. Silver standard...
Excluding all errors in manual annotation is hard to achieve, especially when the annotation stru...
In this thesis, we investigate methods for automatic detection, and to some extent correction, of gr...
Background: To train chunkers in recognizing noun phrases and verb phrases in biomedical text, an an...
Typically, accuracy is used to represent the performance of an NLP system. However, accuracy attainm...