This paper describes the algorithmic methods used in a German monolingual lexicon project dealing with a multimillion-entry lexicon. We describe the usefulness of the different kinds of information that can be extracted from the lexicon. For German nouns and adjectives, candidate inflection classes are detected automatically; forms that do not fit these classes are good error candidates. An n-gram model is used to find unusual letter combinations, which indicate either an error or a foreign-language entry. Regularity is exploited, especially for compounds, to obtain inflection information. In all algorithms, frequency information is used to select terms for correction. Quality information is attached to all entries. Generation and use of this quality...
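The n-gram check mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the model order, smoothing, or threshold, so everything below is an assumption: a character trigram model with add-alpha smoothing, trained on the lexicon itself, with hypothetical function names (`train_trigram_counts`, `avg_log_prob`, `error_candidates`) and a hypothetical `alphabet_size` of 60.

```python
import math
from collections import Counter


def train_trigram_counts(words):
    """Count character trigrams and their two-character contexts."""
    tri, bi = Counter(), Counter()
    for w in words:
        # "^^" marks the word start, "$" the word end (assumed padding scheme).
        padded = "^^" + w.lower() + "$"
        for i in range(len(padded) - 2):
            tri[padded[i:i + 3]] += 1
            bi[padded[i:i + 2]] += 1
    return tri, bi


def avg_log_prob(word, tri, bi, alphabet_size=60, alpha=1.0):
    """Average smoothed log-probability per trigram; lower means more unusual."""
    padded = "^^" + word.lower() + "$"
    n = len(padded) - 2
    total = 0.0
    for i in range(n):
        t = padded[i:i + 3]
        # Add-alpha smoothing so unseen trigrams get a small, non-zero probability.
        p = (tri[t] + alpha) / (bi[t[:2]] + alpha * alphabet_size)
        total += math.log(p)
    return total / n


def error_candidates(lexicon, threshold):
    """Return entries scoring below the threshold, most unusual first."""
    tri, bi = train_trigram_counts(lexicon)
    scored = [(avg_log_prob(w, tri, bi), w) for w in lexicon]
    return [w for score, w in sorted(scored) if score < threshold]
```

Entries flagged this way are only candidates: a low score may point to a typing error or, equally, to a legitimate foreign-language entry. In line with the abstract, frequency information could then be used to decide which candidates to review first; that step is not shown here.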
One of the biggest challenges in compiling a dictionary of a minority language is managing the large...
Word-level translational equivalences can be extracted from parallel texts by...
The article presents the scientific and methodological challenges for the development of an innovati...
The coverage of a parser depends mostly on the quality of the underlying gramm...
This paper presents linguistics-based methods and engineering methods for quality assurance in semi-...
The paper describes a procedure for the automatic generation of a large full-form lexicon of English...
An automated approach of extracting bilingual lexicon (or dictionary) from comparable, non-parallel ...
The Leipzig Corpora Collection offers free online access to 136 monolingual dictionaries enriched wi...
This article describes a lexical substitution dataset for German. The whole dataset contains 2,040 s...
The paper presents a large-scale computational subcategorisation lexicon for several thousand German...
This dissertation describes a natural language processing research in the field of nominal compounds...
Language documentation projects typically invest a lot of effort in creating digitized lexical resou...
In this thesis, three possible aspects of using linguistic (i.e. morpho-syntactic) knowledge for sta...
During the recent years, the use of linguistic data for language processing (semantic ambiguity reso...
Current working practice of established German dictionaries incorporates large corpora as the basis ...