Parsing is a step in natural language understanding that identifies the words in a sentence and the grammatical relations between them. Statistical parsers require a set of annotated data, called a treebank, to learn the grammar of a language and apply the learned model to new, unseen data. Such annotated data is not available for all languages, and developing it is time-consuming, tedious, and expensive. In this dissertation, we propose a method for treebanking from scratch using machine learning methods. We first propose a bootstrapping approach to initialize the data annotation process, with the aim of reducing the human intervention needed to annotate the data. After developing a small data set, we use this data to train a statistical parser. Th...
Ambiguity resolution in the parsing of natural language requires a vast repository of knowledge to g...
The purpose of this paper is to describe the TüBa-D/Z treebank of written German and to compare it t...
This paper reports on the TIGER Treebank, a corpus of currently 35,000 syntactically annotated Germa...
The development of frameworks that allow grammars for natural languages to be stated in a mathematically...
We present "Treebank Refinement", which is a method that tunes the representation of syntactic analy...
This thesis presents new techniques for parsing natural language. They are based on Markov Models, w...
Thesis not yet received by the library; data not verified. Differing title per translation...
We report on our ongoing work in developing the Irish Dependency Treebank, describe the results of t...
Treebanks are a valuable resource for the training of parsers that perform automatic annotation of u...
The purpose of this paper is to describe recent developments in the morphological, syntactic, and se...
Proceedings of the Workshop on Annotation and Exploitation of Parallel Corpora AEPC 2010. Editors:...
Manual development of deep linguistic resources is time-consuming and costly and therefore often des...
In this paper we present the Alpino Dependency Treebank and the tools that we have developed to faci...
Natural Language is highly ambiguous, on every level. This article describes a fast broad-coverage s...
In syntax, the trend nowadays is towards lexicalized grammar formalisms. It is now widely accepted t...