The trees in the Penn Treebank have a standard representation that involves complete balanced bracketing. This article proposes an alternative to this standard representation of the treebank. The proposed representation is lossless, yet it reduces the total number of brackets by 28%. This is made possible by omitting the redundant pairs of special brackets that encode initial and final embedding, using a technique proposed by Krauwer and des Tombe (1981). In terms of paired brackets, the maximum nesting depth in sentences decreases by 78%. A coverage of 99.9% is achieved with only five non-top levels of paired brackets. The observed shallowness of the reduced bracketing suggests that finite-state based methods for pars...
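The two quantities the abstract reports, the number of brackets and the maximum nesting depth, can be measured on a standard fully bracketed tree with a short sketch like the one below. It does not implement the reduced encoding of Krauwer and des Tombe (1981); the function name `bracket_stats` and the example sentence are hypothetical and used only for illustration. It assumes that literal parentheses in the text are escaped as tokens such as `-LRB-` and `-RRB-`, as in the Penn Treebank, so that every `(` and `)` character is structural.

```python
def bracket_stats(tree: str) -> tuple[int, int]:
    """Return (number of bracket pairs, maximum nesting depth) of a bracketed tree.

    Hypothetical helper for illustration; it only inspects the standard,
    fully balanced bracketing and does not perform any bracket reduction.
    """
    depth = 0        # current nesting level while scanning left to right
    max_depth = 0    # deepest level seen so far
    pairs = 0        # each opening bracket starts exactly one pair
    for ch in tree:
        if ch == "(":
            depth += 1
            pairs += 1
            max_depth = max(max_depth, depth)
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced: ')' without matching '('")
    if depth != 0:
        raise ValueError("unbalanced: missing ')'")
    return pairs, max_depth


# Made-up example tree in Penn Treebank style (not taken from the corpus).
example = "(S (NP (DT the) (NN cat)) (VP (VBD sat)))"
print(bracket_stats(example))  # (6, 3)
```

Running the same measurement before and after a bracket-reduction step would give the kind of comparison the abstract summarizes as percentages.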
We present an investigation into parsing the Penn Chinese Treebank using a Data-Oriented Parsing (DO...
Parsing efficiency within the context of tree adjoining grammars (TAGs) depend...
This article describes how a treebank of ungrammatical sentences can be created from a treebank of w...
A recently proposed balanced-bracket encoding (Yli-Jyrä and Gómez-Rodríguez 2017) has given us a way ...
Treebanks, such as the Penn Treebank, provide a basis for the automatic creation of broad coverage g...
This paper presents empirical studies and closely corresponding theoretical models of the performanc...
Structure preserving grammar compaction (SPC) is a simple CFG compaction technique originally descri...
Thesis (Master's), University of Washington, 2015. In this thesis, I present a new method of producing...
This paper is a contribution to the ongoing discussion on treebank annotation schemes and their impa...
with structural information (brackets). Parsing partially bracketed strings arises naturally in seve...
Proceedings of the Sixth International Workshop on Treebanks and Linguistic Theories. Editors: Ko...
Previous work on treebank parsing with discontinuous constituents using Linear Context-Free Rewritin...
In the last decade, the Penn treebank has become the standard data set for evaluating parsers. The f...