We present a novel approach to Data-Oriented Parsing (DOP). Like other DOP models, our parser utilizes syntactic fragments of arbitrary size from a treebank to analyze new sentences, but, crucially, it uses only those which are encountered at least twice. This criterion al-lows us to work with a relatively small but representative set of fragments, which can be employed as the symbolic backbone of sev-eral probabilistic generative models. For pars-ing we define a transform-backtransform ap-proach that allows us to use standard PCFG technology, making our results easily replica-ble. According to standard Parseval metrics, our best model is on par with many state-of-the-art parsers, while offering some comple-mentary benefits: a simple genera...
Treebank parsing can be seen as the search for an optimally refined grammar consistent with a coarse...
Statistical models for parsing natural language have recently shown considerable success in broad-co...
Inducing a grammar from text has proven to be a notoriously challenging learning task despite decade...
We present a novel approach to Data-Oriented Parsing (DOP). Like other DOP models, our parser utiliz...
We present a novel approach to Data-Oriented Parsing (DOP). Like other DOP models, our parser utiliz...
Excellent results have been reported for DataOriented Parsing (DOP) of natural language texts (Bod, ...
Data-Oriented Parsing (dop) ranks among the best pars-ing schemes, pairing state-of-the art parsing ...
This paper proposes an extension of Tree-DOP which approximates the LFG-DOP model. GF-DOP combines t...
honors thesisCollege of EngineeringComputingVivek SrikumarDue to the computational complexity of par...
Recent advances in parsing technology have made treebank parsing with discontinuous constituents p...
Statistical parsers are e ective but are typically limited to producing projective dependencies or c...
Statistical parsers are effective but are typically limited to producing projective dependencies or ...
In Data-Oriented Parsing (DOP) an annotated corpus is used as a stochastic grammar. The most probabl...
The robustness of probabilistic parsing generally comes at the expense of grammaticality judgementth...
Data Oriented Parsing (DOP) is based on the idea of processing new input by com-bining fragments (as...
Treebank parsing can be seen as the search for an optimally refined grammar consistent with a coarse...
Statistical models for parsing natural language have recently shown considerable success in broad-co...
Inducing a grammar from text has proven to be a notoriously challenging learning task despite decade...
We present a novel approach to Data-Oriented Parsing (DOP). Like other DOP models, our parser utiliz...
We present a novel approach to Data-Oriented Parsing (DOP). Like other DOP models, our parser utiliz...
Excellent results have been reported for DataOriented Parsing (DOP) of natural language texts (Bod, ...
Data-Oriented Parsing (dop) ranks among the best pars-ing schemes, pairing state-of-the art parsing ...
This paper proposes an extension of Tree-DOP which approximates the LFG-DOP model. GF-DOP combines t...
honors thesisCollege of EngineeringComputingVivek SrikumarDue to the computational complexity of par...
Recent advances in parsing technology have made treebank parsing with discontinuous constituents p...
Statistical parsers are e ective but are typically limited to producing projective dependencies or c...
Statistical parsers are effective but are typically limited to producing projective dependencies or ...
In Data-Oriented Parsing (DOP) an annotated corpus is used as a stochastic grammar. The most probabl...
The robustness of probabilistic parsing generally comes at the expense of grammaticality judgementth...
Data Oriented Parsing (DOP) is based on the idea of processing new input by com-bining fragments (as...
Treebank parsing can be seen as the search for an optimally refined grammar consistent with a coarse...
Statistical models for parsing natural language have recently shown considerable success in broad-co...
Inducing a grammar from text has proven to be a notoriously challenging learning task despite decade...