Statistical parsers trained on labeled data suffer from sparsity, both grammatical and lexical. For parsers based on strongly lexicalized grammar formalisms (such as CCG, which has complex lexical categories but simple combinatory rules), the problem of sparsity can be isolated to the lexicon. In this paper, we show that semi-supervised Viterbi-EM can be used to extend the lexicon of a generative CCG parser. By learning complex lexical entries for low-frequency and unseen words from unlabeled data, we obtain improvements over our supervised model for both in-domain (WSJ) and out-of-domain (questions and Wikipedia) data. Our learnt lexicons when used with a discriminative parser such as C&C also significantly improve its performance o...
Most current linguistic theories give lexical accounts of several phenomena that used to be consider...
We introduce a new CCG parsing model which is factored on lexical category assignments. Parsing is ...
In this paper we present preliminary experiments that aim to reduce lexical data sparsity in statist...
Current supervised parsers are limited by the size of their labelled training data, making improving...
We present methods to control the lexicon size when learning a Combinatory Categorial Grammar seman...
Using semi-supervised EM, we learn fine-grained but sparse lexical parameters of a generative parsing...
This article uses semi-supervised Expectation Maximization (EM) to learn lexico-syntactic dependenci...
Ambiguity resolution in the parsing of natural language requires a vast repository of knowledge to g...
In this dissertation, we have proposed novel methods for robust parsing that integrate the flexibili...
Thesis (Ph.D.)--University of Washington, 2016-01. Combinatory Categorial Grammar (CCG) is a widely st...
State-of-the-art parsers suffer from incomplete lexicons, as evidenced by the fact that they all co...
Several recent stochastic parsers use bilexical grammars, where each word type idiosyncratically pre...
Generative lexicalized parsing models, which are the mainstay for probabilistic parsing of English, ...
Combinatory Categorial Grammar (CCG) is a lexicalized grammar formalism in which words are associate...
We describe a new self-learning framework for parser lexicalisation that requires only a plain-text ...