Pretrained language models are generally acknowledged to be able to encode syntax [Tenney et al., 2019, Jawahar et al., 2019, Hewitt and Manning, 2019]. In this article, we propose UPOA, an Unsupervised constituent Parsing model that calculates an Out Association score solely based on the self-attention weight matrix learned in a pretrained language model as the syntactic distance for span segmentation. We further propose an enhanced version, UPIO, which exploits both inside association and outside association scores for estimating the likelihood of a span. Experiments with UPOA and UPIO disclose that the linear projection matrices for the query and key in the self-attention mechanism play an important role in parsing. We therefore extend t...
We explore deep clustering of multilingual text representations for unsupervised model interpretatio...
In natural language generation using symbolic grammars, state-of-the-art realisation rankers use sta...
We investigate the task of unsupervised constituency parsing from bilingual parallel corpora. Our go...
There are many methods to improve performances of statistical parsers. Among them, resolving structu...
We propose a spectral approach for un-supervised constituent parsing that comes with theoretical gua...
Constituency Parse Extraction from Pre-trained Language Models (CPE-PLM) is a recent paradigm that a...
Based on simple methods such as observing word and part of speech tag co-occurrence and clustering, ...
textThe subject matter of this thesis is the problem of learning to discover grammatical structure f...
Making an informed choice of pre-trained language model (LM) is critical for performance, yet enviro...
Humans, even from infancy, are capable of unsupervised (“sta- tistical”) learning of linguistic info...
Much of the recent work on depen-dency parsing has been focused on solv-ing inherent combinatorial p...
Much of the recent work on depen-dency parsing has been focused on solv-ing inherent combinatorial p...
Many different metrics exist for evaluating parsing results, including Viterbi, Crossing Brackets Ra...
Language models are an important component of speech recognition. They aim to predict the probabilit...
Statistical models for parsing natural language have recently shown considerable success in broad-co...
We explore deep clustering of multilingual text representations for unsupervised model interpretatio...
In natural language generation using symbolic grammars, state-of-the-art realisation rankers use sta...
We investigate the task of unsupervised constituency parsing from bilingual parallel corpora. Our go...
There are many methods to improve performances of statistical parsers. Among them, resolving structu...
We propose a spectral approach for un-supervised constituent parsing that comes with theoretical gua...
Constituency Parse Extraction from Pre-trained Language Models (CPE-PLM) is a recent paradigm that a...
Based on simple methods such as observing word and part of speech tag co-occurrence and clustering, ...
textThe subject matter of this thesis is the problem of learning to discover grammatical structure f...
Making an informed choice of pre-trained language model (LM) is critical for performance, yet enviro...
Humans, even from infancy, are capable of unsupervised (“sta- tistical”) learning of linguistic info...
Much of the recent work on depen-dency parsing has been focused on solv-ing inherent combinatorial p...
Much of the recent work on depen-dency parsing has been focused on solv-ing inherent combinatorial p...
Many different metrics exist for evaluating parsing results, including Viterbi, Crossing Brackets Ra...
Language models are an important component of speech recognition. They aim to predict the probabilit...
Statistical models for parsing natural language have recently shown considerable success in broad-co...
We explore deep clustering of multilingual text representations for unsupervised model interpretatio...
In natural language generation using symbolic grammars, state-of-the-art realisation rankers use sta...
We investigate the task of unsupervised constituency parsing from bilingual parallel corpora. Our go...