This paper presents an algorithm for the unsupervised learning of a simple morphology of a natural language from raw text. A generative probabilistic model is applied to segment word forms into morphs. The morphs are assumed to be generated by one of three categories, namely prefix, suffix, or stem, and we make use of some observed asymmetries between these categories. The model learns a word structure, where words are allowed to consist of lengthy sequences of alternating stems and affixes, which makes the model suitable for highly-inflecting languages. The ability of the algorithm to find real morpheme boundaries is evaluated against a gold standard for both Finnish and English. In comparison with a state-of-the-art algorithm the ...
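The abstract above mentions evaluating morpheme boundaries against a gold standard but does not say how the score is computed. One common choice, assumed here rather than taken from the paper, is precision and recall over boundary positions. The following is a minimal sketch; the function names `boundaries` and `boundary_scores` and the toy segmentations are hypothetical.

```python
def boundaries(morphs):
    """Return the character offsets of the internal morph boundaries of one word."""
    positions, offset = set(), 0
    for morph in morphs[:-1]:  # the end of the last morph is the word end, not a boundary
        offset += len(morph)
        positions.add(offset)
    return positions


def boundary_scores(predicted, gold):
    """Compute boundary precision and recall.

    predicted, gold: dicts mapping a word to its list of morphs
    (assumed data layout; every gold word must also appear in predicted).
    """
    tp = fp = fn = 0
    for word, gold_morphs in gold.items():
        pred_b = boundaries(predicted[word])
        gold_b = boundaries(gold_morphs)
        tp += len(pred_b & gold_b)   # boundaries found in both
        fp += len(pred_b - gold_b)   # proposed but not in the gold standard
        fn += len(gold_b - pred_b)   # in the gold standard but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy example (invented segmentations, not data from the paper).
gold = {"unhelpful": ["un", "help", "ful"], "cats": ["cat", "s"]}
predicted = {"unhelpful": ["un", "helpful"], "cats": ["ca", "ts"]}
precision, recall = boundary_scores(predicted, gold)
```

Scoring boundary positions rather than whole-word matches gives partial credit to a segmentation that gets some, but not all, of a word's boundaries right.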
Many Uralic languages have a rich morphological structure, but lack tools of morphological analysis ...
We present two methods for unsupervised segmentation of words into morpheme-like units. The model ...
We describe a simple method of unsupervised morpheme segmentation of words in an unknown language. A...
This work presents an algorithm for the unsupervised learning, or induction, of a simple morphology ...
The field of statistical natural language processing has been turning toward morphologically rich l...
In order to develop computer applications that successfully process natural language data (text and ...
This dissertation presents a new computationally implemented linguistic model for morphological anal...
The morphology of a language is a knowledge of the ways in which the language’s words can change in ...
In this paper we describe a method to morphologically segment highly agglutinating and inflectional ...
This thesis contains work on a specific problem in the field of Language Technology. The problem can be d...
Most of the world’s natural languages have complex morphology. But the expense of building morpholog...
We present two methods for unsupervised segmentation of words into morpheme-like units. The model ut...
This paper attempts to participate in the ongoing discussion in search of a suitable model for the c...
Supervised morphological paradigm learning by identifying and aligning the longest common subsequen...
This dissertation develops an algorithmic approach to linguistics through the study of topics in uns...