Big languages such as English and Finnish have many natural language processing (NLP) resources and models, but this is not the case for low-resourced and endangered languages as such resources are so scarce despite the great advantages they would provide for the language communities. The most common types of resources available for low-resourced and endangered languages are translation dictionaries and universal dependencies. In this paper, we present a method for constructing word embeddings for endangered languages using existing word embeddings of different resource-rich languages and the translation dictionaries of resource-poor languages. Thereafter, the embeddings are fine-tuned using the sentences in the universal dependencies and a...
This thesis describes our improvement of word sense translation for under-resourced languages utiliz...
The world harbors a diversity of some 6,500 mutually unintelligible languages. As has been increasin...
The world harbors a diversity of some 6,500 mutually unintelligible languages. As has been increasin...
In this paper, we present an approach for translating word embeddings from a majority language into ...
A sentiment analyzer and cross-lingual word embeddings for endangered languages (e.g., Erzya, Moksha...
Many endangered Uralic languages have multilingual machine readable dictionaries saved in an XML for...
This paper examines approaches to gener-ate lexical resources for endangered lan-guages. Our algorit...
The Kamusi Project, a multilingual online dictionary website, has as one of its goals to document th...
Cross-lingual representations of words enable us to reason about word meaning in multilingual contex...
Jiawei ZhaoCurrent machine translation techniques were developed using predominantly rich resource l...
Word embeddings - dense vector representations of a word’s distributional semantics - are an indespe...
We present our work towards building an infrastructure for documenting endangered languages with the...
About half of the living languages and dialects, including the Greek ones, are endangered, and langu...
These databases are translated from SemFi by using Giellatekno XML dictionaries. The included python...
International audienceThe paper presents a method for parsing low-resource languages with very small...
This thesis describes our improvement of word sense translation for under-resourced languages utiliz...
The world harbors a diversity of some 6,500 mutually unintelligible languages. As has been increasin...
The world harbors a diversity of some 6,500 mutually unintelligible languages. As has been increasin...
In this paper, we present an approach for translating word embeddings from a majority language into ...
A sentiment analyzer and cross-lingual word embeddings for endangered languages (e.g., Erzya, Moksha...
Many endangered Uralic languages have multilingual machine readable dictionaries saved in an XML for...
This paper examines approaches to gener-ate lexical resources for endangered lan-guages. Our algorit...
The Kamusi Project, a multilingual online dictionary website, has as one of its goals to document th...
Cross-lingual representations of words enable us to reason about word meaning in multilingual contex...
Jiawei ZhaoCurrent machine translation techniques were developed using predominantly rich resource l...
Word embeddings - dense vector representations of a word’s distributional semantics - are an indespe...
We present our work towards building an infrastructure for documenting endangered languages with the...
About half of the living languages and dialects, including the Greek ones, are endangered, and langu...
These databases are translated from SemFi by using Giellatekno XML dictionaries. The included python...
International audienceThe paper presents a method for parsing low-resource languages with very small...
This thesis describes our improvement of word sense translation for under-resourced languages utiliz...
The world harbors a diversity of some 6,500 mutually unintelligible languages. As has been increasin...
The world harbors a diversity of some 6,500 mutually unintelligible languages. As has been increasin...