Dialectal data and normalization models presented in the following paper: Hämäläinen, M., Alnajjar, K., & Tuisk, T. (2022). Help from the Neighbors: Estonian Dialect Normalization Using a Finnish Dialect Generator. In The Proceedings of The Third Workshop for Deep Learning for Low Resource NLP.
Estonian Etymological Dictionary being compiled at the Institute of the Estonian Language. A limited...
In this paper, we describe the Nordic Dialect Corpus, which has recently been completed. The corpus ...
Proceedings of the 16th Nordic Conference of Computational Linguistics NODALIDA-2007. Editors: Jo...
This paper presents Murreviikko, a dataset of dialectal Finnish tweets which have been dialectologic...
The data used in the paper "Finnish Dialect Identification: The Effect of Audio and Text". If you u...
Data used in our Swedish normalization paper: Hämäläinen, M; Partanen, N & Alnajjar, K (2020) Norma...
Language label tokens are often used in multilingual neural language modeling and sequence-to-sequen...
Text normalization methods have been commonly applied to historical language or user-generated conte...
Hämäläinen, M., Partanen, N., Alnajjar, K., Rueter J. & Poibeau T. (2020). Automatic Dialect Adaptat...
Recordings of different Estonian dialects, 900000 words, transcribed and partly (400000 words) morph...
We adopt automatic language recognition methods to study dialect levelling, a phenomenon that lead...
This paper introduces our work on adapting a rule-based parser of spoken Estonian to the morphologi...
Traditional Estonian dialect classifications are based on the phonology, morphology, and lexis, and ...
One particular problem in large vocabulary continuous speech recognition for low-resourced languages...
This white paper is part of a series that promotes knowledge about language technology and its poten...