In this thesis, we aim to explore the combination of different lexical normalization methods and provide a practical lexical normalization pipeline for Swedish student writing within the framework of SWEGRAM (Näsman et al., 2017). An important improvement in our implementation is that the pipeline design takes into account the unique morphological and phonological characteristics of the Swedish language. This localization makes the system more robust for Swedish, at the cost of being less applicable to similar tasks in other languages. The core of the localization lies in a phonetic algorithm we designed specifically for Swedish and a compound processing step that handles Swedish compounding. The proposed pipeline consists...
Historical texts are an important resource for researchers in the humanities. However, standard NLP ...
The paper describes S-VEX, the lexical acquisition component of the Swedish Core Language Engine (S-...
This master's thesis focuses on the evaluation of the Stockholm Umeå Corpus (SUC) as a ...
Corpora for second language (L2) learning may contain a receptive vocabulary, i.e., vocabulary that ...
Complex Word Identification (CWI) is the task of identifying complex words in text data and it is ofte...
The functional criteria of a lexical component for Swedish speech recognition and speech synthesis s...
This article examines current practices of normalization of names in Norse philology and computation...
In this study automatic lexical simplification via synonym replacement in Swedish was investigated u...
Data used in our Swedish normalization paper: Hämäläinen, M; Partanen, N & Alnajjar, K (2020) Norma...
This thesis investigates Swedish lexical blends. A lexical blend is defined as the concatenation of ...
Historical text constitutes a rich source of information for historians and other researchers in hum...
This paper presents a new lexical resource for learners of Swedish as a second language, SweLLex, an...
For commercial software with natural language functions, a high coverage is required. This implies t...
Advances in computational linguistics can provide new opportunities for historical linguistics, but ...