This work examines how tokenization methods affect the training of machine translation models. Alphabet tokenization, morpheme tokenization, and BPE tokenization were applied to Korean as the source language and English as the target language, and a comparison experiment was conducted in which the nine resulting models were each trained for 50,000 epochs with the Transformer architecture. Measured by BLEU score, the model that applied BPE tokenization to Korean and morpheme tokenization to English performed best, scoring 35.73.
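The BPE tokenization compared above learns a subword vocabulary by repeatedly merging the most frequent adjacent symbol pair, starting from character-level (alphabet) tokens. The following is a minimal sketch of that merge procedure on a hypothetical toy English corpus; the corpus, function names, and merge count are illustrative assumptions, not the paper's actual implementation.

```python
from collections import Counter

def get_pair_counts(vocab):
    # Count adjacent symbol pairs, weighted by word frequency.
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for i in range(len(symbols) - 1):
            pairs[(symbols[i], symbols[i + 1])] += freq
    return pairs

def merge_pair(pair, vocab):
    # Replace every occurrence of the pair with its concatenation.
    old, new = " ".join(pair), "".join(pair)
    return {word.replace(old, new): freq for word, freq in vocab.items()}

def learn_bpe(corpus, num_merges):
    # Start from character-level tokens (the alphabet-tokenization baseline),
    # then greedily merge the most frequent adjacent pair.
    vocab = Counter(" ".join(word) for word in corpus)
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        vocab = merge_pair(best, vocab)
        merges.append(best)
    return merges

# Hypothetical toy corpus; a real system would learn merges from the
# Korean or English side of the parallel training data.
print(learn_bpe(["low", "lower", "lowest", "newer", "wider"], 3))
```

The learned merge list is then applied in order to segment unseen words, which is what lets BPE handle Korean's rich morphology without an explicit morpheme analyzer.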