© 2021 The Authors. Published by ACL. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher's website: https://aclanthology.org/2021.findings-acl.86 While word segmentation is a solved problem in many languages, it is still a challenge in continuous-script or low-resource languages. Like other NLP tasks, word segmentation is domain-dependent, which can be a challenge in low-resource languages like Thai and Urdu since there can be domains with insufficient data. This investigation proposes a new solution to adapt an existing domain-generic model to a target domain, as well as a data augmentation technique to combat the low-resource problems. ...
We introduce a bilingually motivated word segmentation approach to languages where word boundaries a...
A Thai written text is a string of symbols without explicit word boundary markup. A method for a dev...
We introduce a word segmentation approach to languages where word boundaries are not orthographicall...
© 2020. Published by ACL. This is an open access article available under a Creative Commons licence...
Word segmentation is a problem in several Asian languages that have no explicit word boundary delimi...
A sentence is typically treated as the minimal syntactic unit used to extract valuable information f...
Thai is a low-resource language, so it is often the case that data is not available in sufficient qu...
Statistical Machine Translation (SMT) systems are usually trained on large amounts of bilingual text...
Abstract. Word segmentation is an important task in natural language processing, especially for lang...
Abstract. A boosting-based ensemble learning can be used to improve classification accuracy by using m...
Myanmar sentences are written as contiguous sequences of syllables with no characters delimiting the w...
We present a technique to improve out-of-domain statistical parsing by reducin...
Abstract. This paper discusses a Thai corpus, TaLAPi, fully annotated with word segmentation (WS), pa...
In the Thai language, word boundaries are not explicitly marked; therefore, word segmentation is needed ...
For languages without word boundary delimiters, dictionaries are needed for segmenting running texts...