The aim of this thesis is to design and implement a computational linguistic module for analysing Thai texts under the INTEX© system. Designed essentially for Indo-European languages written in the Latin alphabet, INTEX© encounters difficulties when processing a very different language such as Thai. The crucial problem is word and sentence segmentation: Thai has no word separator, so a sentence is written as a continuous sequence of letters, and sentence separators are frequently ambiguous. Accordingly, we have developed and evaluated two methods of word segmentation, the first using Regular Expressions and the second using Finite-State Transducers, which segment Thai texts into letters and syllables respectively. We have also created Thai E...
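To make the letter-level segmentation step more concrete, here is a minimal Python sketch of a regular-expression pass over Thai text. It is not the thesis's INTEX© implementation. It assumes that a "letter" unit is a base character together with any Thai combining marks (upper and lower vowel signs, tone marks and other diacritics) attached to it, and the names `THAI_MARKS`, `CLUSTER` and `split_clusters` are hypothetical.

```python
import re

# Assumption: Thai combining marks are U+0E31, U+0E34-U+0E3A and U+0E47-U+0E4E
# (vowel signs written above/below a consonant, tone marks, other diacritics).
THAI_MARKS = "\u0E31\u0E34-\u0E3A\u0E47-\u0E4E"

# One unit is a non-mark character followed by any marks that attach to it.
# The second alternative keeps stray marks that have no base character.
CLUSTER = re.compile(f"[^{THAI_MARKS}][{THAI_MARKS}]*|[{THAI_MARKS}]+")


def split_clusters(text):
    """Split text into base-character-plus-marks units (hypothetical helper)."""
    return CLUSTER.findall(text)


if __name__ == "__main__":
    # The sample word has four code points; the third is a combining mark.
    for unit in split_clusters("\u0E40\u0E1B\u0E47\u0E19"):
        print([f"U+{ord(ch):04X}" for ch in unit])
```

Joining the returned units always reproduces the input, so a later syllable-level or word-level stage can consume the units without losing any characters.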
For languages without word boundary delimiters, dictionaries are needed for segmenting running texts...
This PhD thesis focuses on the problems encountered when developing automatic speech recognition for...
This study aims to develop a word segmentation algorithm and solution for the Myanmar language. This is a ...
A Thai written text is a string of symbols without explicit word boundary markup. A method for a dev...
Word segmentation is a problem in several Asian languages that have no explicit word boundary delimi...
This paper discusses a Thai corpus, TaLAPi, fully annotated with word segmentation (WS), pa...
This thesis presents a method of Micro-Systemic Linguistic Analysis of Thai compound words. The aim ...
In the Thai language, word boundaries are not explicitly marked; therefore, word segmentation is needed ...
A sentence is typically treated as the minimal syntactic unit used to extract valuable information f...
The development of an information extraction (IE) system for Thai documents raises a number of issue...
The original publication is available at www.springerlink.com. We present in thi...
An increasing amount of electronically available information is stored in Asian language documents, ...
Word segmentation is an important task in natural language processing, especially for lang...
Thai romanization is a way of writing the Thai language using the Roman alphabet. It could be performed on ...