Abstract. This paper discusses a Thai corpus, TaLAPi, fully annotated with word segmentation (WS), part-of-speech (POS) and named entity (NE) information with the aim of providing a high-quality and sufficiently large corpus for real-life implementation of Thai language processing tools. The corpus contains 2,720 articles (1,043,471 words) from the entertainment and lifestyle (NE&L) domain and 5,489 articles (3,181,487 words) in the news (NEWS) domain, with a total of 35 POS tags and 10 named entity categories. In particular, we present an approach to segment and tag foreign and loan words expressed in transliterated or original form in Thai text corpora. We see this as an area for study as adapted and un-adapted foreign language sequences ...
A lot of research is currently ongoing in word segmentation and POS tagging developed differently wit...
Word segmentation and Part-of-Speech (POS) tagging are fundamental tasks in natural language process...
An increasing amount of electronically available information is stored in Asian language documents, ...
In this report we describe the Thai POS tagged corpus building, linguistic tools and some applications...
In the Thai language, word boundaries are not explicitly marked; therefore, word segmentation is needed ...
Word segmentation is a problem in several Asian languages that have no explicit word boundary delimi...
In Natural Language Processing (NLP), word segmentation and Part-of-Speech (POS) tagging are fundamen...
The aim of this thesis is to design and implement a computational linguistic module for analysing Th...
The development of an information extraction (IE) system for Thai documents raises a number of issue...
Abstract. Word segmentation is an important task in natural language processing, especially for lang...
In this paper, we analyze the syntactic structure of Myanmar grammatical categories to be able to us...
Information overload is a problem in the Information Age and Information visualization is an approac...
In the applications of Natural Language Processing (NLP), sentence analysis is one of the important ph...
In Natural Language Processing (NLP), word segmentation and Part-of-Speech (POS) tagging are fundamen...
A Thai written text is a string of symbols without explicit word boundary markup. A method for a dev...