Bigram language models are popular in many language processing applications, for both Indo-European and Asian languages. However, when a language model for Chinese is applied to a novel domain, its accuracy drops significantly, from 96% to 78% in our evaluation. We apply pattern recognition techniques (i.e., Bayesian, decision tree, and neural network classifiers) to discover language model errors. We examined two general types of features: model-based and language-specific features. In our evaluation, the Bayesian classifiers produced the best recall (80%), but their precision was low (60%). The neural network classifier produced good recall (75%) and precision (80%), but both the Bayesian and neural network classifiers had a low skip ratio (65%). The d...
Large-scale language models (LLMs) have shown remarkable capability in various Natural Language Pr...
English grammar error correction algorithm refers to the use of computer programming technology to a...
Spell checking is an important preprocessing task when dealing with user-generated texts such as twe...
In this paper, we propose a new method for effective error analysis of machine translation (MT) syst...
Error pattern detection is very helpful in Computer-Aided Pronunciation Training (CAPT). This paper ...
Prediction of language mistakes is a task introduced by Duolingo as part of the Second Language Acqu...
In this thesis, I show the advantages of using symbolic parsers for Grammatical Error Detection and ...
In this project, common tonal and phonetic pronunciation errors of Mandarin Chinese made by second l...
This thesis aims to build a system to tackle the task of diagnosing the grammatical errors in senten...
The demand for computer-assisted language learning systems that can provide corrective feedback on l...
Computer Assisted Pronunciation Training (CAPT) is becoming more and more popular among language lea...
N-gram language modeling is a crucial component in any speech recognizer since it is exp...
This thesis presents a study in the area of computer assisted language learning systems. The study f...
This paper presents a new approach that uses linguistic knowledge and pronunciation space for automa...
Choosing the most suitable classifier in a linguistic context is a well-known problem in the product...