In an investigation of the use of a novelty detection algorithm for identifying inappropriate word combinations in a raw English corpus, we employ an unsupervised detection algorithm based on one-class support vector machines (OC-SVMs) and extract sentences containing word sequences whose frequency of appearance is significantly low in native English writing. Combined with n-gram language models and document categorization techniques, the OC-SVM classifier assigns given sentences to two groups: sentences containing errors and those without errors. Accuracies are 79.30 % with the bigram model, 86.63 % with the trigram model, and 3...
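The core idea above, flagging sentences that contain word sequences rarely seen in native writing, can be illustrated with a minimal sketch. This is not the OC-SVM classifier from the abstract: a plain bigram count threshold stands in for it, and the function names, the toy reference corpus, and the `min_count` parameter are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams (as tuples) found in a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def train_counts(reference_sentences, n=2):
    """Count n-grams over a reference corpus (stand-in for native English text)."""
    counts = Counter()
    for sentence in reference_sentences:
        counts.update(ngrams(sentence.lower().split(), n))
    return counts


def is_suspicious(sentence, counts, n=2, min_count=1):
    """Flag a sentence if any of its n-grams occurs fewer than min_count
    times in the reference counts. A simple threshold replaces the OC-SVM
    decision function here (assumption, for illustration only)."""
    return any(counts[g] < min_count
               for g in ngrams(sentence.lower().split(), n))


# Toy reference corpus (hypothetical data).
reference = [
    "she has a strong opinion",
    "he has a strong opinion",
    "she drinks strong coffee",
]
counts = train_counts(reference)

seen_only = is_suspicious("she has a strong coffee", counts)
has_unseen = is_suspicious("she drinks powerful coffee", counts)
```

Here `seen_only` is `False` because every bigram of the first sentence occurs in the reference, while `has_unseen` is `True` because the bigram "drinks powerful" never does. A real system along the lines of the abstract would replace the threshold with a one-class model trained on n-gram features of native text.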
Statistical machine translation has made tremendous progress over the past ten years. The output of ...
Tremendous research effort has gone into the field of natural language processing and understanding ...
This research investigates the collocational errors made by English learners i...
In this paper we investigate the use of linguistic information given by langua...
In this thesis, we investigate methods for automatic detection, and to some extent correction, of gr...
A better understanding of word classification could lead to a better detection and correction techni...
Learning a foreign language requires much practice outside of the classroom. Computer-assisted langu...
The Constituent Likelihood Automatic Word-tagging System (CLAWS) was originally designed for the low...
While the corpus-based research relies on human annotated corpora, it is often said that a non-negli...
This work focuses on designing a grammar detection system that understands both structural and conte...
Spell checking is the process of finding misspelled words and possibly correcting them. Most of the ...
Grammatical error correction, like other machine learning tasks, greatly benefits from large quant...
Shortage of available training data is holding back progress in the area of automated error detectio...
This paper explores the issue of automatically generated ungrammatical data and its use in error det...
Over the last several decades, the number of electronic documents has increased dramatically. With t...