The development of models for automatic detection of text re-use and plagiarism across languages has received increasing attention in recent years. However, the lack of an evaluation framework composed of annotated datasets has caused these efforts to be isolated. In this paper we present the CL!TR 2011 corpus, the first manually created corpus for the analysis of cross-language text re-use between English and Hindi. The corpus was used during the Cross-Language !ndian Text Re-Use Detection Competition. Here we overview the approaches applied by the contestants and evaluate their quality when detecting a re-used text together with its source.
Peer Reviewed
Language identification (LI) in textual documents is the process of automatically detecting the lang...
Plagiarism is an act of literature fraud, which is presenting others’ work or ideas without giving c...
The pervasiveness of offensive content in social media has become an important reason for concern fo...
The development of models for automatic detection of text re-use and plagiarism across languages has...
Text reuse occurs when one borrows the text (either verbatim or paraphrased) from an earlier written...
Cross-lingual plagiarism occurs when the source (or original) text(s) is in one language and the pla...
Text reuse is the act of borrowing text from existing documents to create new texts. Freely availabl...
Text reuse is becoming a serious issue in many fields and research shows that it is much harder to d...
Text reuse is the act of borrowing text (either verbatim or paraphrased) from an earlier written tex...
The evaluation dataset for the cross-lingual text reuse detection task.The dataset was prepared for ...
In recent years, the problem of Cross-Lingual Text Reuse Detection (CLTRD) has gained the interest o...
Plagiarism, the unacknowledged reuse of text, does not end at language boundaries. Cross-language pl...
Linguistic code switching (LCS) occurs when speakers mix multiple languages in the same speech utter...
Three reasons make plagiarism across languages to be on the rise: (i) speakers of under-resourced la...