Identifying cross-language plagiarism is challenging, especially for distant language pairs and sense-for-sense translations. We introduce the new multilingual retrieval model Cross-Language Ontology-Based Similarity Analysis (CL-OSA) for this task. CL-OSA represents documents as entity vectors obtained from the open knowledge graph Wikidata. Unlike other methods, CL-OSA requires neither computationally expensive machine translation nor pre-training on comparable or parallel corpora. It reliably disambiguates homonyms and scales to allow its application to Web-scale document collections. We show that CL-OSA outperforms state-of-the-art methods for retrieving candidate documents from five large, topically diverse test corpora that incl...
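The idea of comparing documents through language-independent entity vectors can be illustrated with a minimal sketch. It is not the CL-OSA implementation: the toy lexicon, the placeholder entity IDs, the whitespace tokenizer, and the choice of cosine similarity are all assumptions made for illustration, and the homonym disambiguation step mentioned in the abstract is omitted entirely.

```python
import math
from collections import Counter

# Hypothetical toy lexicon mapping surface forms to entity IDs.
# The IDs are illustrative placeholders, not real Wikidata identifiers.
TOY_LEXICON = {
    "en": {"plagiarism": "E_PLAGIARISM", "translation": "E_TRANSLATION", "bank": "E_BANK"},
    "de": {"plagiat": "E_PLAGIARISM", "übersetzung": "E_TRANSLATION", "bank": "E_BANK"},
}


def entity_vector(text: str, lang: str) -> Counter:
    """Map a text to a bag of entity IDs (assumed: naive whitespace tokens)."""
    lexicon = TOY_LEXICON[lang]
    tokens = text.lower().split()
    return Counter(lexicon[t] for t in tokens if t in lexicon)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse entity-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a document with no known entities matches nothing
    return dot / (norm_a * norm_b)


# Documents in different languages share a vector space of entity IDs,
# so no machine translation is needed to compare them.
english = entity_vector("Plagiarism by translation", "en")
german = entity_vector("Plagiat durch Übersetzung", "de")
similarity = cosine(english, german)
```

Because both documents reduce to the same two entity IDs, they end up with identical vectors even though they share no surface words. A realistic system would replace the toy lexicon with entity linking against the knowledge graph.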
A system that recognises cross-lingual plagiarism needs to establish – among other things – whether ...
The existence of vast amounts of multilingual textual data on the internet leads to cross-lingual pl...
Cross-language and semantic plagiarism have recently been on the rise. Many plagiarism detection tools ar...
Plagiarism across languages is on the rise for three reasons: (i) speakers of under-resourced la...
Cross-language plagiarism detection deals with the automatic identification and extraction of plagia...
This is the author’s version of a work that was accepted for publication in Information Processing a...
Corresponding authors: Norman Meuschke, Terry Ruas Venue: 2nd Workshop on Extraction and Evaluation ...
The cross-language variant of automatic plagiarism detection aims to detect plagiarism between document...
Plagiarism, the unacknowledged reuse of text, does not end at language boundaries. Cross-language pl...
This paper is a deep investigation of cross-language plagiarism detection meth...
Cross-language plagiarism detection attempts to automatically identify and extract plagiarism among ...
The automatic detection of plagiarism is a task that has acquired relevance in the Information Retri...
Generally, utterances in natural language are highly ambiguous, and a unique interpretation can usual...
Translated or cross-lingual plagiarism is defined as the translation of someone else’s work or words...