Objective: We aim to identify duplicate pairs of Medline citations, particularly when the documents are not identical but contain similar information. Materials and methods: Duplicate pairs of citations are identified by comparing word n-grams in pairs of documents. N-grams are modified using two approaches which take account of the fact that the document may have been altered. These are: (1) deletion, an item in the n-gram is removed; and (2) substitution, an item in the n-gram is substituted with a similar term obtained from the Unified Medical Language System Metathesaurus. N-grams are also weighted using a score derived from a language model. Evaluation is carried out using a set of 520 Medline citation pairs, including a set of 260 manua...
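The n-gram comparison with deletion and substitution described in the abstract above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `synonyms` dictionary is a hypothetical stand-in for a lookup in the Unified Medical Language System Metathesaurus, the Jaccard overlap is an assumed choice of similarity score, and the language-model weighting is omitted entirely.

```python
def word_ngrams(text, n=3):
    """Return the set of word n-grams (as tuples) in a lowercased text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def with_deletions(ngram):
    """Variants of an n-gram with exactly one item removed."""
    return {ngram[:i] + ngram[i + 1:] for i in range(len(ngram))}


def with_substitutions(ngram, synonyms):
    """Variants of an n-gram with one item swapped for a similar term.

    `synonyms` is a hypothetical dict standing in for a thesaurus lookup.
    """
    variants = set()
    for i, token in enumerate(ngram):
        for alternative in synonyms.get(token, ()):
            variants.add(ngram[:i] + (alternative,) + ngram[i + 1:])
    return variants


def expanded_ngrams(text, synonyms, n=3):
    """Original n-grams plus their deletion and substitution variants."""
    grams = word_ngrams(text, n)
    expanded = set(grams)
    for gram in grams:
        expanded |= with_deletions(gram)
        expanded |= with_substitutions(gram, synonyms)
    return expanded


def overlap_score(doc_a, doc_b, synonyms, n=3):
    """Jaccard overlap of the expanded n-gram sets (an assumed scoring choice)."""
    a = expanded_ngrams(doc_a, synonyms, n)
    b = expanded_ngrams(doc_b, synonyms, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Two citations that differ only in one similar term share no exact trigrams, yet their expanded sets overlap, which is the effect the deletion and substitution steps are meant to achieve. A pair would be flagged as a candidate duplicate when the score exceeds a threshold chosen on held-out data.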
Introduction: As MEDLINE indexers tag similar articles as duplicates even when journals have not add...
Part 1: Conference. Near duplicate documents and their detection are studied to ...
247 p. Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 1984. An attempt was made to see ho...
Motivation: Duplicate publication impacts the quality of the scientific corpus, has been difficult t...
Motivation: Document similarity metrics such as PubMed’s “Find related articles” feature, which hav...
Motivation: Duplicate publication impacts the quality of the scientific corpus, has been difficult ...
Objective: To automatically detect duplicate citations in a bibliographical database. Background: Ci...
Computational methods have been used to find duplicate biomedical publications in MEDLINE. Full text...
Computational methods have been used to find duplicate biomedical publications in MEDLINE. Full text...
Computational methods have been used to find duplicate biomedical publications in MEDLINE. Full text...
Finding duplicates is an important phase of systematic review. However, no consensus regarding the m...
Background: Finding duplicates is an important phase of systematic review. However, no consensus reg...
In recent years, the Web of Science Core Collection and Scopus databases have become primary sources...
Background: Finding duplicates is an important phase of systematic review. However, no ...
The scheme includes three main steps. First, all literature retrieved from different databas...