Similar code may exist in large software projects due to some common software engineering practices, such as copying and pasting code and n-version programming. Although previous work has studied syntactic equivalence and small-scale, coarse-grained program-level and function-level semantic equivalence, it is not known whether significant fine-grained, code-level semantic duplications exist. Detecting such semantic equivalence is also desirable because it can enable many applications such as code understanding, maintenance, and optimization. In this paper, we introduce the first algorithm to automatically mine functionally equivalent code fragments of arbitrary size, down to an executable statement. Our notion of functional equivalence ...
Identifying similar code in software systems can assist many software engineering tasks, including p...
The high availability of a huge number of documents on the Web makes plagiaris...
Several techniques have been developed for identifying similar code fragments in programs. These sim...
Duplication is detected by comparing features of source fragments. The main problem for the detectio...
Identifying syntactically or functionally similar code fragments in source code is an important res...
Matching function binaries, the process of identifying similar functions among binary executables, ...
Debugging symbols in binary executables carry the names of functions and global variables. When pres...
Software bugs are a reality of programming. They can be difficult to identify and resolve, even for...
Existing code similarity comparison methods, whether source or binary code based, are mostly not res...
The detection of similarities in source code has applications not only in soft...
Classic clone detection approaches are hardly capable of finding redundant code that has be...
Plagiarism detection and clone refactoring in software depend on one common concern: finding similar s...
In this paper we investigate the potential benefits of Latent Dirichlet Allocation (LDA) as a techni...