Program understanding aims at discovering human-readable properties of a software project from the analysis of its source code. Recently, we proposed an approach based on hierarchical agglomerative clustering that extracts so-called program topoi from source code. These topoi are high-level observable properties of the project. Based on textual and structural representations of the source code, our multi-step approach clusters program topoi effectively and efficiently. In this paper, we describe novel exploitation tasks of this program understanding approach and report on its application to Software Heritage. Software Heritage is an ambitious project which aims at collecting and archiving the biggest corpus o...
The development of large-scale open source projects involves many distinct developers...
Perhaps the most important aspect in maintaining software legacy systems is understanding ...
Grantor: University of Toronto. A common problem that the software industry has to face is t...
Understanding source code of large open-source software projects is very challenging when there is o...
During the development of long lifespan software systems, specification documents can become outdate...
Large repositories of source code create new challenges and opportunities for statistical machine le...
Technical debt at the architectural level is a severe threat to software development projects. Uncon...
Large repositories of source code create new challenges and opportunities for statist...
We are interested in identifying the domain expertise of developers of a software system. A develope...
Software repositories contain a vast wealth of information about software development. Mining these ...
Large codebases are routinely indexed by standard Information Retrieval systems, starting from the a...
In the era of big data, information retrieval becomes even more challenging since the size of data v...
Many approaches have been developed to comprehend software source code, most of them focusing on ...
This thesis examines the application of document classification techniques to collections of source ...
Becoming increasingly complex, software development relies heavily on the reuse of existing librarie...