The data files available here (68GB uncompressed) were used to study the evolution of code at the level of fine-grained elements. The data come from processing 89 open source software repositories hosted on GitHub. Details on each GitHub project are stored in the repos folder, in directories matching the owner and project name used on GitHub. For example, the files under repos/KDE/kdevelop correspond to the project hosted at https://github.com/KDE/kdevelop. Data from the statistical analysis of the processed repositories are stored in the statistical-analysis folder. The file project_details.txt contains the data used for selecting the processed projects.
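The directory convention described above can be turned back into GitHub URLs with a few lines of code. The following is a minimal sketch, assuming only the repos/<owner>/<project> layout stated in the description; the function names github_url and list_projects are hypothetical helpers and are not part of the dataset.

```python
from pathlib import Path, PurePosixPath


def github_url(repo_dir: str) -> str:
    """Map a dataset path such as repos/KDE/kdevelop to its GitHub URL.

    Assumes the layout repos/<owner>/<project>[/...] described above.
    """
    parts = PurePosixPath(repo_dir).parts
    if len(parts) < 3 or parts[0] != "repos":
        raise ValueError(f"expected repos/<owner>/<project>, got {repo_dir!r}")
    owner, project = parts[1], parts[2]
    return f"https://github.com/{owner}/{project}"


def list_projects(dataset_root: str) -> list[str]:
    """Return the GitHub URL of every owner/project directory under repos/.

    dataset_root is a hypothetical local path to the uncompressed data.
    """
    repos = Path(dataset_root) / "repos"
    urls = []
    for owner in sorted(p for p in repos.iterdir() if p.is_dir()):
        for project in sorted(p for p in owner.iterdir() if p.is_dir()):
            urls.append(github_url(f"repos/{owner.name}/{project.name}"))
    return urls


if __name__ == "__main__":
    print(github_url("repos/KDE/kdevelop"))
```

Calling list_projects on the directory where the archive was unpacked should yield one URL per processed repository, which can serve as a quick check that the extraction is complete.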
Software analysis and its diachronic sibling, software evolution analysis, rely heavily on data comp...
Research software is vital for academia, yet reliable figures are rare. In an attempt to better unde...
We conduct a comprehensive study of file-system code evolution. By analyzing eight years of Linux fi...
The data files available here (70GB uncompressed) have been used for studying the evolution of code ...
This is the Debsources Dataset: source code and related metadata spanning two decades of Free and Op...
Libre software projects offer abundant information about themselves in publicly available storages (...
Software evolution and maintenance is largely based on data gathered through years of experience: un...
A model regarding the lifetime of individual source code lines or tokens can estimate maintenance ef...
Current software systems contain increasingly more elements that have not usually been considered in...
This dataset consists of a number of GitHub repositories that cover the following programming langua...
The topic of this thesis is the analysis of the evolution of software components. In order to track ...
We introduce a large-scale dataset of the complete texts of free/open source software (FOSS) license...
The code and build artifacts are a compilation of source code projects and their related build outpu...
This dataset provides the code and the data sets used in the PhD thesis "Identification of Software ...
Software development is rapidly changing and software systems are increasing in size and expected li...