This dataset contains the SQL tables of the training and test datasets used in our experiments. These tables contain the preprocessed textual data (in the form of tokens) extracted from each training and test project. Besides the preprocessed textual data, the dataset also contains metadata about the projects, GitHub topics, and GitHub collections. Each GitHub project is identified by the tuple (“Owner”, “Name”). The descriptions of the table fields are attached to their respective data descriptions.
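Because projects are identified by the pair “Owner” and “Name” rather than by a single ID, lookups and joins need to match on both fields. The following is a minimal sketch of that idea, not the dataset's actual schema: the table names `projects` and `tokens`, the columns `topics` and `token`, the sample rows, and the use of SQLite are all assumptions made for illustration. Only the “Owner” and “Name” fields come from the description above.

```python
import sqlite3

# Hypothetical in-memory schema. Only the Owner/Name pair is taken from the
# dataset description; every other table and column name is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE projects (
        Owner  TEXT NOT NULL,
        Name   TEXT NOT NULL,
        topics TEXT,
        PRIMARY KEY (Owner, Name)  -- composite key: neither field is unique alone
    );
    CREATE TABLE tokens (
        Owner TEXT NOT NULL,
        Name  TEXT NOT NULL,
        token TEXT NOT NULL
    );
    """
)

# Made-up sample rows, for illustration only.
conn.execute("INSERT INTO projects VALUES (?, ?, ?)", ("alice", "demo", "parsing"))
conn.executemany(
    "INSERT INTO tokens VALUES (?, ?, ?)",
    [
        ("alice", "demo", "parser"),
        ("alice", "demo", "lexer"),
        ("bob", "demo", "render"),  # same Name, different Owner
    ],
)


def tokens_for(connection, owner, name):
    """Return the sorted tokens of one project, matched on (Owner, Name)."""
    rows = connection.execute(
        """
        SELECT t.token
        FROM tokens AS t
        JOIN projects AS p
          ON t.Owner = p.Owner AND t.Name = p.Name
        WHERE p.Owner = ? AND p.Name = ?
        ORDER BY t.token
        """,
        (owner, name),
    ).fetchall()
    return [row[0] for row in rows]


result = tokens_for(conn, "alice", "demo")
print(result)
```

Joining on “Name” alone would mix the tokens of `alice/demo` and `bob/demo` in this sketch, which is why both fields appear in the join condition.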
This dataset contains 330 code reviews with 40 tips, 16 request categories, 8 response categories, a...
This replication package contains datasets and scripts to replicate the results obtained in the paper...
This dataset, collected from Stack Overflow (SO) and GitHub Discussions, was used to conduct an empiri...
This dataset contains the SQL tables of the training and test datasets used in our experimentation. ...
This dataset contains the SQL tables of the training and test datasets used in our experimentation. ...
This dataset contains the scripts and dataset used in the study reported at Unveiling the Technical ...
This dataset contains the scripts and dataset used in the study reported at Mining the Technical Rol...
A hypergraph dataset mined from the GHTorrent project is presented. The dataset contains two files ...
Note: the entire GitTables corpus is here. Visit https://gittables.github.io for more background and...
These are the data files for the paper "What Helps a New GitHub Project Achieve Sustained Activity?"
This dataset comprises the raw data that we used for analyzing the automotive software landscape ...
These are the raw datasets used in the paper, unzipped in the same directory as the code and use the...
A collection of GitHub-Copilot-generated Python solutions and their translations to Dafny with verif...
Resulting pseudonymized classification data of the study "Automatic Core-Developer Identification on...
This is the replication package for creating a dataset of GitHub projects that are copies of other. ...