Summary: GitTables (https://gittables.github.io) is a corpus of currently 1.7M relational tables extracted from CSV files on GitHub; we aim to grow this to at least 10M tables. Each file in this corpus represents a table with the original content (e.g. values and header) as extracted from the corresponding CSV file. Table columns were also annotated with >2K semantic types from Schema.org and DBpedia (provided separately). These column annotations include, for example, semantic types, hierarchical relations to other types, and descriptions. We believe GitTables can facilitate many use cases, among them: data integration, search and validation; data visualization and analysis recommendation; schema analysis and completion for e.g. ...
Understanding the semantics of table elements is a prerequisite for many data integration and data d...
This dataset contains the scripts and dataset used in the study reported at Mining the Technical Rol...
HTML tables on web pages ("web tables") have been used successfully as a data source for several app...
Summary: GitTables 1M (https://gittables.github.io) is a corpus of currently 1M relational tables ex...
Note: the entire GitTables corpus is here. Visit https://gittables.github.io for more background and...
Note: the download page of the entire GitTables corpus is here: https://zenodo.org/record/4943312. ...
This is an old version. The correct GitTables 1.7M corpus can be found here: https://zenodo.org/rec...
This dataset contains >800K CSV files behind the GitTables 1M corpus. For more information about th...
Data sets used for experimental evaluation in the related publication: Matching Web Tables with Kno...
This dataset contains the scripts and dataset used in the study reported at Unveiling the Technical ...
The Socio-Economic Panel (SOEP) relies heavily on a GitLab server for its data management and docume...
This dataset contains the SQL tables of the training and test datasets used in our experimentation. ...
This dataset comprises the raw data that we used for analyzing the automotive software landscape ...
Collecting and refining research data or writing software is a part of many researchers' daily routi...
This dataset contains the SQL tables of the training and test datasets used in our experimentation. ...