The dataset consists of the English-language relational Web tables in the WDC Web Table Corpus 2012. It includes 91,815,190 tables out of the 147 million Web tables in the overall corpus.
A product data corpus containing over 5.6 million product records retrieved from the most visited 32...
The Paderborn Genre Analysis 2012 corpus (PaGA-12) contains 1,639 HTML documents of 26 genres. All d...
Millions of websites have started to annotate structured data within their HTML pages using the sche...
The dataset consists of relational Web tables in the WDC Web Table Corpus 2015 that are in English. ...
The subset consists of 147 million relational tables. In relational tables, a set of entities is des...
The dataset consists of 90 million tables out of the 233 million Web tables in the corpus. In relati...
This dataset contains tables from the WDC Web Table Corpus 2015 that can be described as entity and ...
We have built a corpus containing texts in 106 languages from texts available on the Internet and on...
A set of corpora for 120 languages automatically collected from Wikipedia and the web. Collected ...
In recent years, researchers have recognized relational tables on the Web as an important source of ...
WebIsADb is a publicly available database containing more than 400 million hypernymy relations we ex...
WikiDBs (https://wikidbs.github.io/) is a corpus of relational databases built from Wikidata (https:...
The World-Wide Web consists of a huge number of unstructured documents, but it also contains struct...
The Web contains millions of relational HTML tables, which cover a multitude of different, often ver...
Hakimov S, Ell B, Kaupmann F, et al. Research Data - Towards a Large Corpus of Richly Annotated Web ...