Introduction

Wikipedia is written in the wikitext markup language. When serving content, the MediaWiki software that powers Wikipedia parses wikitext to HTML, thereby inserting additional content by expanding macros (templates and modules). Hence, researchers who intend to analyze Wikipedia as seen by its readers should work with HTML, rather than wikitext. Since Wikipedia’s revision history is made publicly available by the Wikimedia Foundation exclusively in wikitext format, researchers have had to produce HTML themselves, typically by using Wikipedia’s REST API for ad-hoc wikitext-to-HTML parsing. This approach, however, (1) does not scale to very large amounts of data and (2) does not correctly expand macros in historical article revis...
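The ad-hoc parsing approach mentioned above can be sketched roughly as follows. This is a minimal illustration, not the method of any particular study: the endpoint path `transform/wikitext/to/html`, the JSON body with a `wikitext` field, and the helper names `build_transform_request` and `wikitext_to_html` are assumptions made for this sketch and should be checked against the current Wikimedia REST API documentation before use.

```python
import json
import urllib.request

# Assumed endpoint path; verify against the current Wikimedia REST API docs.
TRANSFORM_URL = "https://en.wikipedia.org/api/rest_v1/transform/wikitext/to/html"


def build_transform_request(wikitext, user_agent="research-sketch/0.1 (hypothetical)"):
    """Build, but do not send, a POST request asking the API to parse wikitext."""
    # Assumption: the endpoint accepts a JSON object with a "wikitext" field.
    body = json.dumps({"wikitext": wikitext}).encode("utf-8")
    return urllib.request.Request(
        TRANSFORM_URL,
        data=body,
        headers={"Content-Type": "application/json", "User-Agent": user_agent},
        method="POST",
    )


def wikitext_to_html(wikitext, timeout=30.0):
    """Send one snippet of wikitext over the network and return the HTML response."""
    request = build_transform_request(wikitext)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Each call to `wikitext_to_html` parses a single snippet with one network round trip, which makes the scaling limitation named above concrete: converting a full revision history this way would require one request per revision.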
This contains data and software for the following paper: Hill, Benjamin Mako and Shaw, Aaron. ...
DBpedia is a huge dataset essentially extracted from the content and structure...
Wikipedia (wiki + encyclopaedia) is a multilingual, web-based encyclopaedia with free content. It is...
Wikis are popular tools commonly used to support distributed collaborative work. Wikis can be seen as...
We present an open-source toolkit which allows (i) to reconstruct past states of Wikipedia, and (ii)...
This dataset includes the historical versions of all individual references per article in the Englis...
Wikipedia, the popular online encyclopedia, has in just six years grown from an adjunct to the now-d...
Wikipedia revision metadata for every edit to every page in seven major language versions of Wikiped...
Wikipedia, an international project that uses Wiki software to collaboratively create an encyclopaed...
The "Wikipedia Edit Event Data 2021 (WikiEvent.2021)" gives the time, user name, and article title o...
The aim of this study is to show how Wikipedia establishes a public and digital space, where users p...
Abstract. Much of work in semantic web relying on Wikipedia as the main source of knowledge often wo...
The "Wikipedia Edit Event Data 2018 (WikiEvent.2018)" gives the time, user name, and article title o...
Wikipedia’s first twenty years: how what began as an experiment in collaboration became the world’s ...