The Web is increasingly used as a source of content for datasets of various types, especially multimedia content. These datasets are then often distributed as collections of URLs pointing to the original sources of their elements. As these sources go offline over time, the datasets decay through link rot. In this paper, we analyze 24 Web-sourced datasets with a combined total of over 270 million URLs and find that over 20% of the content is no longer available. We discuss the adverse effects of this decay on the reproducibility of work based on such data and make some recommendations on how they could be mitigated in the future.
When a website is suddenly lost without a backup, it may be reconstituted by probing web archives an...
The emergence of the web has fundamentally affected most aspects of information communication, inclu...
The content at the end of any hyperlink is subject to two phenomena: the link may break (Link Rot) o...
The failure of a web address to link to the appropriate online source is a significant problem facin...
From the earliest days of the web, users have been aware of the fickleness of linking to content. In...
In this study we offer a preliminary typology of types of link ephemerality that can occur and affec...
Slides from a public class offered at NYPL Research Libraries, September 2018. Have you ever cited an ...
Links are an essential feature of the World Wide Web, and source code repositories are no exception....
All text is ephemeral. Some texts are more ephemeral than others. The web has proved to be among the...
Increases in the use of web data for corpus-building, coupled with the use of specialist, single-use...
In the era of ‘born digital’ ETDs, librarians and institutional repository curators need to reframe ...
Ms. Rhodes explores URL stability, measured by the prevalence of link rot over a three-year period, ...
The usefulness and usability of data on the Semantic Web is ultimately reliant on the ability of cli...