This workset is data in support of the article "Mapping Mutable Genres in Structurally Complex Volumes," http://arxiv.org/abs/1309.3323. It is a .tsv file containing 32,209 lines, each of which corresponds to a volume in HathiTrust Digital Library. The first column contains volume identifiers keyed to that library. The next three columns hold probabilities generated by naive Bayes classification: the probability that the volume is written in first person, the probability that it is fiction according to a classifier trained on 18c texts, and the probability that it is fiction according to a classifier trained on 19c texts. The remaining columns hold metadata extracted from MARC records provided by HathiTrust. Many metadata fields are blank because the information was not available. This ...
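The column layout described above could be read along the following lines. This is only a minimal sketch under stated assumptions: it assumes the file has no header row, that the three probability columns appear in the order listed in the description, and that blank fields are empty strings. The function name, the dictionary keys, and the sample rows are invented for illustration and do not come from the workset.

```python
import csv
import io


def _prob(text):
    """Convert a probability field to float, or None if the field is blank."""
    text = text.strip()
    return float(text) if text else None


def read_workset(handle):
    """Yield one dict per volume from a tab-separated file-like object.

    Assumed layout (hypothetical key names): column 0 is the volume ID,
    columns 1-3 are the three naive Bayes probabilities, and all remaining
    columns are MARC-derived metadata fields, which may be blank.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue  # skip completely empty lines
        # Pad short rows so the first four columns can always be unpacked.
        padded = row + [""] * max(0, 4 - len(row))
        yield {
            "volume_id": padded[0],
            "p_first_person": _prob(padded[1]),
            "p_fiction_18c": _prob(padded[2]),
            "p_fiction_19c": _prob(padded[3]),
            # Blank metadata fields are represented as None.
            "metadata": [field if field else None for field in padded[4:]],
        }


# Two invented rows, used only to exercise the parser.
SAMPLE = (
    "example.0001\t0.91\t0.12\t0.08\tAn Invented Title\t\t1795\n"
    "example.0002\t0.05\t0.77\t\t\t\t\n"
)

rows = list(read_workset(io.StringIO(SAMPLE)))
```

For the real file, `read_workset(open(path, encoding="utf-8", newline=""))` would be the analogous call, where `path` is wherever the .tsv has been saved. Mapping blanks to `None` keeps missing metadata distinct from empty text in later filtering.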
Large digital collections offer new avenues of exploration for literary scholars. But their potentia...
The collection "Fiction littéraire de Gallica" includes 19,240 public domain documents from the digi...
An intriguing new opportunity for research into the nineteenth-century history of print culture, lib...
This workset is data in support of the article "Mapping Mutable Genres in Structurally Complex Volum...
A topic model of 29,341 volumes of fiction, written in English and published between 1880 and 1999. ...
Metadata for English-language fiction in HathiTrust Digital Library, after 1922. These volumes were ...
Using regularized logistic regression and hidden Markov models, we predict genre at the page level i...
This report describes a collection of 210,305 volumes of fiction that researchers are encouraged to ...
Metadata for 774 works of fiction referenced in "Mapping Mutable Genres in Structurally Complex Volu...
Metadata about 210,266 volumes identified as English-language fiction in HathiTrust Digital Library,...
Abstract: To mine large digital libraries in humanistically meaningful ways, we need to divide them b...
In large digital libraries, such as the HathiTrust, metadata is insufficient to identify items of in...
The HathiTrust (HT) digital library comprises 4 billion pages (composing 11 million volumes). The Ha...
A zipped folder of files keyed to HathiTrust volume IDs, each representing a volume of English-langu...
The HathiTrust Digital Library (HTDL) was founded in 2008 with just over 2 million volumes in the co...