Metadata for English-language fiction in HathiTrust Digital Library, after 1922. These volumes were identified as fiction algorithmically, using a predictive model trained on text, supplemented by metadata. Algorithmic prediction is imperfect, and this dataset contains errors that the author has not yet had time to fully measure and document. (Measuring recall, for instance, is not trivial.) The data is offered by the author without any promise or warranty. Use it if you find that it is, in practice, better than other alternatives; stop using it as soon as a better alternative becomes available.
This thesis examines what automatic indexing and genre classification may bring to fiction. The thes...
A zipped folder of files keyed to HathiTrust volume IDs, each representing a volume of English-langu...
In this paper, we describe a BERT model trained on the Eighteenth Century Collections Online (ECCO) ...
This workset is data in support of the article "Mapping Mutable Genres in Structurally Complex Volum...
A topic model of 29,341 volumes of fiction, written in English and published between 1880 and 1999. ...
Metadata about 210,266 volumes identified as English-language fiction in HathiTrust Digital Library,...
In large digital libraries, such as the HathiTrust, metadata is insufficient to identify items of in...
This report describes a collection of 210,305 volumes of fiction that researchers are encouraged to ...
Using regularized logistic regression and hidden Markov models, we predict genre at the page level i...
Metadata for 774 works of fiction referenced in "Mapping Mutable Genres in Structurally Complex Volu...
Model description: This model is intended to predict, from the title of a book, whether it is 'ficti...
People read for various purposes like learning specific skills, acquiring foreign languages, and enj...
Fiction has come to play an essential part in human culture and life in recent centuries. Because of...
In a previous post, we introduced a dataset of Dutch novels with textual features and metadata. In t...