This report describes a collection of 210,305 volumes of fiction that researchers are encouraged to borrow for their own work. Alternatively, readers can simply browse the report as a description of English-language fiction in HathiTrust Digital Library. For instance, how does the proportion of fiction written by British authors or by women change across time? We also divide nineteenth- and twentieth-century fiction into seven subsets with different emphases (for instance, one where men and women are represented equally, and one composed of only the most prominent and widely held books). Comparing the pictures produced by these different samples allows us to assess the fragility of recent quantitative arguments about literary history. Prepri...
Of the novelties introduced by digitization in the study of literature, the size of the archive is p...
Among the nearly 100 selections in a prominent anthology of literary journalism—stories and excerpts...
This essay describes how using unsupervised topic modeling (specifically the latent Dirichlet alloca...
A topic model of 29,341 volumes of fiction, written in English and published between 1880 and 1999. ...
This workset is data in support of the article "Mapping Mutable Genres in Structurally Complex Volum...
This dissertation considers the “problem of abundance” in literary studies, or the fact that the arc...
This project investigates the case of a prominent bibliography and dataset of American fiction, the ...
Numbers appear to have limited value for literary study, since our discipline is usually more concer...
A history of literary prestige needs to study both works that achieved distinction and the mass of v...
Recent developments in the polyhedric field of Digital Humanities offer a desirable perspective for ...
Some literary scholars have claimed that predictive models can measure the strength of the boundarie...
Large digital collections offer new avenues of exploration for literary scholars. But their potentia...
Metadata for English-language fiction in HathiTrust Digital Library, after 1922. These volumes were ...
External factors such as author gender, author nationality, and date of publication affect both the ...
Why is one novel still read, while another is forgotten? Literary scholars answer that the first is ...