We report on work to develop a web environment that lets users search more than 1 trillion tokens of text, down to the page level, in the HathiTrust Part-of-Speech Extracted Features Dataset, to help produce worksets for scholarly analysis. We present an extended example of the web environment in use, along with details of its implementation.
We describe a novel approach to precise searching in the full content of digital libraries. The Sear...
This work describes the process of creation of a 70 billion word text corpus of English. We used an ...
We introduce a Web-scale linguistics search engine, Linggle, that retrieves lexical bundles in respo...
Consortial collections have led to unprecedented scales of digitized corpora, but the insights that ...
This article details a practical technique that safely reconciles the production stability and integ...
Due to the inherent difficulty of processing noisy text, the potential of the Web as a decentralized...
The Web bears the potential of being the world’s greatest encyclopedic source, but we are far from f...
This article discusses the creation and unique implementation of a browser-based search tool at Stev...
The web is a potentially useful corpus for language study because it provides examples of language t...
The web is the largest amount of text ever available to man, and the search engines have classified...
PDF of a PowerPoint presentation from TPDL 2013: 17th International Conference on Theory and Practic...
Proposes to directly prompt reputed database use over search engines by means of a 3D self-test met...
The HathiTrust Digital Library (HTDL) was founded in 2008 with just over 2 million volumes in the co...