A short pilot study was conducted to provide recommendations on methods and workflows for extracting geographic references from the text of Biodiversity Heritage Library collections and disambiguating these references. An initial survey of the literature was conducted, and a variety of possible techniques and software were subsequently explored for natural language processing, machine learning, document annotation, and map visualization. A test corpus was evaluated, and preliminary findings identify challenges for a full-scale effort towards automated geoparsing, including varying OCR quality, diversity of the corpus, historical context, and ambiguity of geographic references. The project background, approaches, and preliminary assessment ...
A vast amount of location information exists in unstructured texts, such as social media posts, news...
In order to better support the text mining of historical texts, we propose a combination of compleme...
There are several approaches to extract geo-knowledge from documents and textual fields in databases...
The vast majority of locality descriptions associated with biological specimens housed in natural hi...
Ground-truth datasets are essential for the training and evaluation of any automated algorithm. As s...
The locations at which natural history museum specimens were collected can be visualized using Geogr...
Conference paper exploring three case studies on how methods of retrieval of geographic data offered...
A significant amount of spatial information can be derived from unstructured datasets available in w...
The taxonomic literature is one of the largest resources of information on biodiversity, both curren...
This paper discusses the added value of applying machine learning (ML) to contextually ...
The Museu de Ciències Naturals de Barcelona (MCNB) holds a collection of ca. 130,000 digitally regis...
We report on two JISC-funded projects that aimed to enrich the metadata of digitized historical coll...