Taxonomic literature keeps records of the planet's biodiversity and gives access to the knowledge needed for research and sustainable management. The number of publications generated is quite large: the corpus of biodiversity literature includes tens of millions of figures and taxonomic treatments. Unfortunately, most of the taxonomic descriptions are from scientific publications in text format. With more than 61 million digitized pages in the Biodiversity Heritage Library (BHL), only 467,265 taxonomic treatments are available in the Biodiversity Literature Repository. To obtain highly structured texts from digitized text has been shown to be complex and very expensive (Cui et al. 2021). The scientific community has described over 1.2 milli...