Since the late 1990s, the National Library of Finland (NLF) has digitized historical newspapers, journals and ephemera published in Finland. The present collection consists of about 16.51 million pages, mainly in Finnish and Swedish. Of these, about 7.64 million pages are freely available on the web site https://digi.kansalliskirjasto.fi/etusivu. The copyright-restricted part of the collection can be used at six legal deposit libraries in different parts of Finland. The open collection covers the years 1771 to 1929. The last nine years, 1921–1929, were opened in January 2018. This paper briefly presents the ground truth Optical Character Recognition data of about 500 000 words that has been compiled at the NLF for development ...
The dataset comprises Swedish newspaper pages from the late 18th to the early 20th century with carefully ...
Effects of Optical Character Recognition (OCR) quality on historical information retrieval have so f...
GT4HistOCR contains ground truth for research in Optical Character Recognition (OCR) technology appl...
The National Library of Finland (NLF) has digitized historical newspapers, journals and ephemera pub...
The National Library of Finland (NLF) has digitized historical newspapers, journals and ephemera pub...
This paper presents experiments on Optical Character Recognition (OCR) as a combination of Ocropy so...
The National Library of Finland (NLF) has digitized historical newspapers, journals and ephemera pub...
The dataset comprises Finnish newspaper pages from the late 18th to the early 20th century with carefully ...
The data package contains materials for training or fine-tuning OCR models to work with fonts and la...
In this paper we describe a method for improving the optical character recognition (OCR) toolkit Tes...
We present an OCR ground truth data set for historical prints and show improvement o...
This dataset contains 50 pages of ground truth data for digitized historical newspapers from the Ber...
We present an OCR ground truth data set for historical prints and show improve...
Recent advances in Optical Character Recognition (OCR) and Handwritten Text Recognition (HTR) have l...
Over the past years, considerable effort has been put into digitising library collections. As part o...