Building a high-quality optical character recognition (OCR) system requires a large amount of labeled data, and creating that much labeled data is costly. This thesis focuses on several methods that make efficient use of unlabeled data when training an OCR neural network. The proposed methods are self-training algorithms and share one general approach. First, a seed model is trained on a limited amount of labeled data. Then, the seed model, in combination with a language model, is used to produce pseudo-labels for the unlabeled data. The machine-labeled data are then combined with the training data used for the creation of t...
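The self-training steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation. The names `self_train`, `train_lookup_model`, and `min_confidence` are hypothetical. The confidence filter and the final retraining on the combined data are assumptions, since the abstract is cut off before it says how the combined data are used. The language model is assumed to be folded into the model's prediction callable.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[str, str]                      # (input id, transcription)
Model = Callable[[str], Tuple[str, float]]    # input -> (text, confidence)


def self_train(
    train: Callable[[Sequence[Sample]], Model],
    labeled: Sequence[Sample],
    unlabeled: Sequence[str],
    min_confidence: float = 0.0,
) -> Tuple[Model, List[Sample]]:
    """Hypothetical self-training loop: seed model, pseudo-labels, retrain."""
    # Step 1: train the seed model on the limited labeled set.
    seed_model = train(labeled)

    # Step 2: pseudo-label the unlabeled data with the seed model.
    # Filtering by confidence is an assumption, not stated in the abstract.
    pseudo: List[Sample] = []
    for item in unlabeled:
        text, confidence = seed_model(item)
        if confidence >= min_confidence:
            pseudo.append((item, text))

    # Step 3: combine machine-labeled and human-labeled data and retrain
    # (assumed; the source text is truncated at this point).
    final_model = train(list(labeled) + pseudo)
    return final_model, pseudo


def train_lookup_model(samples: Sequence[Sample]) -> Model:
    """Toy stand-in for OCR training: memorizes samples, otherwise
    falls back to the most frequent label with low confidence."""
    table = dict(samples)
    fallback = Counter(label for _, label in samples).most_common(1)[0][0]

    def predict(item: str) -> Tuple[str, float]:
        if item in table:
            return table[item], 1.0
        return fallback, 0.5

    return predict
```

In practice `train` would wrap the training of an OCR network, and the prediction callable would decode with the language model. The toy lookup model only exists so that the control flow can be run end to end.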
Although OCR (Optical Character Recognition) is a topic which has been a subject of research since t...
Optical Character Recognition (OCR) is the extraction of textual data from scanned text documents to fa...
Machine learning algorithms need a lot of data, both for training and for testing. However, not alw...
This paper presents a novel approach for optical character recognition (OCR) on acceleration and to ...
Character recognition is the process of enabling computers to classify the characters from their ima...
Recent works such as BERT, GPT, ELMo, and ULMFiT have successfully demonstrated the effectiveness of pretra...
Optical Character Recognition (OCR) is the process of extracting the characters from a digital image...
Often the best model to solve a real-world problem is relatively complex. The following p...
Sequence learning describes the process of understanding the spatio-temporal relations in a sequenc...
Most of the popular optical character recognition (OCR) architectures use a set of handcraf...
Optical Character Recognition (OCR) plays an important role in the retrieval of information from pix...
Optical character recognition (OCR) remains a difficult problem for noisy documents or documents not...
This paper presents a generic optical character recognition (OCR) system based on deep Siamese convo...
This paper proposes a pre-training method for neural network-based character recognizers to reduce t...
Training a system to recognize handwritten words is a task that requires a large amount of data with...