A method is presented for postprocessing the output of a word recognition algorithm for visual text recognition. A word recognition algorithm typically outputs a list of alternatives for the identity of each word in a running text. Each alternative is ranked by the probability that it is correct. This paper proposes a relaxation algorithm that re-ranks the alternatives for each word using word collocation statistics, which measure the probability that two words occur near each other in a running text. The improvement in performance is measured by the increase in the percentage of correct alternatives in the first position. Experimental results are presented that show the proposed algorithm can improve the performance of ...
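The re-ranking idea can be illustrated with a small Python sketch. It is not the update rule from the paper, which the abstract does not state; it assumes a generic relaxation step in which each alternative's score is multiplied by the collocation support it receives from the alternatives at the immediately adjacent word positions, then renormalized. The function name `relax_rerank`, the one-word neighborhood, the `default` smoothing value, and the toy collocation table are all hypothetical.

```python
def relax_rerank(candidates, colloc, iterations=5, default=1e-3):
    """Re-rank word alternatives with collocation support (illustrative only).

    candidates: list with one dict per word position, {alternative: score}.
    colloc:     dict mapping frozenset({w1, w2}) to a collocation strength.
    default:    assumed strength for word pairs missing from `colloc`.
    Returns one list of alternatives per position, best first.
    """
    scores = [dict(c) for c in candidates]
    for _ in range(iterations):
        updated = []
        for i, current in enumerate(scores):
            new = {}
            for word, p in current.items():
                # Support: collocation strength with each neighboring
                # alternative, weighted by that alternative's current score.
                support = 0.0
                for j in (i - 1, i + 1):
                    if 0 <= j < len(scores):
                        support += sum(
                            q * colloc.get(frozenset((word, other)), default)
                            for other, q in scores[j].items()
                        )
                new[word] = p * support
            total = sum(new.values())
            # Keep the old scores if a position has no usable support.
            if total > 0:
                updated.append({w: s / total for w, s in new.items()})
            else:
                updated.append(dict(current))
        scores = updated
    return [sorted(s, key=s.get, reverse=True) for s in scores]


# Toy input: the recognizer initially prefers "fall" over "fill".
candidates = [
    {"please": 1.0},
    {"fall": 0.6, "fill": 0.4},
    {"in": 0.7, "is": 0.3},
]
colloc = {
    frozenset(("please", "fill")): 0.5,
    frozenset(("fill", "in")): 0.6,
    frozenset(("fall", "in")): 0.1,
}
ranked = relax_rerank(candidates, colloc)
print(ranked)
```

In this toy input the collocations with "please" and "in" move "fill" ahead of "fall" in the second position, which is the kind of first-position improvement the abstract measures.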
Orthography–semantics consistency (OSC) is a measure that quantifies the degree of semantic relatedn...
The main disadvantage of collocation-based word sense disambiguation is that the recall is low, with...
The word error rate of any optical character recognition system (OCR) is usually substantially below...
OCR is an error-prone process when input images are degraded. Most c...
In all natural languages, some words collocate with other words to create multi-...
A technique is presented that uses visual relationships between word images in a document to improve...
A probabilistic lattice chart parser is proposed for improving the performance of a text recognition...
Optical character recognition (OCR) is a recognition system used to recognize the substance of a che...
In this paper, stochastic error-correcting parsing is proposed as a powerful and flexible method to ...
This paper describes the derivation of probability of correctness from scores assigned by most recog...
The output of handwritten word recognizers (WR) tends to be very noisy due to various factors. In or...
Many document-based applications, including popular Web browsers, email viewers, and word processors...
The techniques of image processing have been used in optical character recognition (OCR)...
This research examined visual and phonological coding in visual word recognition. Participants named...