A method is presented for the efficient segmentation of text lines from scanned images of technical documents. The method is implemented in the ARXYC (Adaptive Recursive XY Cut) algorithm, which constructs an XY-tree representing the geometric layout structure of a page image; the text lines are the leaf nodes of this tree. Geometric layout analysis is one stage of the Document Image Analysis processing sequence. It is typically preceded by scanning the document into a pixel map, preprocessing the pixel map to reduce noise and remove skew, and thresholding to a binary image. It is typically followed by mapping the geometric layout to a function representation and recovering text and graphics from the pixel image. Technical...
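The idea of an XY-tree whose leaves are text lines can be illustrated with a minimal sketch of the classic recursive XY cut on a binary image. This is not the ARXYC algorithm itself: the adaptive part is not described in the abstract, so the sketch uses fixed gap thresholds. The names Node, profile, runs, xy_cut, leaves and the min_gap parameter are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


@dataclass
class Node:
    """One XY-tree node: a rectangle with half-open bounds [top, bottom) x [left, right)."""
    top: int
    left: int
    bottom: int
    right: int
    children: List["Node"] = field(default_factory=list)


def profile(img: Image, node: Node, horizontal_cut: bool) -> List[int]:
    """Ink count per row (for horizontal cuts) or per column (for vertical cuts)."""
    if horizontal_cut:
        return [sum(img[r][node.left:node.right]) for r in range(node.top, node.bottom)]
    return [sum(img[r][c] for r in range(node.top, node.bottom))
            for c in range(node.left, node.right)]


def runs(prof: List[int], min_gap: int) -> List[Tuple[int, int]]:
    """Half-open ink spans separated by at least min_gap empty profile bins."""
    spans, start, last_ink = [], None, None
    for i, value in enumerate(prof):
        if value > 0:
            if start is None:
                start = i
            elif i - last_ink - 1 >= min_gap:
                spans.append((start, last_ink + 1))
                start = i
            last_ink = i
    if start is not None:
        spans.append((start, last_ink + 1))
    return spans


def xy_cut(img: Image, node: Node, horizontal_cut: bool = True,
           min_gap: Tuple[int, int] = (1, 3), other_failed: bool = False) -> Node:
    """Split node along whitespace gaps, alternating cut direction.

    min_gap holds the assumed (row gap, column gap) thresholds in pixels.
    A node becomes a leaf when neither direction yields a split.
    """
    spans = runs(profile(img, node, horizontal_cut),
                 min_gap[0] if horizontal_cut else min_gap[1])
    if len(spans) <= 1:
        if spans:
            start, end = spans[0]
            # Tighten the rectangle to its ink extent in this direction.
            if horizontal_cut:
                node.top, node.bottom = node.top + start, node.top + end
            else:
                node.left, node.right = node.left + start, node.left + end
            if not other_failed:
                xy_cut(img, node, not horizontal_cut, min_gap, True)
        return node
    for start, end in spans:
        if horizontal_cut:
            child = Node(node.top + start, node.left, node.top + end, node.right)
        else:
            child = Node(node.top, node.left + start, node.bottom, node.left + end)
        node.children.append(xy_cut(img, child, not horizontal_cut, min_gap, False))
    return node


def leaves(node: Node) -> List[Tuple[int, int, int, int]]:
    """Bounding boxes (top, left, bottom, right) of all leaf nodes, in tree order."""
    if not node.children:
        return [(node.top, node.left, node.bottom, node.right)]
    return [box for child in node.children for box in leaves(child)]


# Toy page: one text line on row 1, another on rows 4-5.
img = [[0] * 10 for _ in range(7)]
for c in range(1, 9):
    img[1][c] = 1
for r in (4, 5):
    for c in range(2, 7):
        img[r][c] = 1

tree = xy_cut(img, Node(0, 0, len(img), len(img[0])))
print(leaves(tree))  # [(1, 1, 2, 9), (4, 2, 6, 7)]
```

The column threshold is set larger than the row threshold so that gaps between words do not split a line; on real scans both values would need to be tuned to the resolution, which is presumably the kind of decision an adaptive variant automates.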
A top-down page segmentation technique known as the recursive X-Y cut decomposes a document image re...
Text/Image region separation is the process of identifying the location of various text and imag...
We describe a top-down approach to the segmentation and representation of documents containing tabul...
A single-parameter text-line extraction algorithm is described along with an efficient technique for...
Column segmentation logically precedes OCR in the document analysis process. The trainable algorithm...
Segmentation of text from poorly documented images is a very difficult task due to high mu...
This paper describes a text-line identification and segmentation technique that is probability based...
This paper describes a fast and efficient method for page segmentation of documents containing nonrecta...
This study presents a new method, namely the multi-plane segmentation approach, for segm...
We present a fully automated process to scan the Australian Telecom Yellow Pages and produce a text ...
The object of research is the process of recognizing the areas of scanned document images. The pape...
A fast and robust document image segmentation and classification algorithm based on bottom-up ...
Alternating horizontal and vertical projection profiles are extracted from nested sub-blocks of scan...