Optical character recognition using image processing and artificial neural network techniques
Publisher
University of Peradeniya
Abstract
Optical Character Recognition (OCR) is a potential tool for developing automated applications in many areas, such as recognition of postal addresses and ZIP codes, bank cheque processing, and text recognition in printed documents. In this study, an OCR system was developed to recognize printed text documents using image processing and Artificial Neural Network techniques. The recognition problem was broken down into five components: segmentation, skeletonization, normalization, feature extraction and classification. The classification component was implemented using a three-layer feed-forward neural network; all other components were implemented using image processing techniques. Segmentation was used to identify individual character images in the text document, and the skeleton of each segmented character image was then obtained using Hilditch’s skeletonization algorithm. These skeletons were normalized to a standard size by scaling. Feature extraction was performed on the normalized skeletons to obtain 16-element feature vectors. These feature vectors were then classified using the neural network, which had been trained beforehand on a data set of frequently used character fonts. During testing of the OCR system, high correct-recognition percentages were observed on text documents containing characters of large font sizes. Comparatively lower correct-recognition percentages were observed on text documents prepared with smaller font sizes. One reason for this could be the higher number of connected characters in text images with smaller fonts. Finding a suitable architecture and the optimum training parameters for the back-propagation network were the problems encountered in constructing the neural network classifier.
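
The segmentation, normalization and feature extraction stages named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how segmentation was done or how the 16 feature elements were computed, so everything here is an assumption made for illustration: characters are split at blank pixel columns, scaling is nearest-neighbour, and the 16 features are ink densities over a 4 x 4 grid of zones. The function names (`segment_columns`, `crop`, `normalize`, `zone_features`) are hypothetical, and skeletonization and the neural network classifier are omitted.

```python
# Illustrative sketch only; not the method used in the study.
# An image is a list of rows, each a list of 0/1 ints (1 = ink).

def segment_columns(image):
    """Return (start, end) column ranges of ink runs separated by blank columns."""
    width = len(image[0])
    has_ink = [any(row[x] for row in image) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a new character begins
        elif not ink and start is not None:
            spans.append((start, x))       # a blank column ends the character
            start = None
    if start is not None:
        spans.append((start, width))
    return spans


def crop(image, start, end):
    """Cut columns start..end and trim blank rows above and below the ink."""
    rows = [row[start:end] for row in image]
    inked = [y for y, row in enumerate(rows) if any(row)]
    return rows[inked[0]:inked[-1] + 1]


def normalize(glyph, size=16):
    """Scale a glyph to size x size using nearest-neighbour sampling (assumed)."""
    h, w = len(glyph), len(glyph[0])
    return [[glyph[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


def zone_features(glyph, grid=4):
    """Return grid*grid ink densities, one per zone (16 values for grid=4)."""
    step = len(glyph) // grid
    feats = []
    for gy in range(grid):
        for gx in range(grid):
            ink = sum(glyph[y][x]
                      for y in range(gy * step, (gy + 1) * step)
                      for x in range(gx * step, (gx + 1) * step))
            feats.append(ink / (step * step))
    return feats


# Two characters separated by one blank column.
page = [
    [1, 1, 0, 1],
    [1, 1, 0, 1],
]
vectors = [zone_features(normalize(crop(page, s, e)))
           for s, e in segment_columns(page)]
```

Under this assumed segmentation rule, two touching characters share no blank column and are returned as a single span, which is one way to picture why connected characters in small-font text could lower recognition rates.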