NOVEL TECHNIQUES FOR ENHANCING AUTOMATIC ARABIC HANDWRITING RECOGNITION
Hany Ahmed Sayed Mansour
Abstract
In this thesis, we present novel segmentation-free Arabic handwriting recognition systems based on hidden Markov models (HMMs). Three main contributions are introduced: an online Arabic handwriting recognition system, an offline Arabic handwriting recognition system, and a combination of the offline and online systems.
For the offline handwriting system, we introduce a new technique for dividing the image into non-uniform horizontal segments for feature extraction, and a new technique for handling skewed characters by fusing multiple HMMs. The proposed system first pre-processes the input image by normalizing the stroke thickness of the input word to three pixels and fixing the spacing between the different parts of the word. The input image is then divided into a constant number of non-uniform horizontal segments, with the segment boundaries depending on the distribution of the foreground pixels. Two sets of robust features are extracted using sliding windows. The first set represents the gradient of the foreground pixels. For the second set, the input image is decomposed into several edge images representing the vertical, horizontal, left-diagonal and right-diagonal edges, and the features represent the densities of the foreground pixels in each edge image. The proposed system builds character HMMs and learns word HMMs using embedded training. In addition to the vertical sliding window, two slanted sliding windows are used to extract the features, giving three different HMMs: one for the vertical window and one for each slanted window. A fusion scheme combines the three HMMs. The proposed system shows promising results and is competitive with other Arabic handwriting recognition systems reported in the literature.
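The non-uniform horizontal segmentation can be illustrated with a minimal sketch. It assumes a binary image with foreground pixels equal to 1 and places the band boundaries so that each band holds a roughly equal share of the foreground pixels. The abstract does not state the exact rule used in the thesis, so the equal-share criterion, the function names `nonuniform_row_bands` and `band_densities`, and the plain density feature are all illustrative assumptions rather than the author's method.

```python
import numpy as np

def nonuniform_row_bands(img, n_bands):
    """Return n_bands + 1 row boundaries for a binary image (foreground = 1).

    Assumption: boundaries are placed so that each band contains a roughly
    equal share of the foreground pixels. Bands may be empty when a single
    row carries most of the foreground mass.
    """
    row_mass = img.sum(axis=1).astype(float)   # foreground pixels per row
    total = row_mass.sum()
    height = img.shape[0]
    if total == 0:
        # No foreground at all: fall back to uniform bands.
        return np.linspace(0, height, n_bands + 1).round().astype(int)
    cum = np.cumsum(row_mass)
    targets = total * np.arange(1, n_bands) / n_bands
    # First row at which the cumulative mass reaches each target;
    # the boundary is placed just after that row.
    inner = np.searchsorted(cum, targets, side="left") + 1
    return np.concatenate(([0], inner, [height]))

def band_densities(img, bounds, x0, width):
    """Foreground density of each band inside one vertical sliding window."""
    window = img[:, x0:x0 + width]
    feats = []
    for top, bottom in zip(bounds[:-1], bounds[1:]):
        band = window[top:bottom]
        feats.append(band.mean() if band.size else 0.0)
    return np.array(feats)

# Usage: a toy 8x4 image whose ink occupies rows 2 and 3 only.
img = np.zeros((8, 4), dtype=np.uint8)
img[2:4, :] = 1
bounds = nonuniform_row_bands(img, n_bands=2)
features = band_densities(img, bounds, x0=0, width=4)
```

With this criterion the bands are narrow where the ink is dense and wide where it is sparse, which is one plausible reading of "depending on the distribution of the foreground pixels".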
For the online handwriting recognition system, delayed strokes are removed from the online Arabic word to avoid the difficulty and confusion they cause in the recognition process. Dictionaries for all the words in the database were constructed both with and without delayed strokes. Word matching against both dictionaries, together with effective online features and a careful choice of HMM parameters, significantly improved the recognition rate of the proposed system.
For the combined system, integrating the online and offline approaches gives better performance: the combination outperforms the best individual recognizer.
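One common way to combine several recognizers is a weighted sum of normalized log-scores over the candidate words. The sketch below illustrates that idea only. The abstract does not specify the fusion scheme used in the thesis, so the normalization, the weights, the function name `fuse_scores` and the toy scores are all assumptions.

```python
import math

def fuse_scores(score_lists, weights=None):
    """Combine per-recognizer word scores and return (best_word, fused_scores).

    score_lists: list of dicts mapping candidate word -> log-likelihood,
                 one dict per recognizer (for example offline and online HMMs).
    weights:     one weight per recognizer; equal weights by default.
    Assumption: only words scored by every recognizer are considered, and
    at least one such word exists.
    """
    if weights is None:
        weights = [1.0 / len(score_lists)] * len(score_lists)
    common = set.intersection(*(set(s) for s in score_lists))
    fused = {}
    for scores, weight in zip(score_lists, weights):
        # Log-sum-exp normalization puts each recognizer's scores on a
        # comparable scale (log-posteriors over the shared candidates).
        peak = max(scores[w] for w in common)
        log_z = peak + math.log(sum(math.exp(scores[w] - peak) for w in common))
        for w in common:
            fused[w] = fused.get(w, 0.0) + weight * (scores[w] - log_z)
    best = max(fused, key=fused.get)
    return best, fused

# Usage with made-up log-likelihoods for two candidate words.
offline_scores = {"a": -10.0, "b": -12.0}
online_scores = {"a": -30.0, "b": -25.0}
best, fused = fuse_scores([offline_scores, online_scores])
```

In this toy case the offline recognizer slightly prefers "a" while the online recognizer strongly prefers "b", so the fused decision follows the more confident recognizer.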
Other data
| Field | Value |
| --- | --- |
| Title | NOVEL TECHNIQUES FOR ENHANCING AUTOMATIC ARABIC HANDWRITING RECOGNITION |
| Other Titles | تقنيات مبتكرة لتحسين التعرف الآلى على الكتابة العربية لخط اليد |
| Authors | Hany Ahmed Sayed Mansour |
| Issue Date | 2016 |
Items in Ain Shams Scholar are protected by copyright, with all rights reserved, unless otherwise indicated.