Deep Learning Approaches in Arabic OCR
Mohamed Atia Mohamed Radwan
Abstract
Recognizing Arabic text in printed documents or natural scenes is a harder problem
than the same task for Latin-script languages. Because Arabic typography is more
complicated and the differences between characters can be very subtle, a recognition
system for Arabic needs additional processing steps. In this thesis we develop
Arabic character recognition systems for different tasks. The first recognition pipeline
recognizes and transcribes Arabic car license plates from a video stream. The
pipeline consists of a localization algorithm for extracting the license plate and a deep
neural network for recognizing the characters on the plate. The neural network was trained
on synthetic data and tested on a manually annotated real-world example. The model
for Arabic character recognition achieved 90% accuracy, while the model for Arabic
numeral recognition achieved 94% accuracy. We then introduce an Arabic OCR system
for recognizing Arabic text in scanned documents. The pipeline starts with a document
containing lines of text and segments it first into individual lines, then into sub-words,
using histogram projection thresholding. Since the remaining sub-systems of the
pipeline are trained to recognize text at a default size of 18pt, we built a neural network
model that predicts the font size of an input sub-word, so that the sub-word can then be
normalized to the 18pt default size. A multichannel deep neural network model then
segments the input sub-word into characters. Finally, a character recognition model
produces the final output.
Other data
| Title | Deep Learning Approaches in Arabic OCR |
| Other Titles | مناهج التعلم العميق فى التعرف على الكتابة العربية |
| Authors | Mohamed Atia Mohamed Radwan |
| Issue Date | 2017 |
Items in Ain Shams Scholar are protected by copyright, with all rights reserved, unless otherwise indicated.