A HYBRID MODEL FOR ARABIC SCRIPT RECOGNITION BASED ON CNN-CBAM AND BLSTM


(Received: 4-Mar.-2024, Revised: 5-May-2024, Accepted: 21-May-2024)
Handwriting recognition, particularly for Arabic, is a challenging field of research owing to factors such as ligatures, cursive writing style, slant variations, diacritics and overlapping. This paper addresses the recognition of offline Arabic handwritten text lines. The main contributions are a pre-processing stage and a deep-learning approach combined with data-augmentation techniques. The pre-processing stage corrects the skew of text lines and removes unnecessary white space from the images. The deep-learning architecture consists of a Convolutional Neural Network (CNN) with a Convolutional Block Attention Module (CBAM) for feature extraction, Bidirectional Long Short-Term Memory (BLSTM) for sequence modeling and Connectionist Temporal Classification (CTC) as a decoder. Data augmentation is applied to the database images to improve the system’s ability to recognize a wide range of Arabic characters and, through synthetic variations, to help it generalize across writing patterns. The proposed approach recognizes Arabic handwritten text without character segmentation, thereby avoiding the difficulties associated with that step. Results on the KHATT database demonstrate the effectiveness of the approach, with a Word Error Rate (WER) of 14.55% and a Character Error Rate (CER) of 3.25%.
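To make the two reported metrics concrete, the following is a minimal sketch of how a Character Error Rate and a Word Error Rate are commonly computed. It assumes the usual definition (edit distance between reference and recognized text, divided by the reference length and aggregated over all lines) and whitespace tokenization for words; the abstract does not state the exact formula used. The function names `edit_distance` and `error_rate` are hypothetical and chosen only for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Counts the minimum number of insertions, deletions and
    substitutions needed to turn `hyp` into `ref`.
    """
    prev = list(range(len(hyp) + 1))  # distances for the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # cost of deleting the first i reference items
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, unit="char"):
    """Aggregate error rate over paired reference/hypothesis lines.

    unit="char" gives a CER; unit="word" gives a WER
    (assumption: words are separated by whitespace).
    """
    total_edits = 0
    total_len = 0
    for ref, hyp in zip(refs, hyps):
        r = list(ref) if unit == "char" else ref.split()
        h = list(hyp) if unit == "char" else hyp.split()
        total_edits += edit_distance(r, h)
        total_len += len(r)
    if total_len == 0:
        raise ValueError("reference text is empty")
    return total_edits / total_len
```

Under these assumptions, `error_rate(refs, hyps, unit="char")` returns a fraction that is multiplied by 100 to give a percentage comparable in form to the CER above, and `unit="word"` gives the corresponding WER. Because both functions operate on Python strings and lists, they apply unchanged to Arabic text.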
