Novel Algorithm for Baseline Detection of Offline Arabic Handwritten Text Recognition

Authors

  • Ahmad Mustafa Ali Al Masri, Faculty of Computer Science and Mathematics, Universiti Malaysia Terengganu, 21030 Kuala Nerus, Terengganu, Malaysia
  • Muhammad Suzuri Hitam, Faculty of Computer Science and Mathematics, Universiti Malaysia Terengganu, 21030 Kuala Nerus, Terengganu, Malaysia
  • Wan Nural Jawahir Hj Wan Yussof, Faculty of Computer Science and Mathematics, Universiti Malaysia Terengganu, 21030 Kuala Nerus, Terengganu, Malaysia
  • Atallah Al-Shatnawi, Department of Information System, Prince Hussein Bin Abdullah Faculty of Information Technology, Al Al-Bayt University, Mafraq, Jordan

DOI:

https://doi.org/10.37934/araset.37.1.5668

Keywords:

Arabic Text Recognition, Arabic Text Preprocessing, Baseline Detection, Handwritten and Skeleton Algorithm

Abstract

Baseline detection is one of the challenging tasks in the pre-processing stage of offline Arabic handwritten text recognition. The challenges include text skew, touching letters or words, short words, sub-words, ligatures, isolated characters, small descenders, and diacritics. This paper presents a novel automated baseline detection algorithm that overcomes these challenges. The proposed algorithm relies mainly on a skeletonization algorithm to estimate the baseline separately for each sub-word that appears in the image. It was tested on the benchmark handwritten Arabic database IFN/ENIT. The experimental results show that it reduces the average error between the actual baseline and the detected baseline to 3.39 pixels across all sub-words and achieves an average ratio of 3.6% between the error in pixels and the image height. In addition, the proposed algorithm outperforms four other benchmark baseline detection algorithms. It is anticipated that this algorithm can be applied to many offline Arabic handwriting applications in the future.
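To make the two reported measures concrete, the following is a minimal sketch of how a per-sub-word baseline error in pixels and its ratio to the image height could be computed. It is an illustration only, not the authors' evaluation code: the function name `baseline_errors`, the tuple layout, the use of absolute row differences, the averaging of per-sub-word ratios, and the sample values are all assumptions, since the abstract does not define these details.

```python
from statistics import mean


def baseline_errors(samples):
    """Return (mean pixel error, mean error as % of image height).

    `samples` is an iterable of (true_row, estimated_row, image_height)
    tuples, one per sub-word. This layout is hypothetical; the abstract
    does not specify how the authors store or aggregate their results.
    """
    pixel_errors = []
    height_ratios = []
    for true_row, est_row, height in samples:
        if height <= 0:
            raise ValueError("image height must be positive")
        # Assumed error definition: absolute vertical distance in pixels.
        err = abs(true_row - est_row)
        pixel_errors.append(err)
        height_ratios.append(err / height)
    # Assumed aggregation: plain mean over sub-words for both measures.
    return mean(pixel_errors), 100.0 * mean(height_ratios)


# Made-up values for illustration, not data from the paper.
samples = [(60, 63, 100), (45, 41, 80), (52, 52, 120)]
avg_px, avg_pct = baseline_errors(samples)
print(f"average error: {avg_px:.2f} px, {avg_pct:.2f}% of image height")
```

Averaging the per-sub-word ratios, rather than dividing the mean pixel error by a mean height, keeps short and tall images equally weighted; which of the two the paper uses is not stated in the abstract.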

Published

2024-01-09

Section

Articles