Novel Algorithm for Baseline Detection of Offline Arabic Handwritten Text Recognition
DOI:
https://doi.org/10.37934/araset.37.1.5668
Keywords:
Arabic Text Recognition, Arabic Text preprocessing, Baseline Detection, Handwritten and Skeleton Algorithm
Abstract
Baseline detection is one of the challenging tasks in the pre-processing stage of offline Arabic handwritten text recognition. The challenges include text skewness, touching letters or words, short words, sub-words, ligatures, isolated characters, small descenders, and diacritics. This paper presents a novel automated algorithm for baseline detection that is able to overcome these challenges. The proposed algorithm relies mainly on a skeletonizing algorithm to estimate the baseline separately for each sub-word that appears in the image. It was tested on the benchmark handwritten Arabic database, i.e., the IFN/ENIT database. The experimental results showed that it is able to reduce the average error between the actual baseline and the resulting baseline to 3.39 pixels for all sub-words, and that it achieved an average of 3.6% in the ratio between the error pixels and the image height. In addition, the proposed algorithm is shown to be superior to four other benchmarked baseline detection algorithms. It is anticipated that this algorithm can be applied to many offline Arabic handwriting applications in the future.
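The two reported figures, an average pixel error and that error as a ratio of the image height, can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the sketch assumes the error is the mean absolute vertical distance between the actual and estimated baseline rows over all sub-words. The function name `baseline_error` and the sample values are hypothetical and are not taken from the paper.

```python
def baseline_error(actual_rows, estimated_rows, image_height):
    """Return (mean absolute error in pixels, that error as a % of image height).

    Assumption: one baseline row index per sub-word, and the error is the
    mean absolute difference between actual and estimated rows.
    """
    if not actual_rows or len(actual_rows) != len(estimated_rows):
        raise ValueError("need one estimated baseline per actual baseline")
    if image_height <= 0:
        raise ValueError("image_height must be positive")
    # Per-sub-word vertical distance between the two baselines, in pixels.
    errors = [abs(a - e) for a, e in zip(actual_rows, estimated_rows)]
    mean_error = sum(errors) / len(errors)
    return mean_error, 100.0 * mean_error / image_height


# Hypothetical example: three sub-words in a 100-pixel-high image.
mean_px, pct = baseline_error([60, 62, 58], [63, 60, 62], image_height=100)
print(mean_px, pct)
```

Under this assumed definition, the percentage is simply the pixel error normalized by image height, which makes results comparable across images of different sizes.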