Latin Letters Recognition Using Optical Character Recognition to Convert Printed Media Into Digital Format

       Rio Anugrah, Ketut Bayu Yogha Bintoro

Abstract


Printed media is still popular in today's society. Unfortunately, such media has several drawbacks. For example, it takes up a large amount of storage space, which leads to high maintenance costs. To keep printed information efficiently and for a long time, people usually convert it into digital format. In this paper, we build an Optical Character Recognition (OCR) system that automatically converts an image containing a sentence in Latin characters into digital text. The system consists of several interrelated stages, including preprocessing, segmentation, feature extraction, classifier, model and recognition. In preprocessing, a median filter is used to remove noise from the image and Otsu's method is used to binarize it. This is followed by character segmentation using connected component labeling. An artificial neural network (ANN) is used for feature extraction in order to recognize each character. The results show that the system is able to recognize the characters in the image, with a success rate that is influenced by the training of the system.
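
The preprocessing and segmentation stages named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the 3 x 3 window, the clamped border handling, the 8-connectivity, and the convention that dark pixels are ink are all assumptions made for illustration. The input is assumed to be a grayscale image stored as a list of rows of integers in 0..255. The feature extraction and ANN stages are not shown.

```python
from collections import deque
from statistics import median


def median_filter(img, k=3):
    """Apply a k x k median filter; borders are handled by clamping coordinates."""
    h, w, r = len(img), len(img[0]), k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                      for dy in range(-r, r + 1) for dx in range(-r, r + 1)]
            out[y][x] = int(median(window))
    return out


def otsu_threshold(img):
    """Return the gray level t that maximizes the between-class variance
    of the two classes 'pixels <= t' and 'pixels > t'."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var, w0, sum0 = 0, -1.0, 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance is undefined
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2  # proportional to between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(img, t):
    """Mark dark pixels (assumed ink) as 1 and light pixels (assumed paper) as 0."""
    return [[1 if v <= t else 0 for v in row] for row in img]


def label_components(binary):
    """Find 8-connected foreground regions with a breadth-first search.
    Returns bounding boxes (x0, y0, x1, y1), sorted left to right."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not seen[y][x]:
                seen[y][x] = True
                queue = deque([(y, x)])
                x0, y0, x1, y1 = x, y, x, y
                while queue:
                    cy, cx = queue.popleft()
                    x0, y0 = min(x0, cx), min(y0, cy)
                    x1, y1 = max(x1, cx), max(y1, cy)
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and binary[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                boxes.append((x0, y0, x1, y1))
    return sorted(boxes)
```

Under these assumptions, a page image would pass through `median_filter`, then `otsu_threshold` and `binarize`, and finally `label_components`, whose bounding boxes delimit candidate characters for the later recognition stage. Sorting by the left edge only suits a single line of text; multi-line input would need line detection first.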


  http://dx.doi.org/10.14203/jet.v17.56-62

Keywords


Optical Character Recognition (OCR); segmentation; feature extraction; artificial neural network (ANN)







Copyright (c) 2017 Jurnal Elektronika dan Telekomunikasi

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.