Text Orientation Correction in Digital Documents Using the Hough Transform
DOI: https://doi.org/10.32877/bt.v7i2.1821
Keywords:
Canny Edge Detection, Digital Archiving, Document Orientation Correction, Image Rotation, Hough Transform
Abstract
Archived documents are essential for offices, homes, industry, and historical museums. Supporting tools such as scanners help convert physical documents into digital form. A digital copy can be used at any time if the original physical document is damaged, lost, or destroyed in a disaster. However, the text in scanned digital documents is sometimes not upright and has an incorrect orientation, for example a skew of more than 1 degree. To keep such documents easy to read, this study proposes the Hough transform as a solution to this problem. The Hough transform is commonly used to detect lines, circles, and other shapes. The processing steps begin with grayscale conversion, followed by Gaussian blur to smooth the image and suppress noise, edge detection with the Canny edge detector, line detection with the Hough transform, and finally image rotation to correct the text skew. The dataset consists of scanned images of typed text containing noise and varying skew angles. The results on 10 sample images show that about 70% of the images can be rotated to nearly upright, with an average residual angle of less than 1 degree. The remaining 30% have an average residual angle between 1 and 2 degrees; this is not significant and is still tolerable to the human eye. Processing each image takes 0.022 seconds on average. The results can be used to correct skew in digital documents and to improve the accuracy of OCR systems.
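The abstract outlines a five-step pipeline: grayscale conversion, Gaussian blur, Canny edge detection, Hough line detection, and rotation. The sketch below shows one way such a pipeline can be assembled with Python and OpenCV; the threshold values, the near-horizontal angle filter, and the median-angle heuristic are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal deskew sketch: grayscale -> Gaussian blur -> Canny -> Hough lines -> rotation.
# Parameter values and the median-angle heuristic are assumptions for illustration only.
import cv2
import numpy as np

def correct_skew(image_path: str, output_path: str) -> float:
    """Estimate the dominant text skew of a scan and rotate it toward upright."""
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)   # smooth the image, suppress noise
    edges = cv2.Canny(blurred, 50, 150)           # Canny edge map

    # Probabilistic Hough transform: collect line segments along the text rows.
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=100,
                            minLineLength=100, maxLineGap=20)
    if lines is None:
        return 0.0

    angles = []
    for x1, y1, x2, y2 in lines[:, 0]:
        angle = np.degrees(np.arctan2(y2 - y1, x2 - x1))
        if abs(angle) < 45:        # keep near-horizontal segments (likely text lines)
            angles.append(angle)
    skew = float(np.median(angles)) if angles else 0.0

    # Rotate about the image centre to compensate for the estimated skew.
    h, w = img.shape[:2]
    m = cv2.getRotationMatrix2D((w / 2, h / 2), skew, 1.0)
    rotated = cv2.warpAffine(img, m, (w, h), flags=cv2.INTER_LINEAR,
                             borderMode=cv2.BORDER_REPLICATE)
    cv2.imwrite(output_path, rotated)
    return skew
```

The median of the detected segment angles serves here as a robust estimate of the dominant skew; a call such as `correct_skew("scan.png", "scan_deskewed.png")` (hypothetical file names) returns the correction applied in degrees.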
License
Copyright (c) 2024 bit-Tech
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.