ARTICLE
TITLE

LOCAL BINARIZATION FOR DOCUMENT IMAGES CAPTURED BY CAMERAS WITH DECISION TREE

SUMMARY

Character recognition in a document image captured by a digital camera requires a well-binarized image as input, so that the text is cleanly separated from the background. Global binarization methods do not provide such separation because images captured by cameras are unevenly lit. Local binarization methods overcome this problem but require a way to partition the large image into local windows properly. In this paper, we propose a local binarization method that partitions the image dynamically using an integral image and uses a decision tree for the binarization decision. The integral image is used to estimate the number of text lines in the document image. The number of lines is then used to divide the document into local windows. The decision tree decides the threshold for every local window. The results show that the proposed method separates the text from the background better than global thresholding, with an OCR recognition rate on the binarized image of up to 99.4%.
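The pipeline described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the summary does not say how line counts are derived from the integral image, how windows are laid out, or which features and rules the decision tree uses. The sketch therefore assumes that text lines appear as runs of rows darker than the page average, that the page is split into one horizontal band per estimated line and a fixed number of columns, and that a hand-written two-rule function (`window_threshold`) stands in for the decision tree. All function names and the `dark_ratio`, `min_contrast`, and `n_cols` parameters are hypothetical.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, 0 = black, 255 = white


def integral_image(img: Image) -> List[List[int]]:
    """Summed-area table padded with a zero row and a zero column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        running = 0
        for x in range(w):
            running += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + running
    return ii


def box_sum(ii: List[List[int]], top: int, left: int, bottom: int, right: int) -> int:
    """Pixel sum over rows [top, bottom) and columns [left, right)."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def estimate_text_lines(img: Image, dark_ratio: float = 0.9) -> int:
    """Count runs of consecutive rows darker than dark_ratio * page mean (assumption)."""
    h, w = len(img), len(img[0])
    ii = integral_image(img)
    page_mean = box_sum(ii, 0, 0, h, w) / (h * w)
    lines, in_text = 0, False
    for y in range(h):
        is_text = box_sum(ii, y, 0, y + 1, w) / w < dark_ratio * page_mean
        if is_text and not in_text:
            lines += 1
        in_text = is_text
    return lines


def window_threshold(mean: float, contrast: int, min_contrast: int = 50) -> float:
    """Hand-written two-leaf stand-in for the paper's decision tree (assumption)."""
    if contrast < min_contrast:
        return 0.0  # flat window: nothing is below 0, so it is all background
    return mean     # window containing text: split at the local mean


def binarize_local(img: Image, n_bands: int, n_cols: int = 2) -> Image:
    """Threshold each window of an n_bands x n_cols grid independently."""
    h, w = len(img), len(img[0])
    ii = integral_image(img)
    n_bands = max(1, n_bands)
    out = [[255] * w for _ in range(h)]
    for b in range(n_bands):
        top, bottom = h * b // n_bands, h * (b + 1) // n_bands
        for c in range(n_cols):
            left, right = w * c // n_cols, w * (c + 1) // n_cols
            pixels = [img[y][x] for y in range(top, bottom) for x in range(left, right)]
            if not pixels:
                continue  # more bands or columns than pixels: empty window
            mean = box_sum(ii, top, left, bottom, right) / len(pixels)
            t = window_threshold(mean, max(pixels) - min(pixels))
            for y in range(top, bottom):
                for x in range(left, right):
                    out[y][x] = 0 if img[y][x] < t else 255
    return out


# Synthetic 12x8 page: bright left half, shadowed right half, two text lines.
page = [[200] * 4 + [120] * 4 for _ in range(12)]
for y in (2, 3, 8, 9):
    page[y][1] = page[y][2] = 40
    page[y][5] = page[y][6] = 20

n_lines = estimate_text_lines(page)
binary = binarize_local(page, n_bands=n_lines)
```

On this synthetic page a single global threshold at the page mean would mark the whole shadowed right half as text, because its background (120) is darker than the page average. Thresholding each window at its own mean avoids that, which is the motivation the summary gives for local binarization.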


Journal: Telematika