The K-Means Clustering Algorithm With Semantic Similarity To Estimate The Cost of Hospitalization


The cost of a patient’s hospitalization can be estimated by clustering patients. One of the most widely used clustering algorithms is K-means. Because K-means is distance-based, it is weak at measuring the proximity of meaning, or semantics, between data items. To overcome this problem, semantic similarity can be used to measure the similarity between objects during clustering, so that semantic proximity can be calculated. This study aims to cluster patient data according to the similarity of the patients’ diseases. The ICD code is used as the guide for determining a patient’s disease. The K-means method is combined with semantic similarity to measure the proximity of the patients’ ICD codes. The similarity measures used in this study are those of Girardi, Leacock & Chodorow, and Rada, and Jaccard similarity. Cluster quality is measured with the silhouette coefficient. The experimental results show that clustering with semantic similarity produces better-quality clusters than clustering without it. The best accuracy is 91.78% for the three semantic similarity methods, whereas without semantic similarity the best accuracy is 84.93%.
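The idea of clustering ICD codes by semantic proximity can be illustrated with a minimal Python sketch. It is not the study's implementation, and everything in it is an assumption made for illustration: the toy hierarchy is derived from the code string itself (first letter, three-character category, full code) rather than from the official ICD structure, the sample codes and starting centers are hypothetical, and a medoid update replaces the K-means mean because ICD codes cannot be averaged. Only the Rada path length, the Leacock & Chodorow formula, Jaccard similarity, and the silhouette coefficient are shown; Girardi's measure is omitted because the abstract does not define it.

```python
import math

MAX_DEPTH = 4  # assumed depth in nodes: ROOT > letter > category > subcategory


def ancestors(code):
    """Path from the root to a code in the toy hierarchy (an assumption)."""
    code = code.replace(".", "")
    path = ["ROOT", code[0]]
    if len(code) >= 3:
        path.append(code[:3])
    if len(code) > 3:
        path.append(code)
    return path


def path_length(a, b):
    """Rada-style distance: edges on the shortest path between two codes."""
    pa, pb = ancestors(a), ancestors(b)
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return (len(pa) - common) + (len(pb) - common)


def leacock_chodorow(a, b):
    """Leacock & Chodorow similarity: -log(nodes_on_path / (2 * depth))."""
    nodes = path_length(a, b) + 1  # count nodes so identical codes give 1
    return -math.log(nodes / (2 * MAX_DEPTH))


def jaccard(codes_a, codes_b):
    """Jaccard similarity between two sets of codes."""
    union = codes_a | codes_b
    return len(codes_a & codes_b) / len(union) if union else 1.0


def assign(codes, medoids):
    """Assign each code to the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda k: path_length(c, medoids[k]))
            for c in codes]


def update(codes, labels, medoids):
    """Pick, per cluster, the member with the smallest total distance."""
    new = []
    for k, old in enumerate(medoids):
        members = [c for c, l in zip(codes, labels) if l == k]
        if not members:  # keep the old medoid if a cluster is empty
            new.append(old)
            continue
        new.append(min(members,
                       key=lambda m: sum(path_length(m, o) for o in members)))
    return new


def cluster(codes, medoids, iterations=10):
    """Alternate assignment and medoid update until the medoids stop changing."""
    labels = assign(codes, medoids)
    for _ in range(iterations):
        new = update(codes, labels, medoids)
        if new == medoids:
            break
        medoids = new
        labels = assign(codes, medoids)
    return labels, medoids


def silhouette(codes, labels):
    """Mean silhouette coefficient; assumes at least two non-empty clusters."""
    scores = []
    for i, c in enumerate(codes):
        by_cluster = {}
        for j, o in enumerate(codes):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(path_length(c, o))
        same = by_cluster.pop(labels[i], [])
        if not same or not by_cluster:
            scores.append(0.0)  # singleton cluster or no other cluster
            continue
        a = sum(same) / len(same)
        b = min(sum(d) / len(d) for d in by_cluster.values())
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical sample: three circulatory (I) and three endocrine (E) codes.
codes = ["I10", "I11.0", "I11.9", "E11.9", "E11.0", "E10.9"]
labels, medoids = cluster(codes, ["I10", "E11.9"])
score = silhouette(codes, labels)
```

In this sketch, codes that share a category sit two edges apart, while codes in different chapters are at least four edges apart, so the grouping follows disease similarity rather than any numeric encoding of the code. A real application would need the official ICD hierarchy and per-patient features beyond a single code.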
