Prediction of Post-Operative Survival Expectancy in Thoracic Lung Cancer Surgery Using Extreme Learning Machine and SMOTE


Lung cancer is the most common cause of cancer death globally. Thoracic surgery is a common treatment for patients with lung cancer, but it carries many risks, and postoperative complications can lead to death. In this study, we predict the life expectancy of lung cancer patients one year after thoracic surgery. The data used are secondary data on lung cancer patients from 2007-2011. The dataset contains 470 records: 70 in the death class and 400 in the survival class one year after surgery. The classification algorithm is the Extreme Learning Machine (ELM), which tends to learn quickly and has good generalization performance. The Synthetic Minority Over-sampling Technique (SMOTE) is used to address the class imbalance in the data. The proposed solution combines the benefits of SMOTE for imbalanced data with those of ELM. The results show that ELM with SMOTE outperforms other algorithms such as Naïve Bayes, Decision Stump, J48, and Random Forest. The best results for ELM alone were obtained with 50 neurons: 89.1% accuracy, an F-measure of 0.86, and an ROC of 0.794. The combination of ELM and SMOTE achieved 85.22% accuracy, an F-measure of 0.864, and an ROC of 0.855 with 45 neurons and a 90:10 data split. The test results show that the proposed method can significantly improve the performance of the ELM algorithm in overcoming class imbalance. The contribution of this study is a machine learning model with good performance that can serve as a support system for medical informatics experts and doctors in early detection, by predicting the life expectancy of lung cancer patients.
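
To illustrate the oversampling step mentioned above, the following is a minimal sketch of the basic SMOTE idea: each synthetic minority sample is an interpolation between a minority record and one of its nearest minority neighbours. It is not the study's implementation. The function name `smote`, the neighbour count `k=5`, the two-feature toy data, and the choice to oversample until both classes hold 400 records are all assumptions made for illustration; only the class sizes (70 and 400) come from the abstract.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority neighbours (the basic SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy stand-in for the 70 death-class records; the feature values are made up.
data_rng = random.Random(42)
minority = [[data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)] for _ in range(70)]

# Assumption: oversample until the minority class matches the 400 survival records.
synthetic = smote(minority, n_synthetic=400 - 70)
balanced_minority = minority + synthetic
print(len(balanced_minority))  # 400
```

Because every synthetic point lies on a segment between two existing minority records, the oversampled class stays within the region the original minority data already occupies. A classifier such as ELM would then be trained on the balanced set.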
