ARTICLE
TITLE

Improving Cardiovascular Disease Prediction by Integrating Imputation, Imbalance Resampling, and Feature Selection Techniques into Machine Learning Model

SUMMARY

Cardiovascular disease (CVD) is the leading cause of death worldwide, and primary prevention depends on early prediction of disease onset. Using laboratory data from the National Health and Nutrition Examination Survey (NHANES) for the 2017-2020 timeframe (N = 7,974), we tested the ability of machine learning (ML) algorithms to classify individuals at risk. The ML models were evaluated on their classification performance after comparing four imputation, three imbalance resampling, and three feature selection techniques. We used a decision tree (DT) as the baseline because of its popularity. Integrating multiple imputation by chained equations (MICE) and synthetic minority oversampling with Tomek link down-sampling (SMOTETomek) into the model improved the area under the receiver operating characteristic curve (AUC-ROC) from 57% to 83%. Applying simultaneous perturbation feature selection and ranking (spFSR) reduced the number of predictors from 144 to 30 and the computational time by 22%. The best techniques were then applied to six ML models, with eXtreme gradient boosting (XGBoost) achieving the highest accuracy of 93% and an AUC-ROC of 89%. The accuracy of our ML model in predicting CVD exceeds that reported in previous studies. We also highlight the important causes of CVD, which might be investigated further for potential effects on electronic health records.
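To make the resampling step concrete, the following is a minimal pure-Python sketch of the idea behind SMOTETomek: oversample the minority class by interpolating between nearby minority samples, then remove Tomek links (pairs of opposite-class samples that are each other's nearest neighbor). It is an illustration under stated assumptions, not the study's implementation. The function names (`smote`, `remove_tomek_links`, `smote_tomek`), the toy two-feature data, the choice of `k=3`, and the decision to drop both members of each link are all assumptions made for this example.

```python
import math
import random


def nearest_neighbors(point, candidates, k):
    """Return up to k candidates closest to `point` (Euclidean), excluding `point` itself."""
    others = [c for c in candidates if c is not point]
    return sorted(others, key=lambda c: math.dist(point, c))[:k]


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples, each on the segment between a
    minority sample and one of its k nearest minority neighbors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbor = rng.choice(nearest_neighbors(base, minority, k))
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


def remove_tomek_links(X, y):
    """Drop both members of every Tomek link (assumed policy for this sketch)."""
    def nearest_index(i):
        return min((j for j in range(len(X)) if j != i),
                   key=lambda j: math.dist(X[i], X[j]))

    nn = [nearest_index(i) for i in range(len(X))]
    # A Tomek link: mutual nearest neighbors with different labels.
    drop = {i for i in range(len(X)) if nn[nn[i]] == i and y[i] != y[nn[i]]}
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_tomek(X, y, minority_label=1, k=3, seed=0):
    """Oversample the minority class to parity, then clean Tomek links once."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_new = max(len(X) - 2 * len(minority), 0)  # majority count minus minority count
    new_points = smote(minority, n_new, k, seed)
    X_all = list(X) + new_points
    y_all = list(y) + [minority_label] * len(new_points)
    return remove_tomek_links(X_all, y_all)


# Toy data (hypothetical): label 0 = no CVD (majority), label 1 = CVD (minority).
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (2.0, 2.0),
     (2.9, 3.0), (3.0, 3.0), (3.0, 4.0), (4.0, 3.0)]
y = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

X_res, y_res = smote_tomek(X, y)
print("class counts before:", y.count(0), y.count(1))
print("class counts after: ", y_res.count(0), y_res.count(1))
```

In practice, a library implementation would normally replace this sketch, and the resampling should be fitted on the training split only so that synthetic samples do not leak into the evaluation data.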
