ARTICLE
TITLE

The Effect of Class Imbalance Handling on Datasets Toward Classification Algorithm Performance

SUMMARY

Class imbalance is a condition in which the minority class contains fewer samples than the majority class. Its main effect is the misclassification of minority-class samples, which degrades classification performance. Approaches to the problem include data-level methods, algorithm-level methods, and cost-sensitive learning. At the data level, one commonly used method is sampling. This study combined the ADASYN, SMOTE, and SMOTE-ENN sampling methods with the AdaBoost, K-Nearest Neighbor, and Random Forest classification algorithms to determine how handling class imbalance in a dataset affects classification performance. Tests were carried out on five datasets, and the models were evaluated on accuracy, precision, true positive rate, true negative rate, and g-mean score. The combination of ADASYN and Random Forest gave the best results, scoring 5% to 10% better than the other models.
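
The five evaluation criteria can be illustrated with a minimal sketch that computes them from binary confusion-matrix counts. The summary does not define the g-mean, so the sketch assumes the common definition, the square root of the product of the true positive rate and the true negative rate. The function name `imbalance_metrics` and the counts are hypothetical and are not taken from the study.

```python
import math


def imbalance_metrics(tp, fn, fp, tn):
    """Compute five evaluation criteria from binary confusion-matrix counts.

    The minority class is treated as the positive class.
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators so degenerate classifiers do not crash.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate (recall)
    tnr = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate (specificity)
    # Assumed definition: geometric mean of TPR and TNR.
    g_mean = math.sqrt(tpr * tnr)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "tpr": tpr,
        "tnr": tnr,
        "g_mean": g_mean,
    }


# Hypothetical imbalanced test set: 10 minority and 90 majority samples.
# The classifier finds only 2 of the 10 minority samples.
metrics = imbalance_metrics(tp=2, fn=8, fp=0, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these hypothetical counts, accuracy is high because the majority class dominates, while the true positive rate and the g-mean are low. This is why accuracy alone can hide minority-class misclassification on imbalanced data.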
