ARTICLE
TITLE

Application of Naïve Bayes Algorithm Variations On Indonesian General Analysis Dataset for Sentiment Analysis

SUMMARY

The Indonesian General Analysis Dataset is a dataset collected from Twitter using conjunctions as search keywords, so that the resulting data is not confined to a particular topic. An Indonesian-language dataset with general topics can be used to test the accuracy of classification models, providing an additional reference for choosing suitable methods and parameters for sentiment analysis. One of the algorithms that achieves the highest accuracy in several studies is naive Bayes, which has several variations. This study aims to identify the naive Bayes variation with the best accuracy by tuning the minimum and maximum document frequency parameters on the Indonesian General Analysis Dataset for sentiment analysis. The naive Bayes classifier variations used are Bernoulli naive Bayes, Gaussian naive Bayes, complement naive Bayes, and multinomial naive Bayes. The research begins with downloading the dataset. The next stage is preprocessing, which consists of tokenizing, stemming, expanding abbreviations, and removing conjunctions. Feature extraction is then carried out on the preprocessed data by converting the dataset into vectors and applying the TF-IDF method before the sentiment classification stage. The tests in this study apply the minimum document frequency (min-df) and maximum document frequency (max-df) to each variation of naive Bayes to obtain the appropriate parameters. The tests use k-fold cross-validation to split the dataset into training and test data. A confusion matrix is then constructed to evaluate the level of accuracy.
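The document-frequency filtering and TF-IDF weighting described above can be illustrated with a minimal, standard-library-only sketch. It is not the study's implementation: the toy tokenized documents, the function names `build_vocabulary` and `tfidf`, the threshold conventions (min-df as an absolute document count, max-df as a fraction of the corpus), and the IDF variant ln(N/df) + 1 are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical, already-preprocessed (tokenized) toy documents.
docs = [
    ["film", "ini", "bagus"],
    ["film", "ini", "buruk"],
    ["pelayanan", "bagus", "sekali"],
    ["film", "itu", "buruk", "sekali"],
]

def build_vocabulary(docs, min_df=1, max_df=1.0):
    """Keep terms whose document frequency falls inside the given bounds.

    Assumed conventions: min_df is an absolute number of documents,
    max_df is a fraction of the corpus size.
    """
    n_docs = len(docs)
    # Count each term once per document it appears in.
    df = Counter(term for doc in docs for term in set(doc))
    vocab = sorted(
        term for term, count in df.items()
        if count >= min_df and count / n_docs <= max_df
    )
    return vocab, df

def tfidf(doc, vocab, df, n_docs):
    """Weight raw term counts by an assumed IDF variant: ln(N / df) + 1."""
    counts = Counter(doc)
    return [counts[t] * (math.log(n_docs / df[t]) + 1.0) for t in vocab]

# Drop terms seen in fewer than 2 documents or in more than half of them.
vocab, df = build_vocabulary(docs, min_df=2, max_df=0.5)
vector = tfidf(docs[0], vocab, df, len(docs))

print(vocab)
print(vector)
```

In this toy corpus, "film" appears in three of the four documents and is removed by max-df, while "pelayanan" and "itu" appear only once and are removed by min-df. Raising min-df or lowering max-df shrinks the feature space that each naive Bayes variation is trained on, which is why the two thresholds are treated as tunable parameters.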
