ARTICLE
TITLE

Sentiment Analysis and Topic Modelling of The COVID-19 Vaccine in Indonesia on Twitter Social Media Using Word Embedding

SUMMARY

This study analyzes the sentiment of the Indonesian public towards the COVID-19 vaccine on Twitter. Data were collected from September 2020 to June 2021 using the keyword "covid vaccine," yielding 262,306 tweets. After filtering and cleaning, 83,384 tweets remained. Labeling was done manually by an expert. The labeled data comprise 35,209 positive, 41,596 neutral, and 6,579 negative tweets. The remaining data were preprocessed using case folding, punctuation removal, stopword removal, stemming, and slang-word handling. The number of tweets peaked in January 2021, at 23,492 tweets, after Joko Widodo became the first person in Indonesia to receive a vaccine injection. At the topic modeling stage, candidate topic counts were evaluated using the coherence score, and the optimal number of topics was three. The first topic, with a token percentage of 51.8%, leans towards positive sentiment, while the second and third topics, with token percentages of 24.5% and 23.7%, lean towards neutral sentiment. A bidirectional LSTM architecture was used for sentiment classification. Both fastText and GloVe word embeddings were tested to vectorize the tweets. Test accuracy reached 75.7690% with fastText embeddings and 74.7017% with GloVe embeddings. Using slang words did not increase test accuracy in this study. Using ModelCheckpoint to monitor model performance during training produced a model with slightly higher test accuracy, by about 1.07% (in scenario 1 and scenario 6), than a model monitored using early stopping. Future research could try a lower learning rate to obtain better accuracy over a large number of epochs, or change the dropout parameter.
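
The preprocessing steps listed in the summary can be illustrated with a minimal sketch. It is not the study's actual pipeline: the stopword set, the slang dictionary, and the function name `preprocess` are hypothetical placeholders, slang handling is assumed to mean mapping informal words to standard forms, and stemming is omitted because it would need an Indonesian stemmer that the summary does not name.

```python
import string

# Hypothetical mini-lexicons for illustration only; the study's real
# stopword list and slang dictionary are not given in the summary.
STOPWORDS = {"yang", "dan", "di", "ke", "itu"}
SLANG = {"gak": "tidak", "udah": "sudah", "bgt": "banget"}

# Map every ASCII punctuation character to a space.
_PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(tweet, normalize_slang=True):
    """Return a list of cleaned tokens for one tweet."""
    text = tweet.lower()                  # case folding
    text = text.translate(_PUNCT_TABLE)   # punctuation removal
    tokens = text.split()
    if normalize_slang:
        # Assumed meaning of slang handling: replace slang with standard forms.
        tokens = [SLANG.get(tok, tok) for tok in tokens]
    # Stopword removal (stemming would follow here in a fuller pipeline).
    return [tok for tok in tokens if tok not in STOPWORDS]


if __name__ == "__main__":
    print(preprocess("Vaksin COVID itu gak sakit, udah dicoba!"))
```

The `normalize_slang` flag mirrors the comparison reported in the summary, where runs with and without slang handling were evaluated against each other.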
