ARTICLE
TITLE

Multidisciplinary classification for Indonesian scientific articles abstract using pre-trained BERT model

SUMMARY

Scientific articles increasingly have multidisciplinary content, which makes it difficult for researchers to find relevant information. Some submissions are also irrelevant to the discipline of the journal they are sent to. Categorizing articles and assessing their relevance can help both researchers and journals. Existing research still focuses on predicting a single category. This research therefore takes a new approach: a multidisciplinary classification of Indonesian scientific article abstracts using a pre-trained BERT model, which shows the relevance of each category to an abstract. The dataset consists of 9,000 abstracts across 9 disciplinary categories, and text preprocessing is applied to it. The classification model combines the pre-trained BERT model with an Artificial Neural Network. Hyperparameter tuning is performed to find the best combination of batch size, learning rate, number of epochs, and data ratio. The best combination is a learning rate of 1e-5, a batch size of 32, 3 epochs, and a data ratio of 9:1, giving a validation accuracy of 90.8%. The model's confusion matrix results are compared with confusion matrix results produced by experts; in this comparison, the highest accuracy obtained by the model is 99.56%. A software prototype uses the most accurate model to classify new data, displaying the top two prediction probabilities and the dominant category. The resulting model can be used for Indonesian text classification problems.
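
As a rough illustration of the prototype's final step (reporting the top two prediction probabilities and the dominant category), the following minimal sketch turns one abstract's raw classifier scores into probabilities and ranks them. It is not the authors' implementation: the category names, the example scores, and the helper names `softmax` and `top_two` are all hypothetical, and the BERT and neural network layers that would produce the scores are omitted.

```python
import math

# Hypothetical placeholder labels: the summary says there are 9 disciplinary
# categories but does not name them.
CATEGORIES = [f"category_{i}" for i in range(1, 10)]


def softmax(logits):
    """Convert raw classifier scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_two(logits, labels=CATEGORIES):
    """Return the two most probable (label, probability) pairs, best first."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per category")
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:2]


# Made-up scores for a single abstract, for illustration only.
example_logits = [0.2, 2.9, 0.1, 2.1, -0.5, 0.0, 0.3, -1.2, 0.4]
first, second = top_two(example_logits)
print(f"dominant: {first[0]} ({first[1]:.1%}); runner-up: {second[0]} ({second[1]:.1%})")
```

Showing the runner-up next to the dominant category is what lets a single abstract be read as belonging to more than one discipline, instead of being forced into one label.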
