Sentiment Analysis of the COVID-19 Vaccination Issue on Twitter Using the Naïve Bayes Method and TF-IDF Weighting with 1-2 Gram Tokenization

Authors

  • Yashmine Hapsari, Kalimantan Institute of Technology
  • Syamsul Mujahidin, Kalimantan Institute of Technology
  • Nisa Fadhliana, Kalimantan Institute of Technology

DOI:

https://doi.org/10.35718/specta.v7i2.812

Keywords:

COVID-19, Naïve Bayes, Sentiment analysis, Vaccination

Abstract

COVID-19 vaccination has been implemented to reduce the spread of the virus in society, but the status of the vaccine, which was still in the development stage, is one of the factors that made people hesitant to be vaccinated. Therefore, a sentiment analysis was carried out on the issue of COVID-19 vaccination, using processes and parameters that could increase the model's accuracy. In this study, sentiment classification was performed using the Naïve Bayes method on a dataset of 5,000 tweets related to COVID-19 vaccination. Term weighting was applied using the TF-IDF method, and the effect of unigram, bigram, and 1-2 gram tokenization on model accuracy was compared. In one of the experiments, with the Gaussian classifier and a train:test ratio of 7:3, the model accuracy was 67.4% with unigram tokenization, 65.5% with bigram tokenization, and 70% with 1-2 gram tokenization; that is, the model using the combined 1-2 gram tokens achieved higher accuracy than the models using only one type of token. Based on these results, it can be concluded that combining unigram and bigram tokenization can provide added value to the model for classifying data, thereby increasing accuracy in the analysis of public sentiment.
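To make the three tokenization settings concrete, the following is a minimal sketch in plain Python of n-gram extraction followed by TF-IDF weighting. It is not the authors' implementation: the function names `ngrams` and `tfidf`, the two toy documents, and the textbook formula (term frequency times ln(N/df)) are all assumptions chosen for illustration, since the abstract does not state the exact weighting variant or preprocessing used.

```python
import math
from collections import Counter


def ngrams(tokens, n_min, n_max):
    """Return every n-gram of length n_min..n_max as a space-joined string."""
    out = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            out.append(" ".join(tokens[i:i + n]))
    return out


def tfidf(docs, n_min, n_max):
    """Return one {term: weight} dict per document.

    Assumed textbook variant: tf = count / terms in the document,
    idf = ln(N / df). Real libraries often use smoothed variants.
    """
    term_lists = [ngrams(doc.split(), n_min, n_max) for doc in docs]
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for terms in term_lists for term in set(terms))
    weights = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy documents (not taken from the study's dataset).
docs = ["vaccine is safe", "vaccine is not safe"]

unigram_weights = tfidf(docs, 1, 1)   # unigram only
bigram_weights = tfidf(docs, 2, 2)    # bigram only
combined_weights = tfidf(docs, 1, 2)  # 1-2 gram

print(ngrams(docs[0].split(), 1, 2))
print(combined_weights[0])
```

In this toy case every unigram of the first document also occurs in the second, so all of its unigram weights are zero, while the bigram "is safe" receives a nonzero weight. This shows one way that bigrams can add distinguishing features, but it says nothing about the accuracies reported above. In practice, scikit-learn's `TfidfVectorizer` exposes the same choice through its `ngram_range` parameter, for example `(1, 1)`, `(2, 2)`, or `(1, 2)`; its default idf is smoothed, so its weights would differ from this sketch.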

Published

2023-08-31

How to Cite

Hapsari, Y., Mujahidin, S., & Fadhliana, N. (2023). Analisis Sentimen Isu Vaksinasi COVID-19 Pada Twitter Menggunakan Metode Naïve Bayes dan Pembobotan TF-IDF Tokenisasi 1-2. SPECTA Journal of Technology, 7(2), 573–583. https://doi.org/10.35718/specta.v7i2.812