FEATURE EXTRACTION USING THE WORD2VEC MODEL FOR SENTIMENT ANALYSIS OF THE COMMENT FIELD IN STUDENT EVALUATION OF LECTURERS QUESTIONNAIRES

Muhammad Rusli

Abstract


This study addresses sentiment analysis using the Word2vec model. A similar study was conducted by Fauzi (2019), in which the Word2vec model produced an accuracy of 70% because the dataset was small; with little data, Word2vec cannot capture similarity of meaning well. This follow-up study therefore uses lecturer evaluation comment data together with Indonesian-language Wikipedia articles to build the Word2vec model. It compares two feature extraction methods, average-based Word2vec and Bag of Centroids-based Word2vec, and also combines the two, with classification performed by a Support Vector Machine. Average-based Word2vec feature extraction on the lecturer evaluation comments yielded an accuracy of 84.8%. Bag of Centroids-based Word2vec feature extraction with hierarchical clustering yielded a best accuracy of 81.6% with 75 features. Combining the two feature extractions yielded an accuracy of 85.3%.

Keywords: Sentiment Analysis, Word2vec, Feature extraction
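
The two feature extractions named in the abstract, and their combination, can be illustrated with a minimal sketch. It is not the author's implementation: the word vectors, the word-to-cluster mapping, and the function names (`average_features`, `bag_of_centroids`, `combined_features`) are hypothetical, and the cluster assignment is simply given rather than computed by hierarchical clustering. It uses 2 clusters and 2-dimensional vectors to stay readable, where the study reports 75 features for the Bag of Centroids variant.

```python
from collections import Counter

# Toy 2-dimensional word vectors (made-up values). In the study these would
# come from a Word2vec model built on Wikipedia and the comment data.
WORD_VECTORS = {
    "bagus": [0.9, 0.1],
    "baik": [0.8, 0.2],
    "buruk": [-0.9, 0.1],
    "telat": [-0.7, 0.3],
}

# Hypothetical word-to-cluster assignment, standing in for the result of
# hierarchical clustering over the word vectors.
WORD_CLUSTER = {"bagus": 0, "baik": 0, "buruk": 1, "telat": 1}
N_CLUSTERS = 2


def average_features(tokens):
    """Mean of the vectors of all in-vocabulary tokens (zeros if none)."""
    vectors = [WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS]
    dim = len(next(iter(WORD_VECTORS.values())))
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def bag_of_centroids(tokens):
    """Count how many tokens of the comment fall into each word cluster."""
    counts = Counter(WORD_CLUSTER[t] for t in tokens if t in WORD_CLUSTER)
    return [float(counts.get(c, 0)) for c in range(N_CLUSTERS)]


def combined_features(tokens):
    """Concatenate both representations into one feature vector."""
    return average_features(tokens) + bag_of_centroids(tokens)


# A tokenized comment; out-of-vocabulary words are ignored by both extractors.
comment = ["dosen", "baik", "tapi", "telat"]
features = combined_features(comment)
```

The resulting vector per comment is what a classifier such as a Support Vector Machine would receive as input; the classification step itself is left out of this sketch.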





References


Fauzi, M. Ali, 2019. Word2vec model for Sentiment Analysis of product reviews in Indonesian language, Brawijaya University. Vol. 9, No. 1, pp. 525-530.

Rossiello, Gaetano et al., 2017. Centroid-based Text Summarization through Compositionality of Word Embeddings, University of Bari. Proceedings of the MultiLing 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres, pp. 12–21.




DOI: http://dx.doi.org/10.20527/klik.v7i1.296

Copyright (c) 2020 KLIK - KUMPULAN JURNAL ILMU KOMPUTER

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
