ASPECT EXTRACTION IN E-COMMERCE USING LATENT DIRICHLET ALLOCATION (LDA) WITH TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)

Satyawan Agung Nugroho; Fitra A Bachtiar; Randy Cahya Wihandika

doi:10.21107/kursor.v11i2.247

Authors

Satyawan Agung Nugroho Universitas Brawijaya, Indonesia
Fitra A Bachtiar
Randy Cahya Wihandika

DOI:

https://doi.org/10.21107/kursor.v11i2.247

Keywords:

Aspect Extraction, Latent Dirichlet Allocation, Perplexity, Term Frequency - Inverse Document Frequency, Topic Modelling

Abstract

Social media is widely used. Posts and comments on social media express people's feelings and opinions, so they contain important topics that can be extracted. In e-commerce, topics are of interest because they can describe people's opinions of a product. However, the large number of social media users makes finding topics by hand difficult, so computer assistance is needed. One method that can be used is Latent Dirichlet Allocation (LDA). LDA is a good method for extracting topics, but its drawback is that the resulting topics are sometimes incomprehensible. To address this drawback, TF-IDF feature selection is used to skip less important words so that LDA can generate better topics. The best hyperparameter values obtained were 10 iterations, 10 topics, and α and β values of 0.1 and 0.01, respectively. The best feature selection percentile is 90. This percentile is used to find the threshold that serves as the lower limit on each word's TF-IDF value, so that words with greater TF-IDF values are used as features.
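The percentile-based feature selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy reviews, the function names, the choice of taking each word's maximum TF-IDF weight across documents, the unsmoothed IDF formula, the linear-interpolation percentile, and the use of `>=` at the threshold are all assumptions made for illustration. Only the idea of using the 90th percentile of TF-IDF values as a lower limit comes from the abstract.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Return one TF-IDF score per word: its maximum weight over all documents.

    Assumption: the abstract does not say how per-document weights are
    combined into one value per word; taking the maximum is one option.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for doc in docs for word in set(doc))
    best = {}
    for doc in docs:
        for word, count in Counter(doc).items():
            # Relative term frequency times unsmoothed inverse document frequency.
            score = (count / len(doc)) * math.log(n_docs / df[word])
            best[word] = max(best.get(word, 0.0), score)
    return best


def percentile(values, q):
    """Percentile q (0 to 100) of a non-empty list, by linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def select_features(docs, q=90):
    """Keep words whose TF-IDF score reaches the q-th percentile threshold."""
    scores = tfidf_scores(docs)
    threshold = percentile(list(scores.values()), q)
    kept = {word for word, score in scores.items() if score >= threshold}
    return kept, threshold


# Hypothetical, already tokenized product reviews.
docs = [
    ["fast", "delivery", "good", "seller"],
    ["good", "price", "good", "quality"],
    ["slow", "delivery", "bad", "packaging"],
]

kept, threshold = select_features(docs, q=90)
print(sorted(kept), threshold)
```

The retained vocabulary would then be the input to LDA. On this toy corpus, words shared by several reviews, such as "good" and "delivery", score below the threshold and are dropped, while words unique to a single review are kept.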


