Performance comparison of cyberbullying detection using transformer, deep learning, and machine learning

Fuad Muftie; Kamal Muftie Yafi; Qinthara Muftie Addina

doi:10.31571/saintek.v13i1.4002

Authors

Fuad Muftie, Universitas Nusa Mandiri
Kamal Muftie Yafi, Universitas Indonesia
Qinthara Muftie Addina, Universitas Pendidikan Indonesia

DOI:

https://doi.org/10.31571/saintek.v13i1.4002

Keywords:

Transformers, Sentiment Analysis, Natural Language Processing, Deep Learning

Abstract

The rise in browsing activity, particularly on social media sites, has made cyberbullying more likely to occur. Many studies have addressed cyberbullying detection using both machine learning and deep learning methods. This study evaluates how well the Transformer algorithm classifies text data as cyberbullying or not. The Transformer's performance is then compared with other deep learning methods (RNN, LSTM, and GRU) and with machine learning methods (Naïve Bayes, Logistic Regression, SVM, and Decision Tree). The best deep learning result was obtained on the YouTube dataset with the Transformer model, which reached 98.49% accuracy. The best machine learning result was obtained on the YouTube dataset with the SVM model using TF-IDF features, which reached 97.82% accuracy.
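To make the classification-and-accuracy setup described in the abstract concrete, the following is a minimal sketch of a binary text classifier evaluated by accuracy. It is not the authors' implementation: it uses a multinomial Naïve Bayes model (one of the baselines named in the abstract) on raw word counts rather than the SVM with TF-IDF features, it relies only on the Python standard library, and the toy comments, labels, and function names are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train_naive_bayes(texts, labels):
    """Collect per-class document and word counts for multinomial Naive Bayes."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior under the model."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, texts, labels):
    """Fraction of texts whose predicted label matches the true label."""
    correct = sum(predict(model, t) == y for t, y in zip(texts, labels))
    return correct / len(labels)


# Hypothetical toy data: 1 = cyberbullying, 0 = not cyberbullying.
train_texts = [
    "you are so stupid and ugly",
    "nobody likes you loser",
    "shut up you idiot",
    "great video thanks for sharing",
    "i love this song so much",
    "thanks for the helpful tutorial",
]
train_labels = [1, 1, 1, 0, 0, 0]
test_texts = ["you are an idiot", "thanks for the great video"]
test_labels = [1, 0]

model = train_naive_bayes(train_texts, train_labels)
test_accuracy = accuracy(model, test_texts, test_labels)
print(f"accuracy: {test_accuracy:.2%}")
```

Accuracy on a handful of invented sentences says nothing about real performance; the figures reported in the abstract come from the authors' own datasets and models, which this sketch does not reproduce.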

Author Biographies

Kamal Muftie Yafi, Universitas Indonesia

Fakultas Matematika dan Ilmu Pengetahuan Alam

Qinthara Muftie Addina, Universitas Pendidikan Indonesia

Fakultas Pendidikan Bahasa dan Sastra Indonesia

Published

2024-06-30

How to Cite

Muftie, F., Yafi, K. M., & Addina, Q. M. (2024). Perbandingan performa deteksi cyberbullying dengan transformer, deep learning, dan machine learning. Jurnal Pendidikan Informatika Dan Sains, 13(1), 75–87. https://doi.org/10.31571/saintek.v13i1.4002

Issue

Vol. 13 No. 1 (2024): Jurnal Pendidikan Informatika dan Sains

Section

Articles

License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

License and Copyright Agreement

In submitting the manuscript to the journal, the authors certify that:

They are authorized by their co-authors to enter into these arrangements.
The work described has not been formally published before, except in the form of an abstract or as part of a published lecture, review, thesis, or overlay journal. Please also carefully read the Jurnal Pendidikan Informatika dan Sains Posting Your Article Policy at http://journal.ikippgriptk.ac.id/index.php/saintek/about/submissions#onlineSubmissions
It is not under consideration for publication elsewhere.
Its publication has been approved by all the author(s) and by the responsible authorities, tacitly or explicitly, of the institutes where the work has been carried out.
They secure the right to reproduce any material that has already been published or copyrighted elsewhere.
They agree to the following license and copyright agreement.

Copyright

Authors who publish with Jurnal Pendidikan Informatika dan Sains agree to the following terms:

Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike License (CC BY-SA 4.0) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
