PERFORMANCE EVALUATION OF MULTILABEL EMOTION CLASSIFICATION USING DATA AUGMENTATION TECHNIQUES | Malaysian Journal of Computer Science

Published: Apr 30, 2024

DOI: https://doi.org/10.22452/mjcs.vol37no2.4

Keywords:

Text classification; Deep learning; Class imbalance; NLP; Data augmentation; ChatGPT.

Zahra Ahanin

Department of Information Systems, Faculty of Computer Science and Information Technology, University of Malaya

Maizatul Akmar Ismail

Department of Information Systems, Faculty of Computer Science and Information Technology, Universiti Malaya

Tutut Herawan

Department of Information Systems, Faculty of Computer Science and Information Technology, Universiti Malaya

Abstract

One of the challenges of emotion classification is the scarcity of annotated datasets, which makes the task more complex. Existing datasets also often suffer from class imbalance across the emotion classes. Several data augmentation approaches can help to overcome the challenges posed by imbalanced datasets. However, existing data augmentation techniques for emotion classification give little consideration to the contextual nuances of emotions, and this area remains relatively underexplored. In this work, we study the impact of data augmentation on the classification performance of three machine learning models (Logistic Regression, BiLSTM and BERT) and compare frequently used methods for addressing the issue. Specifically, we assessed Easy Data Augmentation (EDA) and contextual embedding-based data augmentation (BERT) on two datasets. Based on the experimental results, we combined two BERT-based augmentation techniques, insert and substitute, to generate data for minority emotion classes. Furthermore, we proposed a data augmentation method using ChatGPT. Compared to the baseline models, combining the BERT augmentation techniques with the BERT model resulted in improvements of +4.34% and +5.56% in Macro F1 score on the SemEval-2018 and GoEmotions datasets, respectively. Moreover, the proposed ChatGPT-based augmentation technique yielded improvements of +3.55% and +4.83% on the same datasets.
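To make the idea of augmenting only minority emotion classes concrete, the following is a minimal sketch and not the authors' implementation. It uses two of the four EDA operations (random swap and random deletion); synonym replacement and random insertion are omitted because they need a thesaurus, and the BERT-based and ChatGPT-based methods are not shown. The function names, the toy examples, the choice of "trust" as the minority label, and the parameter values are all assumptions made for illustration.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Swap two randomly chosen positions, n_swaps times (EDA random swap)."""
    tokens = list(tokens)  # work on a copy so the caller's list is untouched
    if len(tokens) < 2:
        return tokens
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def random_deletion(tokens, p, rng):
    """Drop each token with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def augment_minority(dataset, minority_labels, n_aug=2, seed=0):
    """Return the dataset plus n_aug perturbed copies of every example
    that carries at least one minority label (multilabel setting)."""
    rng = random.Random(seed)
    minority = set(minority_labels)
    augmented = list(dataset)
    for text, labels in dataset:
        if not minority & set(labels):
            continue  # majority-only examples are left as they are
        tokens = text.split()
        for k in range(n_aug):
            # Alternate between the two operations (an arbitrary choice).
            if k % 2 == 0:
                new_tokens = random_swap(tokens, 1, rng)
            else:
                new_tokens = random_deletion(tokens, 0.1, rng)
            augmented.append((" ".join(new_tokens), list(labels)))
    return augmented


# Hypothetical toy data; the labels are only illustrative.
toy = [
    ("i am so happy today", ["joy"]),
    ("i cannot trust anyone anymore", ["sadness", "trust"]),
]
result = augment_minority(toy, minority_labels={"trust"})
for text, labels in result:
    print(labels, "|", text)
```

Because the copies inherit every label of the source example, oversampling a minority class in a multilabel dataset also adds instances of any co-occurring labels ("sadness" in the toy data), which is worth checking when measuring the resulting class balance.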

How to Cite

Ahanin, Z., Ismail, M. A., & Herawan, T. (2024). PERFORMANCE EVALUATION OF MULTILABEL EMOTION CLASSIFICATION USING DATA AUGMENTATION TECHNIQUES. Malaysian Journal of Computer Science, 37(2), 154–168. https://doi.org/10.22452/mjcs.vol37no2.4

Issue

Vol. 37 No. 2 (2024): Malaysian Journal of Computer Science

Section

Articles

Creative Commons License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
