RESERVOIR COMPUTING WITH TRUNCATED NORMAL DISTRIBUTION FOR SPEECH EMOTION RECOGNITION

Hemin Ibrahim; Chu Kiong Loo

doi:10.22452/mjcs.vol35no2.3

Published: Apr 29, 2022

DOI: https://doi.org/10.22452/mjcs.vol35no2.3

Keywords:

Reservoir computing; truncated normal distribution; population-based training; speech emotion recognition; recurrent neural network

Hemin Ibrahim

Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, Universiti Malaya, Kuala Lumpur, Malaysia

Chu Kiong Loo

Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, Universiti Malaya, Kuala Lumpur, Malaysia

Abstract

Speech is an effective, quick, and important way of communicating and exchanging complex information between humans. Emotions have always been part of normal human conversation, which makes speech more attractive. Because both speech and emotion play this major role, many researchers have been drawn to Speech Emotion Recognition (SER), which still poses plenty of challenges. In this study, we propose a novel reservoir computing approach in which the random input connection weights are initialized from a truncated normal distribution. Furthermore, Population-Based Training (PBT) is adopted to optimize the hyperparameters of the whole Echo State Network (ESN) model, which have a significant impact on model performance. The proposed model adopts bidirectional reservoir input to increase memorization capability, and Sparse Random Projection (SRP) is applied for dimensionality reduction as a simple, unsupervised, low-complexity approach. A speaker-independent strategy was employed on the EMODB and SAVEE datasets as acted speech emotion datasets and on Aibo as a non-acted dataset. The model achieved unweighted average recalls of 84.8%, 65.95%, and 45.99% on the EMODB, SAVEE, and Aibo datasets, respectively. The results show that the proposed model outperforms recent state-of-the-art studies at a lower computational cost.
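The pipeline named in the abstract (truncated-normal input weights, a bidirectional reservoir, and sparse random projection) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The leaky-integrator update, the truncation bound of two standard deviations, the reservoir size, the spectral radius, the projection density, and all function names are assumptions chosen for illustration. The PBT hyperparameter search and the readout classifier are omitted.

```python
import numpy as np

def truncated_normal(rng, shape, mean=0.0, std=1.0, bound=2.0):
    """Draw from N(mean, std^2) restricted to mean +/- bound*std.
    Out-of-range values are redrawn (simple rejection sampling).
    The bound of 2 standard deviations is an assumption."""
    out = rng.normal(mean, std, size=shape)
    mask = np.abs(out - mean) > bound * std
    while mask.any():
        out[mask] = rng.normal(mean, std, size=int(mask.sum()))
        mask = np.abs(out - mean) > bound * std
    return out

def make_reservoir(rng, n_in, n_res, spectral_radius=0.9, input_std=0.5):
    """Input weights from a truncated normal; recurrent weights are
    uniform and rescaled to the requested spectral radius (assumed setup)."""
    w_in = truncated_normal(rng, (n_res, n_in), std=input_std)
    w = rng.uniform(-1.0, 1.0, size=(n_res, n_res))
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoir(w_in, w, x, leak=0.3):
    """Leaky-integrator ESN update over a sequence x of shape (T, n_in)."""
    state = np.zeros(w.shape[0])
    states = np.empty((x.shape[0], w.shape[0]))
    for t, u in enumerate(x):
        state = (1.0 - leak) * state + leak * np.tanh(w_in @ u + w @ state)
        states[t] = state
    return states

def bidirectional_states(w_in, w, x, leak=0.3):
    """Run the same reservoir forward and on the time-reversed input,
    re-align the backward pass, and concatenate per time step."""
    fwd = run_reservoir(w_in, w, x, leak)
    bwd = run_reservoir(w_in, w, x[::-1], leak)[::-1]
    return np.concatenate([fwd, bwd], axis=1)

def sparse_random_projection(rng, n_features, n_components, density=None):
    """Very sparse random projection matrix: each entry is
    +/- sqrt(1 / (density * n_components)) with probability density/2
    each, and 0 otherwise. Default density is 1/sqrt(n_features)."""
    if density is None:
        density = 1.0 / np.sqrt(n_features)
    scale = np.sqrt(1.0 / (density * n_components))
    signs = rng.choice([-1.0, 0.0, 1.0], size=(n_features, n_components),
                       p=[density / 2, 1.0 - density, density / 2])
    return scale * signs

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 13))            # hypothetical: 50 frames, 13 features
w_in, w = make_reservoir(rng, n_in=13, n_res=100)
states = bidirectional_states(w_in, w, x)            # shape (50, 200)
proj = sparse_random_projection(rng, states.shape[1], 20)
reduced = states @ proj                               # shape (50, 20)
```

In this sketch the reduced states would then feed a trained readout, for example a ridge classifier. Truncating the normal distribution keeps every input weight within a fixed range, which prevents rare large weights from saturating the `tanh` units.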

How to Cite

Ibrahim, H., & Chu Kiong Loo. (2022). RESERVOIR COMPUTING WITH TRUNCATED NORMAL DISTRIBUTION FOR SPEECH EMOTION RECOGNITION. Malaysian Journal of Computer Science, 35(2), 128–141. https://doi.org/10.22452/mjcs.vol35no2.3

Issue

Vol. 35 No. 2 (2022): Malaysian Journal of Computer Science

Section

Articles
