Emotion Variation Detection in Discrete English Speech: A Wavelet Transform Use Case in Mental Health Monitoring
Abstract
The growing complexity of modern society exposes individuals to frequent emotional shifts and mental pressure. Emotion detection can help people manage stress and monitor their mental health. Recent work therefore leverages advances in vocal/acoustic signal processing and machine learning to improve emotion detection from speech signals. A key challenge in detecting emotion variation from speech is identifying features that accurately represent the underlying phenomenon. This paper proposes a set of features derived from the energy content and entropy measures of the sub-band signals produced by discrete wavelet transform decomposition. These features aim to characterize various negative emotions, including fear, sadness, anger, anxiety, and disgust, in speech recorded under uncontrolled noise conditions. We use CNN-based architectures to classify the speech signals and detect the emotions they carry. Experiments on publicly available datasets show that the proposed method outperforms state-of-the-art methods that use other time-frequency representations. We achieved an unweighted accuracy (UA) of 83.7 ± 2.5 and a weighted accuracy (WA) of 81.7 ± 5.
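To make the idea of energy and entropy features from wavelet sub-bands concrete, the following is a minimal sketch rather than the method used in the paper. The abstract does not state the wavelet family, the number of decomposition levels, or the exact entropy definition, so the Haar wavelet, four levels, and Shannon entropy of the normalized coefficient energies are all assumptions made for illustration. The function names `haar_dwt_step` and `wavelet_energy_entropy` are hypothetical.

```python
import numpy as np


def haar_dwt_step(x):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:
        # Pad odd-length input by repeating the last sample.
        x = np.append(x, x[-1])
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)
    detail = (even - odd) / np.sqrt(2.0)
    return approx, detail


def wavelet_energy_entropy(signal, levels=4, eps=1e-12):
    """Energy and Shannon entropy per sub-band (assumed feature definitions).

    Returns a vector [E_d1, H_d1, ..., E_dL, H_dL, E_aL, H_aL], where d_k is
    the detail band at level k and a_L is the final approximation band.
    """
    bands = []
    approx = np.asarray(signal, dtype=float)
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        bands.append(detail)
    bands.append(approx)

    features = []
    for coeffs in bands:
        energy = float(np.sum(coeffs ** 2))
        # Normalized energy distribution of the coefficients within this band.
        p = coeffs ** 2 / (energy + eps)
        entropy = float(-np.sum(p * np.log2(p + eps)))
        features.extend([energy, entropy])
    return np.array(features)


if __name__ == "__main__":
    # Synthetic stand-in for a speech frame; real input would be audio samples.
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(1024)
    print(wavelet_energy_entropy(frame, levels=4).shape)
```

With four levels the sketch yields ten values per frame (five sub-bands, two features each). In a pipeline like the one described, such vectors, or maps built from them over successive frames, would be the input to the classifier; how the paper arranges them for its CNN is not specified here.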