Speech Emotion Recognition using Convolutional Neural Networks
Keywords: Spectrograms, CNN, emotion, classification, preprocessing, deep learning
This work presents a set of techniques for speech emotion recognition using spectrograms and a handcrafted deep learning architecture, the Convolutional Neural Network (CNN). Detecting emotional state is a crucial part of human-computer interaction research. Many milestones have been reached in speech emotion recognition in the effort to make interaction between humans and machines more natural, but current results still leave room for improvement. To that end, this study presents a three-layer-deep, two-dimensional Convolutional Neural Network for the challenging task of detecting emotion from spectrograms produced from audio (speech) signals. We train and test our model on eight emotions: Neutral, Calm, Happy, Sad, Angry, Fearful, Disgust, and Surprised. The mean value of the network's outputs yields the classification of the speech sample. Our proposed model achieves, on average, a weighted accuracy of 72%.
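As an illustration of the output-averaging step, the sketch below shows one possible reading of it: the network is assumed to emit one score vector per spectrogram segment of an utterance, the vectors are averaged class by class, and the class with the highest mean is reported. The segmentation scheme, the function names (`mean_scores`, `classify_utterance`), and the example scores are hypothetical and are not taken from the study.

```python
from typing import List, Sequence

# The eight emotion classes named in the abstract, in an assumed fixed order.
EMOTIONS = ["Neutral", "Calm", "Happy", "Sad",
            "Angry", "Fearful", "Disgust", "Surprised"]


def mean_scores(segment_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-segment class scores class by class.

    segment_scores holds one row per spectrogram segment (hypothetical
    network outputs), each row with one score per emotion class.
    """
    if not segment_scores:
        raise ValueError("at least one segment is required")
    n_classes = len(EMOTIONS)
    for row in segment_scores:
        if len(row) != n_classes:
            raise ValueError("each row needs one score per emotion class")
    n_segments = len(segment_scores)
    return [sum(row[k] for row in segment_scores) / n_segments
            for k in range(n_classes)]


def classify_utterance(segment_scores: Sequence[Sequence[float]]) -> str:
    """Return the emotion whose mean score over all segments is highest."""
    avg = mean_scores(segment_scores)
    best = max(range(len(avg)), key=avg.__getitem__)
    return EMOTIONS[best]


if __name__ == "__main__":
    # Made-up scores for three segments of a single utterance.
    example = [
        [0.10, 0.10, 0.50, 0.10, 0.05, 0.05, 0.05, 0.05],
        [0.05, 0.05, 0.20, 0.10, 0.50, 0.05, 0.025, 0.025],
        [0.10, 0.10, 0.40, 0.10, 0.20, 0.05, 0.025, 0.025],
    ]
    print(classify_utterance(example))
```

In this example the second segment on its own favours Angry, but the mean over all three segments favours Happy. Averaging in this way keeps a single atypical segment from deciding the label for the whole utterance.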