Analyzing Speech Signals for Emotion Recognition Using Machine Learning Methods
SOURAV KARMAKAR
Paper Contents
Abstract
Speech emotion recognition is a rapidly advancing area of AI, with applications in historical audio analysis and automated conflict resolution. This study presents machine learning methods for identifying emotions from speech signals using feature extraction and deep learning models. Key audio features, including Mel-frequency cepstral coefficients (MFCCs), spectrograms, and chroma features, are extracted and processed to capture the emotional characteristics of speech. A model based on a convolutional neural network (CNN) is developed to classify emotions into predefined categories, with batch normalization and dropout used to improve performance. The model is evaluated on a benchmark dataset, where it achieves an overall accuracy of 97% and distinguishes emotions effectively. Experimental results, including classification reports and confusion matrix analysis, highlight the strength of the proposed approach. This study contributes to the development of machines that can recognize emotions in real time, which could lead to more effective virtual assistants and mental health analysis systems.
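The abstract reports its evaluation through a confusion matrix and an overall accuracy figure. As a minimal sketch of how those two quantities relate, the following pure-Python example builds a confusion matrix from true and predicted labels and derives accuracy and per-class recall from it. The function names, the three emotion labels, and the toy predictions are assumptions made up for illustration; they are not the study's code, label set, or results.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count predictions; rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def overall_accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def per_class_recall(matrix, labels):
    """Recall per class: diagonal entry divided by the row sum (class support)."""
    recall = {}
    for i, label in enumerate(labels):
        support = sum(matrix[i])
        recall[label] = matrix[i][i] / support if support else 0.0
    return recall


# Toy labels, invented for illustration only; not data from the study.
labels = ["angry", "happy", "sad"]
y_true = ["angry", "angry", "happy", "happy", "sad", "sad"]
y_pred = ["angry", "happy", "happy", "happy", "sad", "angry"]

cm = confusion_matrix(y_true, y_pred, labels)
acc = overall_accuracy(cm)
recall = per_class_recall(cm, labels)

print(cm)
print(round(acc, 3))
print(recall)
```

Reading the matrix row by row shows which emotions are confused with which, something a single accuracy number cannot reveal.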
Copyright
Copyright © 2025 SOURAV KARMAKAR. This is an open access article distributed under the Creative Commons Attribution License.