• Abstract

    Human-computer interaction (HCI) has shifted from text- and display-based control toward more intuitive control methods, such as gesture, voice, and imitation. Speech in particular carries a large quantity of information, revealing the speaker's inner state as well as his or her goals and intentions. Language analysis can recover the speaker's request, while additional speech features reveal the speaker's mood, purpose, and intention. As a consequence, emotion identification from speech has become crucial in modern HCI systems. It is also challenging to aggregate the results of the many professionals engaged in emotion identification. Several methods for analyzing sound have been proposed in the past. However, it was not possible to analyze people's emotions during live speech. Studies on real-time data are now more prominent than ever because of advances in artificial intelligence and the strong performance of deep learning techniques. This research uses a cutting-edge deep learning technique to identify emotions in human speech. The research made use of the open-source Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset. For the RAVDESS dataset, 24 performers recorded more than 2000 fragments of speech and song, expressing eight distinct emotions. The dataset was designed to support the classification of various emotions. In this study, a novel neuro-fuzzy swallow swarm-optimized deep convolutional neural network (NFSO-DCNN) approach to classification is proposed. The performance of the proposed model was compared with that of similar research, and the outcomes were assessed. Applying the proposed model to the RAVDESS dataset yielded an overall accuracy of 98.5% for categorizing emotions.

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

How to cite

Bargavi S. K., M., Bhambu, P., & Gupta, M. V. (2023). Spoken emotion recognition through human-computer interaction using a novel deep learning technology. Multidisciplinary Science Journal, 5, 2023ss0108. https://doi.org/10.31893/multiscience.2023ss0108