Scholarly Open Access Journal, Peer-Reviewed and Refereed, Impact Factor 8.14 (Calculated by Google Scholar and Semantic Scholar | AI-Powered Research Tool), Multidisciplinary, Monthly, Indexed in All Major Databases & Metadata, Citation Generator, Digital Object Identifier (DOI)
This paper investigates the integration of text, audio, and visual cues to improve emotion recognition accuracy. Traditional sentiment analysis systems often face uncertainty and ambiguity because they rely on a single modality. This study proposes an approach that combines natural language processing (NLP) for text analysis, speech recognition, and facial expression recognition. The system uses convolutional neural networks (CNNs) for image-based facial analysis, deep learning-based speech processing for voice recognition, and NLP models for text understanding. The fusion model improves classification accuracy by leveraging the complementary insights of the individual modalities. Evaluation on benchmark data shows that the system has the potential to outperform unimodal baselines, with applications in mental health care, customer service, and analytics.
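The fusion idea described in the abstract can be illustrated with a minimal sketch. The stub probability vectors below stand in for the outputs of the three modality models (text, speech, face), which the paper does not specify in detail; the emotion labels, weights, and function names are hypothetical, and this shows only a simple weighted late-fusion step, not the authors' actual architecture.

```python
# Illustrative late fusion: each modality model outputs a probability
# distribution over emotion classes; fusion takes a weighted average.
# In a full system the stub vectors would come from a CNN (face),
# a deep speech model, and an NLP text model.

EMOTIONS = ["happy", "sad", "angry", "neutral"]  # hypothetical label set

def fuse(text_probs, speech_probs, face_probs, weights=(0.4, 0.3, 0.3)):
    """Weighted late fusion of per-modality class probabilities."""
    fused = [
        weights[0] * t + weights[1] * s + weights[2] * f
        for t, s, f in zip(text_probs, speech_probs, face_probs)
    ]
    total = sum(fused)
    fused = [p / total for p in fused]  # renormalize to a distribution
    best = max(range(len(fused)), key=fused.__getitem__)
    return EMOTIONS[best], fused

label, probs = fuse(
    [0.7, 0.1, 0.1, 0.1],   # text model: strongly "happy"
    [0.4, 0.3, 0.2, 0.1],   # speech model: leaning "happy"
    [0.5, 0.1, 0.3, 0.1],   # face model: leaning "happy"
)
print(label)  # the modalities agree, so fusion predicts "happy"
```

Because the modalities are complementary, a confident prediction from one model can compensate for ambiguity in another, which is the motivation for fusion over any single-modality classifier.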
Keywords:
Multimodal Sentiment Analysis, Deep Learning, Emotion Recognition, NLP, Computer Vision
Cite Article:
"Multimodal Sentiment Analysis: A Fusion of Text, Speech, and Facial Expression for Emotion Recognition", International Journal of Science & Engineering Development Research (www.ijrti.org), ISSN: 2455-2631, Vol. 10, Issue 4, page no. c226-c232, April 2025, Available: http://www.ijrti.org/papers/IJRTI2504250.pdf
Downloads:
000316
ISSN:
2456-3315 | IMPACT FACTOR: 8.14 Calculated by Google Scholar | ESTD YEAR: 2016