Scholarly open access, peer-reviewed, and refereed journal; impact factor 8.14 (calculated by Google Scholar and Semantic Scholar | AI-Powered Research Tool); multidisciplinary; monthly; indexed in all major databases and metadata services; Citation Generator; Digital Object Identifier (DOI)
Image caption generation is a challenging task that bridges computer vision and natural language processing. It requires understanding the visual content of an image and generating a meaningful textual description of it. This paper presents a deep learning model that combines a Convolutional Neural Network (CNN) for image feature extraction with a Long Short-Term Memory (LSTM) network for sequence generation.
The CNN extracts high-level visual features, while the LSTM decodes these features into coherent natural-language sentences. The system is trained and evaluated on the Flickr8k dataset, and performance is measured using BLEU and METEOR scores. Experimental results indicate that the proposed CNN-LSTM model generates accurate and contextually relevant captions, showing the benefit of integrating vision and language models.
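To make the BLEU evaluation mentioned in the abstract concrete, the following is a minimal sketch of a sentence-level BLEU score: clipped n-gram precision combined with a brevity penalty. It is an illustration only. The abstract does not state the paper's actual evaluation code, n-gram order, or smoothing, so the function names (`ngram_counts`, `modified_precision`, `sentence_bleu`), the default `max_n=4`, the absence of smoothing, and the example captions are all assumptions.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against one or more references."""
    cand_counts = ngram_counts(candidate, n)
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # For each n-gram, the highest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngram_counts(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    # Clip candidate counts so repeated words are not over-rewarded.
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / total


def sentence_bleu(candidate, references, max_n=4):
    """Unsmoothed BLEU with uniform weights over 1..max_n (assumed setup)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # without smoothing, any zero precision gives a zero score
    log_avg = sum(math.log(p) for p in precisions) / max_n
    c = len(candidate)
    # Reference length closest to the candidate length (ties: the shorter one).
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_avg)


if __name__ == "__main__":
    # Hypothetical captions, not taken from Flickr8k or from the paper.
    generated = "a dog runs on grass".split()
    reference = ["a dog runs on the grass".split()]
    print(round(sentence_bleu(generated, reference, max_n=2), 4))
```

In practice, reported BLEU figures are usually corpus-level and often smoothed, so scores from this sketch would not be directly comparable to published numbers.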
Keywords:
Image captioning, CNN, LSTM, Deep Learning, Feature Extraction, Natural Language Processing
Cite Article:
"Image Caption Generation using CNN and LSTM", International Journal for Research Trends and Innovation (www.ijrti.org), ISSN: 2456-3315, Vol. 11, Issue 2, pp. a461-a464, February 2026. Available: http://www.ijrti.org/papers/IJRTI2602063.pdf
Downloads:
000113
ISSN:
2456-3315 | IMPACT FACTOR: 8.14 Calculated By Google Scholar| ESTD YEAR: 2016