IJRTI
International Journal for Research Trends and Innovation
International Peer Reviewed & Refereed Journals, Open Access Journal
ISSN Approved Journal No: 2456-3315 | Impact factor: 8.14 | ESTD Year: 2016
Scholarly open access, peer-reviewed, and refereed journal. Impact factor 8.14 (calculated by Google Scholar and Semantic Scholar | AI-Powered Research Tool). Multidisciplinary, monthly, indexed in all major databases and metadata, with citation generator and Digital Object Identifier (DOI).


Licence

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
Published Paper Details
Paper Title: CLIP-Based Image Caption Generation Using Transformer Decoder
Authors Name: Mr. V. Jeevan Kumar, Surapureddy Pushpa, Anil Kumar Reddy, Korlamadugu Ravi shankar sai
Author Reg. ID: IJRTI_211335
Published Paper Id: IJRTI2604149
Published In: Volume 11 Issue 4, April-2026
DOI:
Abstract: Image captioning, the automatic generation of a textual description of an image, is an essential task at the intersection of computer vision and natural language processing. It has attracted considerable attention because of its use in domains such as assistive technologies, content management, and human-computer interaction. This paper presents an image caption generator that combines deep learning with effective feature extraction to produce precise and coherent captions. The proposed system uses the CLIP model to extract high-quality semantic features from the image and a Transformer decoder to generate the caption. The system is trained on the TextCaps dataset, which consists of images and their corresponding captions, so that the model learns the relationship between visual content and text. Beam search decoding is used to improve the quality of the generated captions by selecting the most appropriate sequence of words. To make the system easy to use and deployable in the real world, a web application is built with Streamlit, allowing users to either upload images or take pictures with their device cameras. Features such as activity detection and text-to-speech further improve the ease of use and accessibility of the system. The proposed method demonstrates that using CLIP for feature extraction and a Transformer for caption generation is effective.
Index Terms: Image Captioning, CLIP, Transformer Decoder, Beam Search, Deep Learning, TextCaps, Streamlit, Action Detection, Text-to-Speech
Keywords: Image Captioning, CLIP, Transformer Decoder, Deep Learning, Computer Vision, Beam Search, Natural Language Processing, TextCaps Dataset
Cite Article: "CLIP-Based Image Caption Generation Using Transformer Decoder", International Journal for Research Trends and Innovation (www.ijrti.org), ISSN: 2456-3315, Vol. 11, Issue 4, page no. b97-b100, April-2026, Available: http://www.ijrti.org/papers/IJRTI2604149.pdf
Downloads: 00041
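
The abstract names beam search as the decoding technique but this page does not show how it is implemented. The following is a minimal, generic sketch of beam search in Python, not the authors' code. The function `beam_search`, the `next_log_probs` callback, and the hand-written `TOY_MODEL` table are all hypothetical; in a real captioner the callback would presumably be a Transformer decoder conditioned on CLIP image features. The sketch also omits length normalization, which practical systems often add.

```python
import math
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[float, List[str]]  # (cumulative log-probability, tokens)


def beam_search(
    next_log_probs: Callable[[List[str]], Dict[str, float]],
    beam_width: int = 3,
    max_len: int = 10,
    bos: str = "<bos>",
    eos: str = "<eos>",
) -> List[str]:
    """Return the best token sequence found, without the bos/eos markers."""
    beams: List[Hypothesis] = [(0.0, [bos])]
    finished: List[Hypothesis] = []

    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates: List[Hypothesis] = []
        for score, seq in beams:
            for token, logp in next_log_probs(seq).items():
                candidates.append((score + logp, seq + [token]))
        if not candidates:
            break

        # Keep only the beam_width highest-scoring expansions.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates[:beam_width]:
            if seq[-1] == eos:
                finished.append((score, seq))  # complete caption
            else:
                beams.append((score, seq))     # still being extended
        if not beams:
            break

    # Prefer completed captions; fall back to partial ones at max_len.
    pool = finished or beams
    _, best_seq = max(pool, key=lambda c: c[0])
    return [t for t in best_seq if t not in (bos, eos)]


# Hypothetical next-token probabilities, keyed by the previous token only.
# This stands in for a trained decoder purely for illustration.
TOY_MODEL: Dict[str, Dict[str, float]] = {
    "<bos>": {"a": 0.6, "the": 0.4},
    "a": {"dog": 0.55, "cat": 0.45},
    "the": {"sign": 0.9, "dog": 0.1},
    "dog": {"<eos>": 1.0},
    "cat": {"<eos>": 1.0},
    "sign": {"<eos>": 1.0},
}


def toy_next_log_probs(seq: List[str]) -> Dict[str, float]:
    """Look up log-probabilities for the token following seq[-1]."""
    return {tok: math.log(p) for tok, p in TOY_MODEL[seq[-1]].items()}


if __name__ == "__main__":
    print(beam_search(toy_next_log_probs, beam_width=1))  # greedy decoding
    print(beam_search(toy_next_log_probs, beam_width=2))
```

With this toy table, a beam width of 1 (greedy decoding) commits to "a" first and ends with "a dog" at probability 0.33, while a beam width of 2 keeps "the" alive long enough to find "the sign" at probability 0.36. This is the sense in which a wider beam can select a better overall word sequence than choosing the single most likely word at each step.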
Publication Details:
Page No: b97-b100
Country: Nandyal, Andhra Pradesh, India
Research Area: Computer Science & Technology 
Publisher : IJ Publication
Published Paper URL : https://www.ijrti.org/viewpaperforall?paper=IJRTI2604149
Published Paper PDF: https://www.ijrti.org/papers/IJRTI2604149
