IJRTI
International Journal for Research Trends and Innovation
International Peer Reviewed & Refereed Journals, Open Access Journal
ISSN Approved Journal No: 2456-3315 | Impact factor: 8.14 | ESTD Year: 2016
Scholarly open access journal, peer-reviewed and refereed, impact factor 8.14 (calculated by Google Scholar and Semantic Scholar | AI-Powered Research Tool), multidisciplinary, monthly, indexed in all major databases & metadata, citation generator, Digital Object Identifier (DOI)

Licence

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
Published Paper Details
Paper Title: Direct Speech to Image Translation using CNN
Authors Name: P. Kamakshi, M. Harini, D. Meghana, Y. Naga Sai Gopi Chand, T. Guna Venkata Prakash
Author Reg. ID: IJRTI_189716
Published Paper Id: IJRTI2404123
Published In: Volume 9 Issue 4, April-2024
DOI:
Abstract: Direct speech to image translation is a challenging task in the realm of artificial intelligence, with applications ranging from aiding the visually impaired to enhancing human-computer interaction. This paper proposes a novel approach utilizing Convolutional Neural Networks (CNNs) to directly translate spoken descriptions into corresponding images. The system first converts speech input into text using automatic speech recognition (ASR), then employs a CNN-based architecture to generate images based on the extracted textual features. The proposed CNN architecture comprises convolutional layers for feature extraction, followed by deconvolutional layers for image reconstruction. To enhance the fidelity of generated images, techniques such as attention mechanisms and adversarial training are integrated into the network. Additionally, transfer learning may be employed to leverage pre-trained CNN models for better generalization and performance.
Keywords: speech recognition, CNN, direct speech translation, image generation
Cite Article: "Direct Speech to Image Translation using CNN", International Journal for Research Trends and Innovation (www.ijrti.org), ISSN:2455-2631, Vol. 9, Issue 4, page no. 887-895, April-2024, Available: http://www.ijrti.org/papers/IJRTI2404123.pdf
Downloads: 000205219
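
The abstract describes convolutional layers for feature extraction followed by deconvolutional layers for image reconstruction, but gives no layer sizes. As a rough illustration only, the sketch below traces how the spatial size of a feature map shrinks through strided convolutions and grows back through transposed convolutions, using the standard output-size formulas (dilation 1). The 64-pixel input, the three-layer depth, and the kernel/stride/padding values are all hypothetical choices made for this example, not details taken from the paper.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution along one spatial axis (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size, kernel, stride, padding, output_padding=0):
    """Output length of a transposed convolution along one spatial axis."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def trace_sizes(size, layers, step):
    """Apply `step` once per layer and return every intermediate size."""
    sizes = [size]
    for kernel, stride, padding in layers:
        size = step(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


# Hypothetical configuration (not from the paper):
# three layers, each with kernel 4, stride 2, padding 1.
LAYERS = [(4, 2, 1)] * 3

# Encoder: each strided convolution halves the spatial size.
encoder_sizes = trace_sizes(64, LAYERS, conv_out)
# Decoder: each transposed convolution doubles it again.
decoder_sizes = trace_sizes(encoder_sizes[-1], LAYERS, deconv_out)

print(encoder_sizes)  # [64, 32, 16, 8]
print(decoder_sizes)  # [8, 16, 32, 64]
```

With these assumed settings the decoder mirrors the encoder exactly, so the reconstructed image has the same spatial size as the input feature map. Other kernel or stride choices may need a non-zero `output_padding` to recover the original size.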
Publication Details:
Published Paper ID: IJRTI2404123
Registration ID: 189716
Published In: Volume 9 Issue 4, April-2024
DOI (Digital Object Identifier):
Page No: 887 - 895
Country: Krishna, Andhra Pradesh, India
Research Area: Information Technology
Publisher: IJ Publication
Published Paper URL: https://www.ijrti.org/viewpaperforall?paper=IJRTI2404123
Published Paper PDF: https://www.ijrti.org/papers/IJRTI2404123