Scholarly open access journal, peer-reviewed and refereed, Impact Factor 8.14 (calculated by Google Scholar and Semantic Scholar | AI-Powered Research Tool), multidisciplinary, monthly, indexed in all major databases & metadata, Citation Generator, Digital Object Identifier (DOI)
Automatic Speech Recognition (ASR) of a target speaker in multi-speaker environments remains a significant challenge. Traditional ASR systems often fail to isolate a specific speaker's voice from overlapping and interfering audio sources. To address this, Target-Speaker ASR (TS-ASR) has emerged as a viable solution by conditioning the recognition process on speaker-specific embeddings. This paper presents a Streaming End-to-End TS-ASR system based on a neural transducer architecture that facilitates low-latency and on-device speech recognition. The proposed model integrates Target-Speaker Activity Detection (TSAD), allowing the system to remain silent when the target speaker is inactive, thereby reducing unnecessary outputs. Experimental evaluations demonstrate that the proposed TS-ASR model achieves superior performance compared to traditional cascade systems, with improvements in word error rate (WER), speaker identification accuracy, and real-time latency. The system is optimized for real-world deployment, offering high accuracy and low computational overhead suitable for mobile and edge applications.
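The two ideas in the abstract, conditioning on a speaker embedding and staying silent when the target speaker is inactive, can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not say how the embedding is injected, so simple per-frame concatenation is assumed here, and the function names, the 0.5 threshold, and the use of `None` for a blank (no-emission) step are all hypothetical choices made for illustration.

```python
from typing import List, Optional


def condition_on_speaker(
    frames: List[List[float]], speaker_embedding: List[float]
) -> List[List[float]]:
    """Append the target-speaker embedding to every acoustic frame.

    Concatenation is an assumed conditioning scheme, not the paper's.
    """
    return [frame + speaker_embedding for frame in frames]


def gate_tokens(
    tokens: List[Optional[str]],
    activity_probs: List[float],
    threshold: float = 0.5,  # hypothetical decision threshold
) -> List[str]:
    """Keep a token only if the target speaker is judged active at that step.

    `tokens[i]` is the token emitted at step i, or None for a blank step.
    `activity_probs[i]` is the assumed TSAD probability for the same step.
    """
    if len(tokens) != len(activity_probs):
        raise ValueError("tokens and activity_probs must have the same length")
    return [
        tok
        for tok, prob in zip(tokens, activity_probs)
        if tok is not None and prob >= threshold
    ]


if __name__ == "__main__":
    conditioned = condition_on_speaker([[0.1, 0.2], [0.3, 0.4]], [1.0])
    print(conditioned)
    print(gate_tokens(["hi", None, "there", "noise"], [0.9, 0.8, 0.7, 0.1]))
```

In a streaming setting both functions would be applied one step at a time as audio arrives; they operate on whole lists here only to keep the example short.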
"Streaming End-to-End Target-Speaker Automatic Speech Recognition and Activity Detection", International Journal for Research Trends and Innovation (www.ijrti.org), ISSN: 2455-2631, Vol. 10, Issue 9, pp. a113-a117, September 2025. Available: http://www.ijrti.org/papers/IJRTI2509013.pdf
ISSN: 2456-3315 | Impact Factor: 8.14 (calculated by Google Scholar) | Established: 2016