Scholarly open access journals, Peer-reviewed, and Refereed Journals, Impact factor 8.14 (Calculate by google scholar and Semantic Scholar | AI-Powered Research Tool) , Multidisciplinary, Monthly, Indexing in all major database & Metadata, Citation Generator, Digital Object Identifier(DOI)
An Automated OCR-Based PAN Card Text Extraction System is proposed to automate and digitize the extraction of
text data from PAN cards issued by the Government of India. Using Optical Character Recognition (OCR) engines such as
PyTesseract, the Google Vision API, and the OCR.Space API, together with Support Vector Machines (SVM), the system improves text extraction
accuracy while maintaining data integrity. Advanced image preprocessing methods, including deskewing, noise reduction, and
adaptive thresholding, improve OCR accuracy, even on poor-quality images. The extracted data are organized as
JSON and can be exported to CSV, which makes integration with banking, government, and automated documentation workflows
straightforward. The system reduces manual data entry, lowers error rates, and speeds up verification, which indicates
its applicability to wider identity document digitization. The study also compares the performance of the different OCR
engines and highlights preprocessing as a necessary step for improving OCR accuracy. The system has applications in identity
authentication, financial transactions, and legal records, and could therefore serve as a key component of future digitization efforts. Future research
will investigate deep learning-based OCR improvements and cross-platform support to further extend its robustness and scalability.
This work adds to the expanding field of document automation with an efficient and scalable solution for identity
document digitization.
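
The step from raw OCR output to structured JSON and CSV records can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, field names, and sample text are hypothetical, and it assumes the OCR engine has already returned plain text. The only domain fact it relies on is the PAN format of five letters, four digits, and one letter; the DD/MM/YYYY date layout is an assumption.

```python
import csv
import io
import json
import re

# A PAN number is five letters, four digits, then one letter (e.g. ABCDE1234F).
PAN_PATTERN = re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")
# Assumption: the date of birth appears as DD/MM/YYYY in the OCR text.
DOB_PATTERN = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")

# Hypothetical field names chosen for this illustration.
FIELDS = ["pan_number", "date_of_birth"]


def parse_pan_text(ocr_text):
    """Extract the PAN number and date of birth from raw OCR text."""
    pan = PAN_PATTERN.search(ocr_text)
    dob = DOB_PATTERN.search(ocr_text)
    return {
        "pan_number": pan.group(0) if pan else None,
        "date_of_birth": dob.group(0) if dob else None,
    }


def to_json(record):
    """Serialize one extracted record as a JSON string."""
    return json.dumps(record, indent=2)


def to_csv(records):
    """Serialize a list of extracted records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up sample standing in for the text an OCR engine might return.
    sample = "INCOME TAX DEPARTMENT\nPermanent Account Number\nABCDE1234F\n01/01/1990"
    record = parse_pan_text(sample)
    print(to_json(record))
    print(to_csv([record]))
```

Fields that the patterns do not find are stored as `None`, so a failed extraction stays visible in the JSON output instead of being silently dropped; in the CSV output the same field appears as an empty cell.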
Keywords:
OCR, PAN Card, Text Extraction, PyTesseract, Google Vision API, OCR.Space API, Image Preprocessing
Cite Article:
"Automated OCR-Based PAN Card Text Extraction System", International Journal for Research Trends and Innovation (www.ijrti.org), ISSN: 2455-2631, Vol. 10, Issue 3, page no. b47-b53, March 2025, Available: http://www.ijrti.org/papers/IJRTI2503107.pdf
ISSN:
2456-3315 | IMPACT FACTOR: 8.14 Calculated By Google Scholar| ESTD YEAR: 2016