Automating Document Workflows with ResNet-50 and Template-Based OCR

Srijan Gyawali; Rupak Neupane; Sarjyant Shrestha; Manish Pyakurel

doi:10.3126/joeis.v4i1.81610

Authors

Srijan Gyawali, Department of Computer and Electronics Engineering, Khwopa College of Engineering
Rupak Neupane, Department of Computer and Electronics Engineering, Khwopa College of Engineering
Sarjyant Shrestha, Department of Computer and Electronics Engineering, Khwopa College of Engineering
Manish Pyakurel, Department of Computer and Electronics Engineering, Khwopa College of Engineering

Keywords:

Optical Character Recognition (OCR), Deep Learning, ResNet-50

Abstract

In the banking sector, handling large volumes of customer documents has traditionally been a manual, time-consuming, and error-prone process because of unstructured storage and the absence of automated extraction mechanisms. To address these inefficiencies, we developed an AI-powered document classification and information extraction system that automates the entire workflow using deep learning and image processing techniques. The system is built around a four-stage pipeline: document classification, alignment with predefined templates, text extraction using Optical Character Recognition (OCR), and structured information storage. A fine-tuned ResNet-50 model serves as the backbone of the classification module, categorizing scanned or photographed documents into predefined types. Once classified, each document is aligned with its corresponding template so that key fields appear in consistent positions, which improves the reliability of text extraction. OCR tools then extract the textual content from the aligned document, and the extracted values, such as names, document numbers, and dates of birth, are mapped to structured fields. The resulting structured data is stored securely in a database, enabling efficient querying and downstream use. Although tailored to banking operations, the system’s architecture can be adapted to other sectors such as healthcare, insurance, and government services, where semi-structured documents are prevalent.
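
The last two stages of the pipeline (mapping OCR output to template fields and storing the result) can be illustrated with a minimal sketch. It assumes the OCR step has already produced words with centre coordinates on the aligned image, and that each field occupies a single line. The `Word` type, the template regions, the function names, and the `documents` table schema are hypothetical choices made for illustration, not the authors' implementation; classification and alignment are omitted.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One OCR word with its centre position on the aligned image (assumed format)."""
    text: str
    x: float
    y: float


# Hypothetical template: field name -> (left, top, right, bottom) in pixels.
ID_CARD_TEMPLATE = {
    "name": (100, 50, 400, 90),
    "document_number": (100, 100, 400, 140),
    "date_of_birth": (100, 150, 400, 190),
}


def extract_fields(words, template):
    """Assign each OCR word to the template region that contains its centre."""
    fields = {}
    for field, (left, top, right, bottom) in template.items():
        inside = [w for w in words
                  if left <= w.x <= right and top <= w.y <= bottom]
        # Assumes single-line fields, so reading order is left to right.
        inside.sort(key=lambda w: w.x)
        fields[field] = " ".join(w.text for w in inside)
    return fields


def store_record(conn, doc_type, fields):
    """Insert one extracted record into an illustrative `documents` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "id INTEGER PRIMARY KEY, doc_type TEXT, name TEXT, "
        "document_number TEXT, date_of_birth TEXT)"
    )
    cur = conn.execute(
        "INSERT INTO documents "
        "(doc_type, name, document_number, date_of_birth) "
        "VALUES (?, ?, ?, ?)",
        (
            doc_type,
            fields.get("name", ""),
            fields.get("document_number", ""),
            fields.get("date_of_birth", ""),
        ),
    )
    conn.commit()
    return cur.lastrowid


# Made-up sample data standing in for OCR output on an aligned document.
sample_words = [
    Word("Doe", 220, 70),
    Word("Jane", 150, 70),
    Word("12-34", 150, 120),
    Word("1990-01-01", 160, 170),
    Word("STAMP", 500, 300),  # falls outside every region and is ignored
]
sample_fields = extract_fields(sample_words, ID_CARD_TEMPLATE)
connection = sqlite3.connect(":memory:")
record_id = store_record(connection, "id_card", sample_fields)
```

Fixed regions only work because the alignment stage places every document of a given type in the same coordinate frame; without it, the same rectangles would capture different text from one scan to the next.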


