Automating Document Workflows with ResNet-50 and Template-Based OCR
DOI: https://doi.org/10.3126/joeis.v4i1.81610
Keywords: Optical Character Recognition (OCR), Deep Learning, ResNet-50
Abstract
In the banking sector, handling large volumes of customer documents has traditionally been a manual, time-consuming, and error-prone process due to unstructured storage and the absence of automated extraction mechanisms. To address these inefficiencies, we developed an AI-powered document classification and information extraction system that automates the entire workflow using deep learning and image processing techniques. The system was built around a four-stage pipeline: document classification, alignment with predefined templates, text extraction using Optical Character Recognition (OCR), and structured information storage. A fine-tuned ResNet-50 model served as the backbone of the classification module, categorizing scanned or photographed documents into predefined types. Once classified, each document was aligned with the template for its type so that key fields appeared in consistent positions, which improved the reliability of text extraction. OCR tools were then used to extract text from the aligned documents, and the extracted values, such as names, document numbers, and dates of birth, were mapped to structured fields. The resulting structured data was stored securely in a database, enabling efficient querying and downstream use. While tailored to banking operations, the system’s architecture is adaptable to other sectors, such as healthcare, insurance, and government services, where semi-structured documents are prevalent.
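As an illustration of the last two stages only (mapping OCR output to template fields, then storing it), the following is a minimal sketch and not the authors' implementation. It assumes the document has already been classified and aligned, and that an OCR engine has returned words with box centres in template coordinates. The `citizenship_card` type, the region coordinates, the `Word` class, and the `documents` table schema are all hypothetical.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One OCR token with the centre of its bounding box, in template pixels."""
    text: str
    x: float
    y: float


# Hypothetical template: field name -> (left, top, right, bottom) region.
# The document type and all coordinates are made up for illustration.
TEMPLATES = {
    "citizenship_card": {
        "name": (100, 50, 400, 80),
        "document_number": (100, 90, 400, 120),
        "date_of_birth": (100, 130, 400, 160),
    },
}


def map_fields(doc_type, words):
    """Assign each OCR word to the template region that contains its centre."""
    fields = {}
    for field, (left, top, right, bottom) in TEMPLATES[doc_type].items():
        inside = [w for w in words
                  if left <= w.x <= right and top <= w.y <= bottom]
        # Assumes each field is a single text line, so left-to-right order
        # is enough to recover reading order.
        inside.sort(key=lambda w: w.x)
        fields[field] = " ".join(w.text for w in inside)
    return fields


def store(conn, doc_type, fields):
    """Insert one extracted record and return its row id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "id INTEGER PRIMARY KEY, doc_type TEXT, name TEXT, "
        "document_number TEXT, date_of_birth TEXT)"
    )
    cur = conn.execute(
        "INSERT INTO documents "
        "(doc_type, name, document_number, date_of_birth) "
        "VALUES (?, ?, ?, ?)",
        (doc_type, fields["name"], fields["document_number"],
         fields["date_of_birth"]),
    )
    conn.commit()
    return cur.lastrowid
```

Because alignment places every field at a known position, extraction reduces to a lookup by region, which is one plausible reading of why the template step improves reliability. Words that fall outside every region, such as stamps or logos, are simply ignored.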
License
Copyright is held by the authors.