Hybrid CRNN with Seq2Seq Attention Mechanism for Handwriting Recognition
DOI:
https://doi.org/10.3126/injet.v3i2.95148
Keywords:
Handwritten text recognition, CNN-BiLSTM-Attention Hybrid Model, Text recognition
Abstract
Handwritten Text Recognition (HTR) faces significant challenges, including limited annotated data, high handwriting variability, and complex character formations. This paper proposes a hybrid CRNN with Seq2Seq Bahdanau attention for robust HTR. The encoder employs a ten-layer CNN with residual connections for spatial feature extraction and a bidirectional LSTM for temporal modeling. At each decoding step, the decoder uses Bahdanau attention to generate a context vector that focuses on the relevant image regions, then combines the character embedding with that context vector to produce character probabilities via softmax. To prevent overfitting with limited data, comprehensive regularization is applied: dropout (0.3), weight decay (1e-4), label smoothing (0.1), stochastic depth (0.1), exponential moving average (EMA), stochastic weight averaging (SWA), adaptive teacher forcing decay, cosine annealing, and layer-wise learning rate decay, supplemented by light augmentation. Evaluation on the IAM handwriting dataset yields a word error rate (WER) of 13.59% and a character error rate (CER) of 4.28%, corresponding to 86.41% word-level accuracy and 95.72% character-level accuracy, outperforming recent comparable CRNN-based methods. Attention visualizations confirm meaningful spatial-sequential alignment, with diagonal attention patterns indicating systematic left-to-right character progression. These results indicate that attention-based sequence modeling combined with systematic regularization achieves robust HTR performance in data-limited scenarios without relying on synthetic data or external lexicons.
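The attention step described in the abstract can be illustrated with a minimal sketch of standard Bahdanau (additive) attention. This is not the authors' implementation: the function names, the toy dimensions, and the weight matrices `W_s`, `W_h`, and `v` are assumptions chosen for illustration, and plain Python lists stand in for the tensors a real model would use.

```python
import math

def matvec(W, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def additive_attention(decoder_state, encoder_states, W_s, W_h, v):
    """One additive attention step (hypothetical helper).

    score_i  = v . tanh(W_s s + W_h h_i)
    weights  = softmax(scores)
    context  = sum_i weights_i * h_i
    """
    projected_state = matvec(W_s, decoder_state)
    scores = []
    for h in encoder_states:
        projected_h = matvec(W_h, h)
        hidden = [math.tanh(a + b) for a, b in zip(projected_state, projected_h)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # Context vector: attention-weighted sum of the encoder states.
    context = [sum(w * h[d] for w, h in zip(weights, encoder_states))
               for d in range(dim)]
    return weights, context

# Toy inputs (assumed values): three encoder positions, two features each.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.5, -0.5]
W_s = [[1.0, 0.0], [0.0, 1.0]]
W_h = [[1.0, 0.0], [0.0, 1.0]]
v = [1.0, 1.0]

weights, context = additive_attention(decoder_state, encoder_states, W_s, W_h, v)
print(weights, context)
```

In a full decoder, the returned context vector would be combined with the embedding of the previously emitted character before the softmax output layer, and the per-step `weights` are what an attention visualization plots against image position.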
License
Copyright (c) 2026 International Journal on Engineering Technology

This work is licensed under a Creative Commons Attribution 4.0 International License.
This license enables reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.