Automatic Nepali Image Captioning Using CNN-Transformer Model

Authors

  • Swarup Singh Tharu Department of Computer Engineering, United Technical College
  • Savin Basnet Department of Computer Engineering, United Technical College
  • Arun Thapa Department of Computer Engineering, United Technical College
  • Prashant Poudel Faculty of Computer Engineering, United Technical College

DOI:

https://doi.org/10.3126/juem.v3i1.84867

Keywords:

Deep Learning, Pre-trained Dataset, Nepali Image Captions, Convolutional Neural Network (CNN), Transformer Model, EfficientNetB0, Feature Extraction, Sequence Generation

Abstract

Image captioning has gained significant attention, but most research efforts have been directed toward the English language. While some work has been done in regional languages such as Hindi and Bengali, Nepali remains largely underrepresented in this domain. Furthermore, publicly accessible Nepali-language datasets for image captioning are extremely limited. This study leverages an existing pre-trained dataset that includes Nepali image captions and employs deep learning methods to automatically generate descriptions in the Nepali language. The architecture integrates a Convolutional Neural Network (CNN) for image understanding and a Transformer model for sequence generation. In our approach, EfficientNetB0, a pre-trained CNN model, extracts high-level features from images. These features are then fed into the Transformer, which generates the corresponding captions in Nepali. The experimental results show encouraging performance, suggesting that the approach is effective and holds potential for further refinement in future research.

Published

2025-09-29

How to Cite

Tharu, S. S., Basnet, S., Thapa, A., & Poudel, P. (2025). Automatic Nepali Image Captioning Using CNN-Transformer Model. Journal of UTEC Engineering Management, 3(1), 189–197. https://doi.org/10.3126/juem.v3i1.84867

Section

Research Articles