Optimizing BERT for Nepali Text Classification: The Role of Stemming and Gradient Descent Optimizers

Authors

  • Arjun Singh Saud, Central Department of Computer Science and IT, Tribhuvan University, Kirtipur
  • Ajanta Dhakal, School of Mathematical Sciences, Tribhuvan University, Kirtipur; Patan Multiple Campus

DOI:

https://doi.org/10.3126/ajmr.v1i1.82292

Keywords:

BERT, Natural Language Processing, Nepali News Classification, Stemming, AdamW

Abstract

This study investigates the use of BERT for classifying Nepali news articles, addressing the specific challenges associated with Nepali as a low-resource language in natural language processing (NLP). While traditional text classification methods have proven effective for high-resource languages, they often fall short of capturing the contextual nuances necessary for accurate classification in Nepali. To address this gap, a pre-trained BERT model was fine-tuned on a balanced dataset of Nepali news articles sourced from various outlets. The study examined the effects on classification performance of different preprocessing techniques, such as stemming, and of optimization algorithms including Adam, AdamW, and Momentum. Experimental results demonstrate that the combination of stemming and the AdamW optimizer yielded the best performance, achieving a weighted accuracy of 93.67%, along with balanced macro precision, recall, and F1-scores of 0.94. These findings underscore the effectiveness of advanced optimization strategies, particularly AdamW, in enhancing model performance.
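For readers unfamiliar with this setup, the sketch below illustrates the kind of fine-tuning pipeline the abstract describes, assuming the Hugging Face transformers library with PyTorch. The checkpoint name, label count, and hyperparameter values are illustrative assumptions, not values reported by the paper.

    # A minimal sketch of BERT fine-tuning with AdamW, as described in the
    # abstract. Checkpoint, label count, and hyperparameters are assumptions.
    import torch
    from torch.optim import AdamW
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    MODEL_NAME = "bert-base-multilingual-cased"  # assumption: a Nepali-capable BERT checkpoint
    NUM_CLASSES = 5                              # assumption: number of news categories

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=NUM_CLASSES
    )

    # AdamW decouples weight decay from the adaptive gradient update,
    # the property the study credits for its edge over Adam and Momentum.
    optimizer = AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)

    def training_step(texts, labels):
        """One fine-tuning step on a batch of (stemmed) Nepali news texts."""
        batch = tokenizer(texts, padding=True, truncation=True,
                          max_length=512, return_tensors="pt")
        outputs = model(**batch, labels=torch.tensor(labels))
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        return outputs.loss.item()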

Published

2025-07-25

How to Cite

Saud, A. S., & Dhakal, A. (2025). Optimizing BERT for Nepali Text Classification: The Role of Stemming and Gradient Descent Optimizers. Aadim Journal of Multidisciplinary Research, 1(1), 25–38. https://doi.org/10.3126/ajmr.v1i1.82292

Issue

Vol. 1 No. 1 (2025)
Section

Articles