Enhancement of Fine-Grained Part-of-Speech Tagging for Nepali Text using BiLSTM-CRF and Word2Vec

Authors

  • Tika Ram Khojwar, Nepal College of Information Technology, Pokhara University, Nepal
  • Pradip Paudyal, Nepal College of Information Technology, Pokhara University, Nepal

DOI:

https://doi.org/10.3126/jost.v5i1.93048

Keywords:

POS Tagging, Nepali Text, Natural Language Processing, GRU, LSTM, BiLSTM, CRF

Abstract

Part-of-speech (POS) tagging is a foundational task in many Natural Language Processing (NLP) applications, including machine translation, sentiment analysis, text-to-speech conversion, speech recognition, text summarization, question answering, information retrieval, word sense disambiguation, and named entity recognition. POS tagging assigns the correct tag to each token in a corpus, based on its context and the syntax of the language. An accurate POS tagger plays a crucial role in computational linguistics, because tagging errors can substantially degrade the performance of the more complex NLP systems built on top of it. In this work, a deep learning model, BiLSTM with CRF, is implemented for fine-grained Nepali POS tagging. Word2Vec word embeddings are trained on sentences using Gensim and used as the embedding layer of the model. BiLSTM-CRF with Word2Vec is also compared with the well-known GRU, LSTM, and BiLSTM models. BiLSTM-CRF with Word2Vec embeddings performed best, achieving a new state-of-the-art F1 score of 99.81% for fine-grained Nepali POS tagging on the Nepali Monolingual Written Corpus.
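
To illustrate what a CRF layer adds on top of per-token scores, the following is a minimal sketch of Viterbi decoding for a linear-chain CRF. It is not the authors' implementation: the tag names, the emission scores (standing in for BiLSTM outputs), and the transition scores are all hypothetical toy values chosen for illustration.

```python
from typing import Dict, List


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[t][tag]      : score of `tag` at token t (e.g. from a BiLSTM)
    transitions[prev][cur] : score of moving from tag `prev` to tag `cur`
    """
    if not emissions:
        return []

    # best[t][tag] = best score of any tag path ending in `tag` at token t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []  # back[t-1][tag] = best previous tag for `tag` at token t

    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical two-tag example (toy values, not taken from the paper).
tags = ["NN", "VB"]
emissions = [
    {"NN": 2.0, "VB": 0.5},
    {"NN": 1.0, "VB": 1.2},
    {"NN": 0.2, "VB": 2.0},
]
transitions = {
    "NN": {"NN": 0.5, "VB": 0.3},
    "VB": {"NN": -0.2, "VB": -2.0},  # strongly discourage VB followed by VB
}

# Picking the best tag per token independently ignores transitions.
greedy = [max(tags, key=lambda tag: scores[tag]) for scores in emissions]
decoded = viterbi_decode(emissions, transitions, tags)
print(greedy)   # ['NN', 'VB', 'VB']
print(decoded)  # ['NN', 'NN', 'VB']
```

In this toy case the per-token choice yields two consecutive VB tags, while the sequence-level decoding avoids that transition because it scores the whole tag path jointly. Scoring the path jointly is the general motivation for placing a CRF on top of a BiLSTM.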

Published

2026-04-20

How to Cite

Khojwar, T. R., & Paudyal, P. (2026). Enhancement of Fine-Grained Part-of-Speech Tagging for Nepali Text using BiLSTM-CRF and Word2Vec. Journal of Science and Technology, 5(1), 46–52. https://doi.org/10.3126/jost.v5i1.93048

Issue

Section

Articles