Tagging Angika Corpus Using BIS Scheme: A Preliminary Study

Jyoti Kumari

doi:10.3126/nl.v39i1.86158

Authors

Jyoti Kumari, Department of Linguistics, Banaras Hindu University

Keywords:

Angika, NLP, POS Tagset, low-resource language

Abstract

Angika is an Eastern Indo-Aryan language spoken mainly in the southeastern regions of Bihar, Jharkhand, and in some areas of Nepal. Angika is a low-resource language because it lacks linguistic resources and Natural Language Processing (NLP) tools. The primary challenge in developing NLP tools for Angika is the lack of corpora. In this context, the BIS POS Tagset for Indian languages has been adopted to facilitate Part-of-Speech (POS) tagging for Angika. POS tagging is a fundamental NLP task that involves assigning grammatical categories, such as noun, verb, adjective, and adverb, to the words in a text. This article explores the application of the BIS POS Tagset to Angika.
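To make the notion of POS tagging concrete, the following is a minimal sketch in Python, not the method used in the article. It assumes a toy lookup lexicon and uses English placeholder words rather than Angika data. The tag labels (N_NN, V_VM, JJ, RB, RD_UNK) are written in the style of the BIS tagset and should be checked against the official BIS specification. The names TOY_LEXICON and tag_tokens are hypothetical.

```python
# Hypothetical toy lexicon: English placeholder words mapped to
# BIS-style tag labels (assumed labels; verify against the BIS tagset).
TOY_LEXICON = {
    "river": "N_NN",   # common noun
    "flows": "V_VM",   # main verb
    "wide": "JJ",      # adjective
    "slowly": "RB",    # adverb
}


def tag_tokens(tokens, lexicon=TOY_LEXICON, unknown_tag="RD_UNK"):
    """Pair each token with its tag; words missing from the lexicon get unknown_tag."""
    return [(tok, lexicon.get(tok.lower(), unknown_tag)) for tok in tokens]


# Tag a short placeholder sentence that has already been split into tokens.
tagged = tag_tokens(["Wide", "river", "flows", "slowly", "today"])
for token, tag in tagged:
    print(f"{token}\t{tag}")
```

A lookup of this kind cannot resolve words that take different tags in different contexts, which is one reason an annotated corpus is needed before a statistical tagger can be trained.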

Published

2025-11-12

How to Cite

Kumari, J. (2025). Tagging Angika Corpus Using BIS Scheme: A Preliminary Study. Nepalese Linguistics, 39(1), 58-64. https://doi.org/10.3126/nl.v39i1.86158

Issue

Vol. 39 No. 1 (2025)

Section

Articles
