Efficient Estimation of Nepali Word Representations in Vector Space

Authors

  • Janardan Bhatta, Thapathali Campus, Institute of Engineering, Tribhuvan University, Kathmandu, Nepal
  • Dipesh Shrestha, Thapathali Campus, Institute of Engineering, Tribhuvan University, Kathmandu, Nepal
  • Santosh Nepal, Thapathali Campus, Institute of Engineering, Tribhuvan University, Kathmandu, Nepal
  • Saurav Pandey, Thapathali Campus, Institute of Engineering, Tribhuvan University, Kathmandu, Nepal
  • Shekhar Koirala, Thapathali Campus, Institute of Engineering, Tribhuvan University, Kathmandu, Nepal

DOI:

https://doi.org/10.3126/jiee.v3i1.34327

Keywords:

CBOW, NCE loss, One Hot Encoding, Skip-gram, TF-IDF, Word2Vec

Abstract

Word representation is a means of representing words as mathematical entities that computational models can read, reason about, and manipulate. Such a representation is required as input to most modern data-driven models, and in many cases a model's accuracy depends on it. In this paper, we analyze various methods of computing vector-space representations for Nepali words and propose a word-to-vector model, based on the Skip-gram model with NCE loss, that captures syntactic and semantic word relationships.

This work is an attempt to apply the approach from the paper by Mikolov et al. to Nepali words.
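As a rough illustration of the Skip-gram idea named in the abstract, the sketch below shows how (center, context) training pairs could be generated from a tokenized sentence. It is not the authors' implementation: the function name `skipgram_pairs`, the `window` parameter, and the example sentence are assumptions made here for illustration, and the NCE loss itself is not shown.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for each token and its neighbours.

    `window` is the number of positions considered on each side of the
    center word (a hypothetical parameter name chosen for this sketch).
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)                 # left edge, clipped at sentence start
        hi = min(len(tokens), i + window + 1)   # right edge, clipped at sentence end
        for j in range(lo, hi):
            if j != i:                          # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical tokenized Nepali sentence, used only for illustration.
sentence = ["म", "नेपाली", "भाषा", "बोल्छु"]
pairs = skipgram_pairs(sentence, window=1)
for center, context in pairs:
    print(center, "->", context)
```

In a Skip-gram model, each such pair would serve as one training example in which the center word is used to predict the context word.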

Author Biographies

Janardan Bhatta, Thapathali Campus, Institute of Engineering, Tribhuvan University, Kathmandu, Nepal

Department of Electronics and Computer Engineering

Dipesh Shrestha, Thapathali Campus, Institute of Engineering, Tribhuvan University, Kathmandu, Nepal

Department of Electronics and Computer Engineering

Santosh Nepal, Thapathali Campus, Institute of Engineering, Tribhuvan University, Kathmandu, Nepal

Department of Electronics and Computer Engineering

Saurav Pandey, Thapathali Campus, Institute of Engineering, Tribhuvan University, Kathmandu, Nepal

Department of Electronics and Computer Engineering

Shekhar Koirala, Thapathali Campus, Institute of Engineering, Tribhuvan University, Kathmandu, Nepal

Department of Electronics and Computer Engineering

Published

2020-03-31

How to Cite

Bhatta, J., Shrestha, D., Nepal, S., Pandey, S., & Koirala, S. (2020). Efficient Estimation of Nepali Word Representations in Vector Space. Journal of Innovations in Engineering Education, 3(1), 71–77. https://doi.org/10.3126/jiee.v3i1.34327

Section

Articles