Spell Correction using N-Gram Modeling and Zero Shot Learning

Authors

  • Nikita Subba, Department of Computer and Electronics, Kantipur Engineering College, Dhapakhel, Lalitpur, Nepal
  • Bikal Devkota, Department of Computer and Electronics, Kantipur Engineering College, Dhapakhel, Lalitpur, Nepal
  • Shuvra Baral, Department of Computer and Electronics, Kantipur Engineering College, Dhapakhel, Lalitpur, Nepal

DOI:

https://doi.org/10.3126/injet.v3i2.95781

Keywords:

Natural Language Processing, N-Gram Model, Zero-Shot Contextual Inference, Spell Correction, LLaMA

Abstract

This paper presents an integrated spell-correction system that combines n-gram language modeling with zero-shot contextual inference using a pretrained LLaMA-2 7B model. The n-gram component efficiently generates correction candidates from local word-sequence statistics, while the zero-shot stage re-ranks those candidates by evaluating contextual and semantic plausibility without task-specific fine-tuning. The system is evaluated on two established benchmarks—BEA-60K and JFLEG—and compared against Hunspell, pyspellchecker, a standalone n-gram baseline, a standalone LLaMA-2 baseline, and the NeuSpell (BERT) toolkit. On BEA-60K, the integrated model achieves an F₁-score of 90.7%, improving over the n-gram-only baseline (59.6%) and the standalone LLaMA-2 model (80.9%). On JFLEG, the system obtains a GLEU score of 58.6, outperforming all individual baselines. An error analysis shows that the integrated model handles non-word errors with 94.1% accuracy and real-word context-sensitive errors with 85.9% accuracy. These results demonstrate that hybrid statistical–neural architectures can deliver strong correction performance while preserving the efficiency of the n-gram front end.
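
The two-stage flow described in the abstract (n-gram candidate generation followed by contextual re-ranking) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes candidates come from single-character edits filtered against a known vocabulary, that the n-gram model is a bigram model with add-one smoothing, and that the re-ranking stage is an arbitrary `scorer` callable. All function names here are hypothetical, and `scorer` is only a placeholder for the zero-shot LLaMA-2 stage, whose prompting and scoring details the abstract does not specify.

```python
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def edits1(word):
    """Return every string one delete, transpose, replace, or insert away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)

def train_bigrams(sentences):
    """Count unigrams and bigrams, with a <s> marker at each sentence start."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_prob(prev, word, unigrams, bigrams):
    """P(word | prev) with add-one smoothing (an assumption of this sketch)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(unigrams))

def ngram_candidates(prev, word, unigrams, bigrams, k=3):
    """Stage 1: in-vocabulary candidates, ranked by bigram probability."""
    known = {w for w in edits1(word) | {word} if w in unigrams}
    ranked = sorted(known, key=lambda w: (-bigram_prob(prev, w, unigrams, bigrams), w))
    return ranked[:k]

def rerank(context, candidates, scorer):
    """Stage 2: pick the candidate the contextual scorer rates highest.

    `scorer(context, candidate)` is a hypothetical stand-in for the
    zero-shot language-model stage; any callable returning a number works.
    """
    return max(candidates, key=lambda c: scorer(context, c))

corpus = ["the cat sat on the mat", "the cat ate the fish", "a cat sat"]
unigrams, bigrams = train_bigrams(corpus)

# Non-word error: "cst" is not in the vocabulary, so only "cat" survives.
nonword = ngram_candidates("the", "cst", unigrams, bigrams)

# Real-word case: "mat" is valid, but the bigram model alone prefers "cat".
realword = ngram_candidates("the", "mat", unigrams, bigrams)

# A toy scorer that favors "mat" shows how stage 2 can override stage 1.
choice = rerank("sat on the", realword, lambda ctx, c: 1.0 if c == "mat" else 0.0)
```

In this sketch the n-gram stage only narrows the search space cheaply, while the final decision is left to whatever contextual scorer is supplied. That division of labor is what the abstract attributes to the hybrid design.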

Published

2026-06-18

How to Cite

Subba, N., Devkota, B., & Baral, S. (2026). Spell Correction using N-Gram Modeling and Zero Shot Learning. International Journal on Engineering Technology, 3(2), 362–377. https://doi.org/10.3126/injet.v3i2.95781