Traditional Nepali Object Detection using Fine-Tuned YOLOv8
DOI:
https://doi.org/10.3126/kjse.v10i1.93877
Keywords:
YOLOv8n, Visual Learning, Object Detection, AdamW, Nepali objects
Abstract
Visual learning supports comprehension by presenting information intuitively through images and videos. This study develops and optimizes an object detection model for integration into Drishti Kosh, a mobile visual dictionary designed to identify and translate traditional Nepali objects. The YOLOv8n model was fine-tuned on a dataset of 3,591 images spanning twelve culturally significant classes. Three optimization algorithms, Stochastic Gradient Descent (SGD), Adaptive Moment Estimation (Adam), and AdamW, were evaluated to determine the most effective configuration for real-time use. The model trained with the AdamW optimizer performed best, achieving the highest recall (0.8753) and a mean Average Precision at an IoU threshold of 0.5 (mAP@50) of 0.6545. It also had the fastest inference time (2.15 milliseconds) and the lowest post-processing time (0.98 milliseconds), along with a low training loss of 0.00087, indicating stable convergence. The results show that a lightweight, culturally aware object detection model can support real-time mobile deployment and enhance user interaction in multilingual environments. This improves accessibility for learners, tourists, and general users who encounter traditional Nepali objects in everyday contexts.
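To illustrate how the three optimizers compared in the abstract differ, the following minimal sketch applies the textbook update rules for SGD, Adam, and AdamW to a single scalar weight. It is not the study's training code: the function names, the hyperparameter values, and the one-parameter setting are assumptions chosen for illustration. The point it shows is that Adam with L2 regularization folds the decay term into the gradient, while AdamW applies the decay directly to the weight, separately from the adaptive step.

```python
import math


def sgd_step(w, g, lr=0.01, weight_decay=0.0):
    """One plain SGD update; the L2 penalty is added to the gradient."""
    return w - lr * (g + weight_decay * w)


def adam_step(w, g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.0, decoupled=False):
    """One Adam update (decoupled=False) or AdamW update (decoupled=True).

    m and v are the running first and second moment estimates;
    t is the 1-based step count used for bias correction.
    """
    if not decoupled:
        # Adam with L2 regularization: decay passes through the moments.
        g = g + weight_decay * w
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
    if decoupled:
        # AdamW: shrink the weight directly, independent of the gradient scale.
        w = w - lr * weight_decay * w
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v


if __name__ == "__main__":
    # Hypothetical values: zero data gradient, so only weight decay acts.
    w0, g0 = 1.0, 0.0
    print("SGD  :", sgd_step(w0, g0, lr=0.1, weight_decay=0.1))
    print("Adam :", adam_step(w0, g0, 0.0, 0.0, 1, lr=0.1, weight_decay=0.1)[0])
    print("AdamW:", adam_step(w0, g0, 0.0, 0.0, 1, lr=0.1, weight_decay=0.1,
                              decoupled=True)[0])
```

With a zero data gradient, the Adam variant rescales the decay term through its adaptive normalization, whereas the AdamW variant shrinks the weight by a fixed fraction `lr * weight_decay`. This decoupling is the usual explanation given for AdamW's different regularization behavior; whether it accounts for the results reported here is not something the abstract states.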