Nepali Music Genre Classification Using CNN-SVM Hybrid Architecture

Authors

  • Bibas Shrestha, Department of Computer and Electronics Engineering, Kantipur Engineering College, Dhapakhel, Lalitpur, Nepal
  • Darshan Deuja, Department of Computer and Electronics Engineering, Kantipur Engineering College, Dhapakhel, Lalitpur, Nepal
  • Dhirendra Prasad Pant, Department of Computer and Electronics Engineering, Kantipur Engineering College, Dhapakhel, Lalitpur, Nepal
  • Jenish Chapagain, Department of Computer and Electronics Engineering, Kantipur Engineering College, Dhapakhel, Lalitpur, Nepal

DOI:

https://doi.org/10.3126/injet.v3i2.95495

Keywords:

Music Genre Classification, CNN, SVM, Mel Spectrogram, Machine Learning, Audio Features

Abstract

A CNN–SVM hybrid model was developed to classify Nepali music into four local genres: Gazal, Lok Dohori, Nephop, and Pop. A dataset of 1,000 manually curated songs (250 per genre) was collected from YouTube, and a 30‑second clip was extracted at 25%, 50%, and 75% of each track’s duration, yielding approximately 3,000 audio segments. Each segment was converted into a 128×128 log‑Mel spectrogram. A four‑layer CNN extracted a 64‑dimensional embedding from each spectrogram, which was then passed to an SVM classifier. The CNN–SVM hybrid achieved its highest accuracy of 88.29% with both the RBF and linear kernels, outperforming the standalone CNN baseline (84.28%), while the sigmoid kernel performed worst at 79.60%. The results demonstrate that combining deep learning feature extraction with a traditional machine learning classifier is effective for Nepali music genre classification on moderate‑sized, domain‑specific datasets.
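
The segmentation step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `clip_windows` is hypothetical, and it assumes each 30‑second clip starts at the stated fraction of the track and is shifted back when it would run past the end (the abstract does not say whether clips start at or are centred on those points).

```python
def clip_windows(duration_s, clip_s=30.0, fractions=(0.25, 0.50, 0.75)):
    """Return (start, end) times in seconds for one clip per fraction.

    Assumption: a clip begins at fraction * duration and is clamped so
    that it stays inside the track.
    """
    windows = []
    for fraction in fractions:
        start = fraction * duration_s
        # Shift the clip back if it would run past the end of the track.
        start = max(0.0, min(start, duration_s - clip_s))
        # Tracks shorter than clip_s yield a truncated clip.
        end = min(start + clip_s, duration_s)
        windows.append((start, end))
    return windows


# A 4-minute track gives three clips, so 1,000 songs give about 3,000 segments.
print(clip_windows(240.0))
```

Each resulting window would then be converted to a 128×128 log‑Mel spectrogram before being passed to the CNN; that step is omitted here because the abstract does not give the spectrogram parameters.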

Published

2026-06-18

How to Cite

Shrestha, B., Deuja, D., Pant, D. P., & Chapagain, J. (2026). Nepali Music Genre Classification Using CNN-SVM Hybrid Architecture. International Journal on Engineering Technology, 3(2), 24–33. https://doi.org/10.3126/injet.v3i2.95495