Deep-Learning Based Tomato Leaf Disease Classification using CNN, ConvNeXt, Vision Transformer, and Swin Transformer

Authors

  • Niraj Pandey, Department of Computer Engineering, Kantipur Engineering College, Dhapakhel, Lalitpur, Nepal
  • Denish Oli, Department of Computer Engineering, Kantipur Engineering College, Dhapakhel, Lalitpur, Nepal

DOI:

https://doi.org/10.3126/injet.v3i2.95510

Keywords:

Tomato leaf disease classification, deep learning, convolutional neural networks, ConvNeXt, Vision Transformer, Swin Transformer, deployment-aware AI, plant disease detection, agricultural AI, image classification

Abstract

Deep-learning-based tomato leaf disease classification remains challenging because a practical system must handle visually similar disease patterns, non-target inputs, and computational constraints. This paper presents a deployment-aware comparative study of four deep learning model families for tomato leaf disease classification: Convolutional Neural Networks (CNN), ConvNeXt, Vision Transformer (ViT), and Swin Transformer. We used a dataset comprising 31,042 training images and 5,643 validation images, and we included a not_tomato rejection class so that the evaluation reflects real-world deployment rather than closed-set classification alone. We also applied HSV-based, leaf-focused background removal and evaluated the models on accuracy, loss behavior, precision, recall, and F1-score. Swin Transformer achieved the best overall performance, with 98.69% validation accuracy and superior plant detection capability. ViT showed the most stable generalization behavior, while ConvNeXt remained competitive but computationally expensive. The custom CNN offered high efficiency but lower accuracy. These findings highlight the importance of balancing accuracy, generalization, and deployment constraints in agricultural AI systems.
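
The abstract does not give the details of the HSV-based background removal, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes an image stored as rows of 8-bit RGB tuples and keeps pixels whose hue, saturation, and value fall inside a "leaf-like" window. The thresholds (HUE_RANGE, MIN_SATURATION, MIN_VALUE) and the function names is_leaf_pixel and remove_background are hypothetical choices made for illustration.

```python
import colorsys

# Assumed thresholds for "leaf-like" pixels; these are illustrative values,
# not taken from the paper. Hue is expressed on colorsys's [0, 1] scale.
HUE_RANGE = (0.10, 0.50)   # roughly yellow-green through green
MIN_SATURATION = 0.20      # reject washed-out, grayish pixels
MIN_VALUE = 0.15           # reject very dark pixels


def is_leaf_pixel(rgb):
    """Return True if an 8-bit (R, G, B) pixel falls inside the assumed HSV window."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    in_hue_window = HUE_RANGE[0] <= h <= HUE_RANGE[1]
    return in_hue_window and s >= MIN_SATURATION and v >= MIN_VALUE


def remove_background(image, fill=(0, 0, 0)):
    """Replace every pixel outside the HSV window with `fill`.

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    """
    return [
        [pixel if is_leaf_pixel(pixel) else fill for pixel in row]
        for row in image
    ]


# Tiny 2x2 example: two green pixels, one gray pixel, one blue pixel.
sample = [
    [(40, 160, 40), (128, 128, 128)],
    [(30, 30, 200), (90, 200, 60)],
]
masked = remove_background(sample)
```

In this sketch the gray and blue pixels are replaced by the fill color while the two green pixels pass through unchanged. A fixed green window would also discard brown or yellow lesions, so a real pipeline would likely need wider or disease-aware thresholds; that tuning is outside what the abstract describes.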

Published

2026-06-18

How to Cite

Pandey, N., & Oli, D. (2026). Deep-Learning Based Tomato Leaf Disease Classification using CNN, ConvNeXt, Vision Transformer, and Swin Transformer. International Journal on Engineering Technology, 3(2), 98–106. https://doi.org/10.3126/injet.v3i2.95510