Hybrid Text Summarizer Using SBERT Extractive Filtering and Fine-Tuned BART Abstractive Generation on a Custom Dataset

Authors

  • Aadarsha Chaulagain, Department of Computer and Electronics Engineering, Kantipur Engineering College, Dhapakhel, Lalitpur, Nepal
  • Aaditya Bhandari, Department of Computer and Electronics Engineering, Kantipur Engineering College, Dhapakhel, Lalitpur, Nepal
  • Bishwa Karna, Department of Computer and Electronics Engineering, Kantipur Engineering College, Dhapakhel, Lalitpur, Nepal
  • Jagadish Pokharel, Department of Computer and Electronics Engineering, Kantipur Engineering College, Dhapakhel, Lalitpur, Nepal
  • Binod Wosti, Department of Computer and Electronics Engineering, Kantipur Engineering College, Dhapakhel, Lalitpur, Nepal

DOI:

https://doi.org/10.3126/injet.v3i2.95541

Keywords:

Text Summarization, Hybrid Summarization, Extractive Summarization, Abstractive Summarization, BART, DistilRoBERTa, SBERT, Natural Language Processing

Abstract

The exponential growth of digital information has made efficient extraction of key insights from large text corpora an increasingly critical challenge. Traditional extractive summarization methods often yield disjointed, incoherent summaries, while purely abstractive approaches, despite their fluency, are prone to hallucination and demand considerable computational resources. This paper presents a hybrid deep learning framework that integrates the complementary strengths of both paradigms. The system employs DistilRoBERTa, an encoder-only transformer, to identify the most semantically relevant sentences through a greedy labeling strategy. A Sentence-BERT (SBERT) semantic filtering module then re-ranks the extracted candidates using cosine similarity before serializing them as input to the abstractive module. The abstractive module is built upon the Facebook/BART-Large-CNN architecture, fine-tuned on a custom hybrid dataset of 18,000 samples constructed programmatically from CNN/DailyMail. Evaluation using ROUGE metrics yielded a ROUGE-1 score of 0.4935 and a ROUGE-2 score of 0.2421 at Epoch 2. The final system is deployed with a graphical user interface enabling users to upload documents and receive high-quality, factually grounded summaries.
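
The extractive stage described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the details of the greedy labeling strategy or what the SBERT filter compares candidates against, so everything below is an assumption: labels are chosen by greedily adding sentences while a simplified unigram-overlap ROUGE-1 F1 against the reference summary keeps improving, and re-ranking orders candidates by cosine similarity to a caller-supplied reference vector. The function names (`rouge1_f1`, `greedy_labels`, `rerank`) are hypothetical, and embeddings are passed in as plain lists where the real system would presumably use SBERT sentence vectors.

```python
import math
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap on lowercased, whitespace-split text."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def greedy_labels(sentences, reference, max_sentences=3):
    """Assumed labeling rule: add the sentence that most raises ROUGE-1 F1,
    stopping when no remaining sentence improves the score."""
    selected, best = [], 0.0
    while len(selected) < max_sentences:
        scored = []
        for i in range(len(sentences)):
            if i in selected:
                continue
            # Score the trial selection in original document order.
            text = " ".join(sentences[j] for j in sorted(selected + [i]))
            scored.append((rouge1_f1(text, reference), i))
        if not scored:
            break
        score, idx = max(scored)
        if score <= best:
            break
        selected.append(idx)
        best = score
    # Binary label per sentence: 1 = include in the extractive summary.
    return [1 if i in selected else 0 for i in range(len(sentences))]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rerank(candidates, embeddings, reference_vector, top_k=2):
    """Keep the top_k candidates most similar to reference_vector.
    What reference_vector represents (e.g. a document embedding) is an assumption."""
    ranked = sorted(
        zip(candidates, embeddings),
        key=lambda pair: cosine(pair[1], reference_vector),
        reverse=True,
    )
    return [sentence for sentence, _ in ranked[:top_k]]
```

With the toy input `["the cat sat on the mat", "stocks fell sharply today", "a dog barked loudly"]` and the reference `"the cat sat on the mat while a dog barked"`, `greedy_labels` marks the first and third sentences, since adding the second lowers the overlap score. Published ROUGE figures such as those reported above are normally computed with a standard implementation that applies stemming and tokenization rules this sketch omits.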

Published

2026-06-18

How to Cite

Chaulagain, A., Bhandari, A., Karna, B., Pokharel, J., & Wosti, B. (2026). Hybrid Text Summarizer Using SBERT Extractive Filtering and Fine-Tuned BART Abstractive Generation on a Custom Dataset. International Journal on Engineering Technology, 3(2), 198–205. https://doi.org/10.3126/injet.v3i2.95541