Predicting Toxicity of Herbal and Synthetic Organic Compounds Using Machine Learning-Based QSAR Models
DOI:
https://doi.org/10.3126/ltu-jace.v1i1.91934
Keywords:
Machine learning, toxicity, herbal compounds, synthetic compounds, random forest, support vector machine, logistic regression
Abstract
This study focuses on the development of a machine learning-based Quantitative Structure-Activity Relationship (QSAR) model to predict the toxicity of organic compounds, including both traditional herbal remedies and synthetic compounds. The study employs Logistic Regression, Random Forest, and Support Vector Machines (SVM) to predict potential toxicity based on molecular descriptors calculated using RDKit, achieving over 90% accuracy across models. Feature importance analysis reveals that molecular descriptors such as lipophilicity (logP), hydrogen bond donors, and specific molecular fingerprints (e.g., FP_375, FP_243, FP_417) significantly correlate with toxicity. A Random Forest-based model highlighted these fingerprint bits as key contributors to toxicity prediction, showing strong correlations with known toxicological properties. The top 20 fingerprint features were analyzed, with their importance ranking depicted in a bar chart. The model demonstrates promising results in predicting hepatotoxicity and neurotoxicity, offering an early-stage toxicity screening tool for drug discovery. Validated on external datasets, the model generalizes well to unseen herbal and synthetic compounds, making it a valuable tool for pharmaceutical and herbal compound safety evaluation. This research underscores the potential of integrating traditional medicinal knowledge with advanced computational methods to enhance safety profiling of diverse organic compounds.
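As a rough illustration of the workflow the abstract describes (binary fingerprint bits in, a toxic/non-toxic label out, then a ranking of the most influential bits), the following is a minimal sketch and not the study's pipeline. It uses only the Python standard library instead of RDKit, and it substitutes a hand-written logistic regression with absolute weights as a crude importance score in place of Random Forest feature importance. The data are synthetic: the 8-bit fingerprint length, the "toxic" bits 2 and 5, and all function names are assumptions invented for this example and have no relation to FP_375, FP_243, or FP_417.

```python
import math
import random

N_BITS = 8            # hypothetical fingerprint length (real fingerprints are far longer)
TOXIC_BITS = (2, 5)   # hypothetical "toxic" bits, invented for this toy example


def make_dataset(n, rng):
    """Generate random bit vectors; label 1 (toxic) if any TOXIC_BITS bit is set."""
    X = [[rng.randint(0, 1) for _ in range(N_BITS)] for _ in range(n)]
    y = [1 if any(row[b] for b in TOXIC_BITS) else 0 for row in X]
    return X, y


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, lr=1.0, epochs=500):
    """Fit logistic regression by full-batch gradient descent."""
    w, b = [0.0] * N_BITS, 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * N_BITS, 0.0
        for row, label in zip(X, y):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, row))) - label
            grad_b += err
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
        b -= lr * grad_b / n
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
    return w, b


def predict(w, b, row):
    """Return 1 (toxic) when the linear score is positive, else 0."""
    return 1 if b + sum(wi * xi for wi, xi in zip(w, row)) > 0 else 0


rng = random.Random(0)                      # fixed seed for repeatability
X_train, y_train = make_dataset(200, rng)
X_test, y_test = make_dataset(100, rng)     # held-out synthetic compounds

w, b = train_logreg(X_train, y_train)
accuracy = sum(predict(w, b, r) == t for r, t in zip(X_test, y_test)) / len(y_test)

# Rank bits by absolute weight as a crude stand-in for feature importance.
ranking = sorted(range(N_BITS), key=lambda j: abs(w[j]), reverse=True)
print(f"held-out accuracy: {accuracy:.2f}")
print("bits ranked by |weight|:", ranking)
```

In a real setting the bit vectors would come from a cheminformatics toolkit and the labels from curated toxicity data. The synthetic rule here only exists so that the ranking step has a known answer to recover.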
License
Copyright (c) 2026 The Author(s)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
This license enables reusers to distribute, remix, adapt, and build upon the material in any medium or format for noncommercial purposes only, and only so long as attribution is given to the creator.