Diabetes Prediction using Machine Learning Algorithms
Diksha Gaikwad, Vaishnavi Patil, Dr. Prashant Wadkar
Paper Contents
Abstract
In this study, we explored how machine learning can help predict diabetes using two models: Logistic Regression and Random Forest. We worked with a dataset of 1,000 patient records containing details such as glucose level, blood pressure, insulin, BMI, and age. Before training the models, we handled missing data using median imputation and divided the dataset into training and testing sets. Logistic Regression performed slightly better, achieving an accuracy of around 54%, while Random Forest reached about 48%. Logistic Regression gave more balanced results in identifying both diabetic and non-diabetic cases, whereas Random Forest was less stable and showed signs of overfitting. Both models highlighted glucose level, BMI, and age as key factors for predicting diabetes. These findings suggest that Logistic Regression can serve as a reliable starting point, while Random Forest may require further tuning and better feature selection to improve its results. Overall, this work shows how simple machine learning models can be applied effectively to medical data to support early detection of diabetes, and it emphasizes the importance of improving model performance to obtain accurate predictions.
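The workflow summarized in the abstract (median imputation, a train/test split, and a comparison of Logistic Regression with Random Forest) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' code. The data are synthetic stand-ins for the study's records, and the column names, the 80/20 split, the feature scaling step, and all hyperparameters are assumptions that the abstract does not specify. The sketch also fits the median on the training portion only, which may differ from the order of steps the authors used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical column names; the abstract lists the features but not their labels.
FEATURES = ["glucose", "blood_pressure", "insulin", "bmi", "age"]


def make_synthetic_data(n_rows=1000, seed=0):
    """Generate stand-in data. This is NOT the study's dataset."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([
        rng.normal(120, 30, n_rows),   # glucose
        rng.normal(70, 12, n_rows),    # blood pressure
        rng.normal(80, 40, n_rows),    # insulin
        rng.normal(32, 7, n_rows),     # BMI
        rng.integers(21, 80, n_rows),  # age
    ]).astype(float)
    # Assumed label rule, for illustration only.
    score = 0.03 * (X[:, 0] - 120) + 0.08 * (X[:, 3] - 32) + 0.02 * (X[:, 4] - 50)
    y = (score + rng.normal(0, 1, n_rows) > 0).astype(int)
    # Blank out roughly 5% of entries to mimic missing values.
    X[rng.random(X.shape) < 0.05] = np.nan
    return X, y


def rank_features(model):
    """Order feature names by coefficient magnitude or by forest importance."""
    estimator = model[-1]  # last step of the pipeline
    if hasattr(estimator, "coef_"):
        weights = np.abs(estimator.coef_[0])
    else:
        weights = estimator.feature_importances_
    return [FEATURES[i] for i in np.argsort(weights)[::-1]]


def evaluate_models(X, y, test_size=0.2, seed=0):
    """Fit both models on one split and report accuracy and per-class recall."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )
    models = {
        "logistic_regression": make_pipeline(
            SimpleImputer(strategy="median"),
            StandardScaler(),
            LogisticRegression(max_iter=1000),
        ),
        "random_forest": make_pipeline(
            SimpleImputer(strategy="median"),
            RandomForestClassifier(n_estimators=200, random_state=seed),
        ),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, pred),
            "recall_diabetic": recall_score(y_test, pred, pos_label=1),
            "recall_non_diabetic": recall_score(y_test, pred, pos_label=0),
            "ranked_features": rank_features(model),
        }
    return results


if __name__ == "__main__":
    X, y = make_synthetic_data()
    for name, metrics in evaluate_models(X, y).items():
        print(name, metrics)
```

Reporting recall separately for the diabetic and non-diabetic classes is one way to check the kind of balance the abstract attributes to Logistic Regression, since a single accuracy figure can hide a model that favors one class. Numbers produced on the synthetic data say nothing about the study's reported results.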
Copyright
Copyright © 2025 Diksha Gaikwad, Vaishnavi Patil, Dr. Prashant Wadkar. This is an open access article distributed under the Creative Commons Attribution License.