🔗 Research paper link: https://dl.acm.org/citation.cfm?id=3297737#
This study uses the Pima Indians Diabetes Dataset, provided by the UCI Machine Learning Repository. The data was originally collected by the National Institute of Diabetes and Digestive and Kidney Diseases. The dataset consists of several medical predictor variables, including number of pregnancies, BMI, insulin level, age, glucose concentration, diastolic blood pressure, triceps skin fold thickness, and diabetes pedigree function. It contains records for 768 patients, all of whom are female and at least 21 years old. Of these, 268 (34.90%) are positive cases and 500 (65.10%) are negative cases. I used six classification techniques: Artificial Neural Network (ANN), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), and Naïve Bayes (NB).
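The comparison described above can be sketched with scikit-learn roughly as follows. This is a minimal illustration, not the code used in the study. It makes several assumptions: the data is a synthetic stand-in generated with `make_classification` to mimic the dataset's shape (768 rows, 8 predictors, about 65% negative and 35% positive), and the train/test split, the feature scaling, and every hyperparameter are illustrative choices. To work with the real data, replace `X` and `y` with the features and outcome column loaded from your own copy of the dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in for the real dataset (768 samples, 8 features,
# roughly 65% negative / 35% positive). Swap in the real features and labels.
X, y = make_classification(
    n_samples=768,
    n_features=8,
    n_informative=5,
    weights=[0.651, 0.349],
    random_state=0,
)

# ASSUMPTION: a stratified 80/20 hold-out split; the study's protocol may differ.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The six classifier families named in the text. Hyperparameters are
# illustrative defaults, not the values used in the study. Scaling is added
# for the models that are sensitive to feature ranges.
models = {
    "ANN": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0),
    ),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "NB": GaussianNB(),
}

# Fit each model and record its accuracy on the held-out split.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = model.score(X_test, y_test)
    print(f"{name}: accuracy = {results[name]:.3f}")
```

Because the classes are imbalanced (about 35% positive), accuracy alone can be misleading; metrics such as precision, recall, or F1 from `sklearn.metrics` may be worth reporting alongside it.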