Project Descriptions

Code and report files related to project work.

Classifying Diabetes Cases in a Health Dataset | R and Python

Classified diabetes cases using machine learning techniques (Logistic Regression, Decision Tree, Random Forest, KNN) on a dataset with 21 features. Performed undersampling to handle the class imbalance and achieved 74% test accuracy and 83% precision with Random Forest. Extracted insights from the health-related data and identified key features related to diabetes. The joint work was done in R. Since we couldn't use SMOTE in R (it was taking too long to run), I personally implemented it in Python and also redid the whole analysis in Python. The main difference between the Python and R code is how the imbalanced data is handled. Both methods give similar results. I suggest looking at the R code, as it is more compact and detailed. The report is also based on the R code.
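To illustrate the undersampling step mentioned above, here is a minimal sketch of random undersampling in plain Python. It is not the code from this repository: the function name `random_undersample`, the toy data, and the choice to shrink every class to the size of the smallest one are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def random_undersample(rows, labels, seed=0):
    """Hypothetical helper: randomly drop rows from the larger classes
    until every class has as many rows as the smallest class."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    # Size of the smallest class sets the target size for all classes.
    n_min = min(len(idx) for idx in by_class.values())

    # Sample n_min indices from each class without replacement.
    keep = []
    for idx in by_class.values():
        keep.extend(rng.sample(idx, n_min))
    keep.sort()  # preserve the original row order

    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy imbalanced data (made up): 8 negatives, 2 positives.
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2

X_bal, y_bal = random_undersample(rows, labels)
```

After the call, both classes appear equally often in `y_bal`. Undersampling discards majority-class rows, whereas SMOTE (used in the Python version of this project) synthesizes new minority-class rows instead.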

Link for the dataset: https://www.kaggle.com/datasets/alexteboul/diabetes-health-indicators-dataset

GitHub folder name: files
