Saheli2001/Classification-Project


Project Descriptions

Code and report files related to the project work

  • Classifying Diabetes Cases in a Health Dataset | R and Python

    Classified diabetes cases with four machine learning models (Logistic Regression, Decision Tree, Random Forest, KNN) on a dataset with 21 features. Undersampling was used to handle the class imbalance, and Random Forest achieved 74% test accuracy and 83% precision. We also extracted insights from the health-related data and identified key features related to diabetes. The joint work was done in R. Because SMOTE took too long to run in R, I implemented it in Python and redid the whole analysis there. The main difference between the Python and R code is how the imbalanced data is handled; both approaches give similar results. I suggest looking at the R code first, as it is more compact and detailed, and the report is based on it.

    Link for the dataset: https://www.kaggle.com/datasets/alexteboul/diabetes-health-indicators-dataset

    GitHub folder name: files
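For readers unfamiliar with undersampling, the idea mentioned above can be sketched in a few lines of plain Python. This is only an illustration, not the code from this repository: the `undersample` helper, the toy rows, and the 8-to-2 class split are all made up for the example, and the actual R and Python scripts may balance the classes differently.

```python
import random
from collections import Counter


def undersample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until every class
    has as many rows as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # Every class is cut down to the size of the smallest one.
    target = min(len(idxs) for idxs in by_class.values())

    kept = []
    for idxs in by_class.values():
        kept.extend(rng.sample(idxs, target))
    kept.sort()  # keep the original row order

    return [rows[i] for i in kept], [labels[i] for i in kept]


# Toy data (made up): 8 non-diabetic rows (0) and 2 diabetic rows (1).
rows = [[i, i * 2] for i in range(10)]
labels = [0] * 8 + [1] * 2

balanced_rows, balanced_labels = undersample(rows, labels)
class_counts = Counter(balanced_labels)
print(class_counts)
```

Undersampling discards majority-class rows, whereas SMOTE synthesizes new minority-class rows, so the two approaches trade training-set size against the use of synthetic data.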

About

This Repo includes Project Files and Reports.
