Built various machine learning models to diagnose disease - a binary classification problem.

Jieer334/Machine-Learning-Disease-Prediction

Applied Machine Learning

Disease Prediction Project

Overview:

This machine learning project comes from the Applied Machine Learning course I took in Fall 2020.

Project Goal:

The goal is to predict whether or not a patient has a certain unspecified disease. This is a binary classification problem.

Dataset:

Provided by the course professor, the training dataset has 49,000 rows and 12 columns.

Methodology:

The analysis and report span two Jupyter notebooks, both of which follow the steps below.

Data Preparation

I discussed the data quality issues I identified in the dataset and the preprocessing techniques I applied to address them, then performed Exploratory Data Analysis (EDA). Where appropriate, I supported the EDA with data visualizations.
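As an illustration of this step, here is a minimal sketch of typical data quality checks in pandas. The column names (`age`, `cholesterol`, `disease`) and the toy values are hypothetical, since the real schema is not listed here, and median imputation is only one possible choice, not necessarily what the notebooks use.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the real column names are not given in this README.
df = pd.DataFrame({
    "age": [50, 61, np.nan, 45, 45],
    "cholesterol": [1, 3, 2, 1, 1],
    "disease": [0, 1, 1, 0, 0],
})


def prepare(frame: pd.DataFrame, target: str = "disease") -> pd.DataFrame:
    """Drop exact duplicate rows and fill missing feature values with the column median."""
    cleaned = frame.drop_duplicates().copy()
    feature_cols = [c for c in cleaned.columns if c != target]
    medians = cleaned[feature_cols].median(numeric_only=True)
    cleaned[feature_cols] = cleaned[feature_cols].fillna(medians)
    return cleaned.reset_index(drop=True)


# Basic data quality report before cleaning.
missing_per_column = df.isna().sum()
duplicate_rows = int(df.duplicated().sum())

clean = prepare(df)
# Share of each class, useful for spotting imbalance in a binary target.
class_balance = clean["disease"].value_counts(normalize=True)
```

Checking the class balance early helps decide whether plain accuracy is a reasonable metric for the later modeling step.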

Build, tune and evaluate various machine learning algorithms

I applied the machine learning algorithms covered in the course to the training data to construct disease diagnosis models. I also ran extensive model experiments with hyperparameter tuning.

The first Jupyter notebook covers the Naive Bayes Classifier (NBC), k-Nearest Neighbors (KNN), linear SVM, non-linear SVM, Random Forest and Gradient Boosting Machine. The second Jupyter notebook covers Logistic Regression, Artificial Neural Network/Deep Learning and Decision Tree.
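The tuning workflow can be sketched with scikit-learn's `GridSearchCV`. This is an assumed setup rather than the notebooks' actual code: the data is synthetic (11 features plus a label is a guess based on the 12 columns), KNN stands in for any of the listed models, and the grid values are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the course data (11 features + 1 label is an assumption).
X, y = make_classification(
    n_samples=400, n_features=11, n_informative=5, random_state=0
)
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so each CV fold is scaled on its own training part.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Illustrative hyperparameter grid; the values are not taken from the notebooks.
param_grid = {
    "knn__n_neighbors": [3, 7, 15],
    "knn__weights": ["uniform", "distance"],
}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
# Accuracy of the refit best model on the held-out validation split.
valid_accuracy = search.score(X_valid, y_valid)
```

Swapping in another estimator (for example a random forest) only requires changing the final pipeline step and the matching `param_grid` keys.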

Prediction and Interpretation

After building the classification models, I applied them to the provided test dataset (Disease Prediction Testing.csv) to predict whether each person in it has the disease.
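A minimal sketch of the prediction step follows, assuming a fitted scikit-learn classifier. Synthetic data stands in for the real test file, Logistic Regression is used only as an example model, and the output column names (`row_id`, `predicted_disease`, `probability`) are hypothetical.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in: the first 250 rows act as training data,
# the last 50 as unlabeled test rows.
X, y = make_classification(
    n_samples=300, n_features=11, n_informative=5, random_state=1
)
X_train, y_train = X[:250], y[:250]
X_test = X[250:]

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# One row per test record: hard label plus the estimated probability of class 1.
predictions = pd.DataFrame({
    "row_id": range(len(X_test)),
    "predicted_disease": model.predict(X_test),
    "probability": model.predict_proba(X_test)[:, 1],
})
```

Keeping the probability next to the hard label makes it easier to interpret borderline cases or to apply a threshold other than 0.5.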
