Analysis of Wine Quality and Prediction Using Logistic Regression

Contributors:

Alix, Paramveer, Susannah, Zoe

Project Summary:

This project analyzes and predicts wine quality from physicochemical properties. Using the UCI Wine Quality dataset, we preprocess the data, perform exploratory data analysis, and build machine learning models to predict wine quality. The dataset's features include acidity, alcohol content, and sugar levels, which serve as predictors of each wine's quality score. Cross-validation and hyperparameter tuning are used to optimize model performance.

Data Analysis:

Dataset: The dataset was sourced from the UCI Machine Learning Repository.

Preprocessing: Standardization of numerical features. One-hot encoding for binary categorical features (e.g., color).

Exploratory Data Analysis: Distribution of wine quality scores. Correlation heatmaps to identify relationships between features. Key insights on influential features.
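The exploratory steps above could look roughly like the following sketch. It uses synthetic data generated on the spot (the real analysis uses the UCI dataset), and the column names are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the wine data; not real measurements.
rng = np.random.default_rng(0)
n = 200
alcohol = rng.normal(10.5, 1.0, n)
noise = rng.normal(0.0, 0.7, n)
wine = pd.DataFrame(
    {
        "alcohol": alcohol,
        "volatile_acidity": rng.normal(0.5, 0.15, n),
        # Quality is constructed to depend on alcohol, purely for the demo.
        "quality": np.clip(np.round(6 + 0.8 * (alcohol - 10.5) + noise), 3, 9).astype(int),
    }
)

# Distribution of quality scores: number of wines per score.
quality_counts = wine["quality"].value_counts().sort_index()

# Pairwise Pearson correlations; this matrix is what a heatmap would display.
corr = wine.corr()

# Rank features by the absolute strength of their correlation with quality.
quality_corr = corr["quality"].drop("quality")
ranked = quality_corr.reindex(quality_corr.abs().sort_values(ascending=False).index)

print(quality_counts)
print(ranked)
```

The corr matrix can be passed to any plotting library to draw the heatmap; plotting is left out here so the sketch runs without a display.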

Modeling: Logistic regression was used as the base model. RandomizedSearchCV was applied for hyperparameter optimization. The model was evaluated using metrics such as accuracy, precision, recall, and F1-score.
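As a rough illustration of the modeling step, the sketch below tunes a logistic regression with RandomizedSearchCV and reports the four metrics named above. The synthetic binary data from make_classification, the search range for C, the number of iterations, and the weighted averaging are all assumptions made for the example, not the settings used in the notebook.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the wine features and a binary quality label.
X, y = make_classification(
    n_samples=400, n_features=11, n_informative=5, random_state=123
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=123
)

# Scaling lives inside the pipeline so each CV fold is scaled on its own training part.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Sample the regularization strength C on a log scale (assumed range).
param_distributions = {"logisticregression__C": loguniform(1e-3, 1e3)}

search = RandomizedSearchCV(
    pipe,
    param_distributions,
    n_iter=10,
    cv=5,
    scoring="accuracy",
    random_state=123,
)
search.fit(X_train, y_train)

# Evaluate the refitted best model on the held-out test split.
y_pred = search.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="weighted", zero_division=0
)

print("best params:", search.best_params_)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Weighted averaging is used so the same code also works if quality is kept as a multi-class label rather than a binary one.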

Usage

The first time you run the project, create the conda environment by running the following command from the root of the repository:

conda-lock install --name wine-quality-regressor conda-lock.yml

To run the analysis, start JupyterLab from the root of the repository:

jupyter lab

Open notebooks/wine-quality.ipynb in JupyterLab, select the new wine-quality-regressor kernel, and run all cells.

List of Dependencies:

conda (version 24.9.1 or higher)
conda-lock (version 2.5.7 or higher)
Python package ucimlrepo (version 0.0.7)
jupyterlab (version 4.2.0 or higher)
nb_conda_kernels (version 2.5.1 or higher)
Python and packages listed in environment.yml

