This is a modular Flask web application that provides a prediction endpoint for an ensemble of machine learning models. It takes input data from a web form and returns the predicted result.
The project also includes CI/CD using GitHub Actions for automated build, test, and deployment.
The dataset used for this project is the Student Performance Dataset from Kaggle, which was produced with the data generator from Royce Kimmons: Understanding digital participation divides. Kaggle provides a 1000-row sample; this project instead uses a 9000-row dataset created with the same generator, with duplicates removed.
P.S. The generator produces fictional data.
├── .ebextensions # Elastic Beanstalk configuration files
├── .github/workflows
├── app.py
├── setup.py
├── requirements.txt
├── data
│ ├── data-s.csv
├── notebook
│ ├── EDA.ipynb
│ └── train.ipynb
├── src
│ ├── __init__.py
│ ├── logger.py
│ ├── exception.py
│ ├── utils.py
│ ├── components
│ │ ├── __init__.py
│ │ ├── data_ingestion.py
│ │ ├── data_transformation.py
│ │ ├── model_trainer.py
│ ├── pipeline
│ │ ├── __init__.py
│ │ ├── predit_pipeline.py
├── templates
│ ├── home.html
│ ├── index.html
├── logs
│ ├── daily-wise.log
├── test_app.py
├── .gitignore # Git ignore file
├── README.md # Readme file
To get started with the Flask application, follow the steps below:
- Clone the repository and change into the project directory:
git clone https://github.com/Lagstill/MLProject.git
cd MLProject
- Install the required dependencies:
pip install -r requirements.txt
- Run the Flask application:
python app.py
- Access the website through the browser:
http://localhost:5000/
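As a quick check that the server started, the index page can be requested from a script instead of the browser. The snippet below is a minimal sketch using only the standard library; `is_up` is a hypothetical helper name, and it assumes the app listens on the default address shown above.

```python
import urllib.error
import urllib.request


def is_up(url="http://localhost:5000/", timeout=3.0):
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False
```

Calling `is_up()` while `python app.py` is running in another terminal should return `True`; it returns `False` if nothing is listening on that port.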
This project uses **GitHub Actions** for continuous integration and continuous deployment (CI/CD). The workflows in the ".github/workflows" directory automate the build, test, and deployment processes.
The CI/CD workflow performs the following tasks:
- The workflow is triggered on each push or pull request to the main branch.
- The workflow checks out the source code using the actions/checkout action.
- It sets up the Python environment using the actions/setup-python action and installs the required dependencies.
- The code is linted using flake8 and tested using pytest.
- If all the tests pass, the workflow proceeds to deploy the application.
The deployment workflow can be customized based on your deployment target. For example, it can be configured to deploy to AWS Elastic Beanstalk or any other platform. Update the workflow configuration file (deploy.yml) according to your deployment requirements.
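The lint and test steps can also be reproduced locally before pushing. The following is a rough sketch, not the workflow itself: the exact flake8 and pytest flags used in the workflow file are not shown in this README, so the commands in `CI_STEPS` are assumptions, and `run_steps` is a hypothetical helper.

```python
import subprocess
import sys

# Assumed invocations mirroring the lint and test steps described above.
# The real workflow may pass additional flags.
CI_STEPS = [
    [sys.executable, "-m", "flake8", "."],
    [sys.executable, "-m", "pytest"],
]


def run_steps(steps):
    """Run each command in order and stop at the first failure.

    Returns True only if every step exits with status 0, which is the
    condition under which the workflow would go on to deploy.
    """
    for cmd in steps:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"step failed: {' '.join(cmd)}")
            return False
    return True
```

Running `run_steps(CI_STEPS)` from the repository root requires flake8 and pytest to be installed in the active environment.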
The Flask application provides the following endpoints:
- GET /: Renders the index page with a web form to input the data.
- POST /predict: Accepts the data from the web form, processes it, and returns the predicted result.
The application expects the following data fields in the web form:
- gender: The gender of the student.
- ethnicity: The race/ethnicity of the student.
- parental_level_of_education: The parental level of education.
- lunch: The type of lunch the student has.
- test_preparation_course: Whether the student completed a test preparation course.
- reading_score: The score obtained in the reading test.
- writing_score: The score obtained in the writing test.
- math_score: The score obtained in the math test. This is the prediction target, so it is returned by the model rather than entered in the form.
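A minimal sketch of how the submitted fields might be checked before they reach the model. It assumes the form keys match the names listed above, that math_score is not part of the input because it is the prediction target, and that both scores are numeric. `parse_form` is a hypothetical helper, not a function from this repository.

```python
# Field names are taken from the list above; the actual form keys in the
# templates may differ (assumption).
CATEGORICAL_FIELDS = (
    "gender",
    "ethnicity",
    "parental_level_of_education",
    "lunch",
    "test_preparation_course",
)
SCORE_FIELDS = ("reading_score", "writing_score")


def parse_form(form):
    """Validate a mapping of form values and return a cleaned dict.

    Raises ValueError if a field is missing or empty, or if a score
    cannot be converted to a number.
    """
    cleaned = {}
    for name in CATEGORICAL_FIELDS + SCORE_FIELDS:
        value = str(form.get(name, "")).strip()
        if not value:
            raise ValueError(f"missing field: {name}")
        cleaned[name] = value
    for name in SCORE_FIELDS:
        try:
            cleaned[name] = float(cleaned[name])
        except ValueError:
            raise ValueError(f"{name} must be a number") from None
    return cleaned
```

Because the function only relies on `.get`, it accepts a plain dict as well as Flask's `request.form`.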
The prediction is computed by PredictPipeline, and the response from the /predict endpoint is displayed on the home page.
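The endpoint can also be called from a script rather than the browser. The sketch below builds a form-encoded POST request with the standard library. It assumes the server runs at the local address used earlier and that the response body is the rendered HTML page; `build_predict_request` and `predict` are hypothetical helper names.

```python
import urllib.parse
import urllib.request


def build_predict_request(fields, base_url="http://localhost:5000"):
    """Build a form-encoded POST request for the /predict endpoint."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def predict(fields, base_url="http://localhost:5000", timeout=10.0):
    """Send the request and return the response body as text."""
    request = build_predict_request(fields, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        # Assumes the page is UTF-8 encoded.
        return resp.read().decode("utf-8")
```

With the app running, `predict({...})` takes a dict whose keys are the form fields listed above and returns the page containing the predicted math score.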