Course Project

The goal of this project is to apply everything we learned in this course and build an end-to-end machine learning project.

Remember that to pass the project, you must evaluate 3 peers. If you don't, your project can't be considered complete.

Submitting

Project Cohort #1

Project:

Form: TBA
Deadline: 15 August, 23:00 CEST

Peer reviewing:

Peer review assignments: TBA
Form: TBA
Deadline: 22 August, 23:00 CEST

Project feedback: TBA

Project Cohort #2

Project:

Form: TBA
Deadline: 5 September, 23:00 CEST

Peer reviewing:

Peer review assignments: TBA
Form: TBA
Deadline: 12 September, 23:00 CEST

Project feedback: TBA

Problem statement

For the project, we will ask you to build an end-to-end ML project.

For that, you will need to:

Select a dataset that you're interested in (see datasets.md)
Train a model on that dataset, tracking your experiments
Create a model training pipeline
Deploy the model as a batch job, a web service, or a streaming service
Monitor the performance of your model
Follow the best practices
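
As a rough illustration of the monitoring step in the list above, here is a minimal sketch of a metric check that triggers an action when a threshold is crossed. The names `check_metric` and `record_alert`, the metric name, and the threshold value are all hypothetical. In a real project the callback would typically send a notification or start a retraining workflow in the tool you chose.

```python
from typing import Callable

# Hypothetical helper: compares one monitored metric against a threshold
# and calls the supplied callback when the threshold is exceeded.
def check_metric(
    name: str,
    value: float,
    threshold: float,
    on_violation: Callable[[str, float, float], None],
) -> bool:
    """Return True if `value` exceeds `threshold`, calling `on_violation` first."""
    violated = value > threshold
    if violated:
        on_violation(name, value, threshold)
    return violated


# Stand-in for a real alerting channel (email, Slack, pager, ...).
alerts: list[str] = []


def record_alert(name: str, value: float, threshold: float) -> None:
    alerts.append(f"{name}={value:.2f} exceeded threshold {threshold:.2f}")


# Illustrative values only; they are not taken from any real model.
check_metric("prediction_drift", 0.31, 0.20, record_alert)  # violation
check_metric("prediction_drift", 0.05, 0.20, record_alert)  # within limits
```

Passing the action in as a callback keeps the check independent of the alerting mechanism, so the same function could trigger retraining instead of an alert.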

Technologies

You don't have to limit yourself to technologies covered in the course. You can use alternatives as well:

Cloud: AWS, GCP, Azure or others
Experiment tracking tools: MLflow, Weights & Biases, ...
Workflow orchestration: Prefect, Airflow, Flyte, Kubeflow, Argo, ...
Monitoring: Evidently, WhyLabs/whylogs, ...
CI/CD: GitHub Actions, GitLab CI/CD, ...
Infrastructure as code (IaC): Terraform, Pulumi, CloudFormation, ...

If you use something that wasn't covered in the course, be sure to explain what the tool does.

If you're not certain about some tools, ask in Slack.

Peer review criteria

(This is still a draft. Feedback is welcome.)

Problem description
- 0 points: Problem is not described
- 1 point: Problem is described, but only briefly or not clearly
- 2 points: Problem is well described and it's clear what problem the project solves
Cloud
- 0 points: Cloud is not used, things run only locally
- 2 points: The project is developed on the cloud
- 4 points: The project is developed on the cloud and IaC tools are used for provisioning the infrastructure
Experiment tracking and model registry
- 0 points: No experiment tracking or model registry
- 2 points: Experiments are tracked or models are registered in the registry
- 4 points: Both experiment tracking and model registry are used
Workflow orchestration
- 0 points: No workflow orchestration
- 2 points: Basic workflow orchestration
- 4 points: Fully deployed workflow
Model deployment
- 0 points: Model is not deployed
- 2 points: Model is deployed but only locally
- 4 points: The model deployment code is containerized and could be deployed to the cloud, or special tools for model deployment are used
Model monitoring
- 0 points: No model monitoring
- 2 points: Basic model monitoring that calculates and reports metrics
- 4 points: Comprehensive model monitoring that sends alerts or runs a conditional workflow (e.g. retraining, generating a debugging dashboard, switching to a different model) if a defined metric threshold is violated
Reproducibility
- 0 points: No instructions on how to run the code at all
- 2 points: Some instructions are there, but they are not complete
- 4 points: Instructions are clear, it's easy to run the code, and the code works
Best practices
- There are unit tests (1 point)
- There is an integration test (1 point)
- Linter and/or code formatter are used (1 point)
- There's a Makefile (1 point)
- There are pre-commit hooks (1 point)
- There's a CI/CD pipeline (2 points)
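
To see how the criteria above add up, here is a small sketch that tallies the maximum points per criterion and sums a reviewer's scores. The maximums are copied from the draft rubric, so they may change. The names `MAX_POINTS`, `MAX_TOTAL`, and `total_score` are hypothetical and not part of any course tooling.

```python
# Maximum points per criterion, copied from the draft rubric above.
MAX_POINTS = {
    "Problem description": 2,
    "Cloud": 4,
    "Experiment tracking and model registry": 4,
    "Workflow orchestration": 4,
    "Model deployment": 4,
    "Model monitoring": 4,
    "Reproducibility": 4,
    "Best practices": 7,  # 1 + 1 + 1 + 1 + 1 + 2
}

# Highest score a project can get under this draft of the rubric.
MAX_TOTAL = sum(MAX_POINTS.values())


def total_score(scores: dict[str, int]) -> int:
    """Sum per-criterion scores, rejecting unknown criteria and out-of-range values.

    Note: this only checks the 0..max range; it does not enforce the
    specific steps (for example 0, 2, 4) listed for each criterion.
    """
    for criterion, points in scores.items():
        if criterion not in MAX_POINTS:
            raise KeyError(f"Unknown criterion: {criterion}")
        if not 0 <= points <= MAX_POINTS[criterion]:
            raise ValueError(f"{criterion}: {points} is outside 0..{MAX_POINTS[criterion]}")
    return sum(scores.values())


# Illustrative review: criteria left out simply contribute 0 points.
example = {"Problem description": 1, "Cloud": 4, "Best practices": 3}
print(total_score(example), "out of", MAX_TOTAL)
```

Under the current draft, the maximum total comes to 33 points.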
