Supervised Learning

This repository holds the supervised learning content that I teach at Europeia University in a course on Big Data and Data Analytics.

The program (in Portuguese) can be found here.

The notebooks on this repository are separated into theory and practice.

Theory

  1. Data Collection
    • 1.1. Data Sources
    • 1.2. Data Collection Considerations
  2. Data Exploration and Preparation
    • 2.1. Data Exploration
    • 2.2. Data Preparation/Cleaning
  3. Split Data into Training and Test Sets
    • 3.1. Holdout Method
    • 3.2. Cross Validation
    • 3.3. Data Leakage
    • 3.4. Best Practices
  4. Choose a Supervised Learning Algorithm
    • 4.1. Consider algorithm categories
    • 4.2. Evaluate algorithm characteristics
    • 4.3. Try multiple algorithms
  5. Train the Model
    • 5.1. Objective Function (Loss/Cost Function)
    • 5.2. Optimization Algorithms
    • 5.3. Overfitting and Underfitting
  6. Evaluate Model Performance
    • 6.1. Performance Metrics for Regression Models
    • 6.2. Performance Metrics for Classification Models
  7. Model Tuning and Selection
    • 7.1. Hyperparameter Tuning
    • 7.2. Ensemble Methods
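
As a rough illustration of how steps 3, 5, and 6 of the outline fit together, here is a minimal sketch and not the course's own code. It uses only the Python standard library so it runs without extra packages. The synthetic data, the function names (`make_data`, `holdout_split`, `fit`, `mse`), and the learning rate are all assumptions made for this example.

```python
import random


def make_data(n=200, seed=0):
    """Hypothetical synthetic data: y = 3x + 2 plus small Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [3 * x + 2 + rng.gauss(0, 0.1) for x in xs]
    return xs, ys


def holdout_split(xs, ys, test_fraction=0.2, seed=0):
    """Step 3.1 (holdout method): shuffle indices, reserve a fraction for testing."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(idx) * test_fraction)
    test, train = idx[:n_test], idx[n_test:]
    return (
        [xs[i] for i in train], [ys[i] for i in train],
        [xs[i] for i in test], [ys[i] for i in test],
    )


def mse(w, b, xs, ys):
    """Steps 5.1 and 6.1: mean squared error, used as loss and as metric."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit(xs, ys, lr=0.1, epochs=500):
    """Step 5.2: batch gradient descent on the MSE of the model y = w*x + b."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs, ys = make_data()
x_train, y_train, x_test, y_test = holdout_split(xs, ys)
w, b = fit(x_train, y_train)  # train on the training set only
test_mse = mse(w, b, x_test, y_test)  # evaluate on held-out data
print(f"w={w:.3f} b={b:.3f} test MSE={test_mse:.4f}")
```

The model is fitted on the training split only and scored on data it has never seen, which is the point of the holdout method in 3.1 and the simplest guard against the leakage discussed in 3.3.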

Practice

Clicking on the links below will open the notebooks in Colab and allow you to run them in the cloud.

Get Started

Ensure that you have conda installed.

  1. Create a new environment
conda create -n ml
  2. Activate the new environment
conda activate ml
  3. Install poetry with conda
conda install poetry
  4. Install all packages
poetry install

Disclaimer

Much can be improved, so feel free to send a PR with suggestions.