English Premier League Match Outcome Forecasting with Random Forest

Overview

This repository contains a machine learning model, based on the Random Forest algorithm, for forecasting English Premier League match outcomes. The model uses rolling-average features that capture each team's recent form and is retrained dynamically as new results become available, with the aim of improving precision for both home and away predictions.

Project Structure

The project is organized into different components:

Data Collection and Preprocessing: The data folder contains scripts for collecting and preprocessing the EPL match data. To scrape the data yourself, run the scrape.py script. Alternatively, skip scraping and use the included matches.csv file to train and test the model by running the prediction_model.ipynb notebook.
Feature Engineering: The features folder includes tools for creating relevant features for the machine learning model. This involves calculating rolling averages and other dynamic features that capture the teams' recent performances.
Model Training: The models folder contains the main machine learning model implemented using the Random Forest algorithm.
Prediction: The predict folder provides utilities for making predictions on new match data using the trained model.
Evaluation: The evaluation folder includes scripts and notebooks for evaluating the model's performance on historical data. The evaluation process helps fine-tune the model and understand its strengths and weaknesses.
Dynamic Retraining: The dynamic_retraining folder contains scripts for dynamic retraining, which periodically refits the model on new match data so that its predictions reflect recent results rather than only older seasons.
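
The rolling averages mentioned under Feature Engineering can be sketched as follows. This is a minimal illustration, not the repository's actual code: it assumes pandas is installed, and the column names team, date, and gf (goals for) are hypothetical, since the layout of matches.csv is not described here. Each series is shifted by one match before averaging, so a row only sees results from earlier matches.

```python
import pandas as pd


def add_rolling_averages(matches, cols, window=3):
    """Add a <col>_rolling column per stat: each team's mean over its
    previous `window` matches, excluding the current match.

    Assumes hypothetical columns "team" and "date" exist in `matches`.
    """
    matches = matches.sort_values(["team", "date"]).copy()
    for col in cols:
        # shift(1) drops the current match from its own average,
        # so the feature uses only information known before kick-off.
        matches[f"{col}_rolling"] = matches.groupby("team")[col].transform(
            lambda s: s.shift(1).rolling(window).mean()
        )
    # Rows without enough history have no rolling value and are dropped.
    return matches.dropna(subset=[f"{c}_rolling" for c in cols])
```

Dropping the first few matches of each team is a deliberate trade-off in this sketch: those rows have no complete window, and filling them with a default value would give the model misleading inputs.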

Requirements

Python 3.x
Required Python packages are listed in the requirements.txt file. Install them using:

pip install -r requirements.txt

Usage

Data Collection and Preprocessing:
- To scrape EPL match data, run the scrape.py script in the data folder.
- Alternatively, use the pre-existing matches.csv file to train and test the model.
Feature Engineering:
- Use the tools in the features folder to generate features, such as rolling averages, for the machine learning model.
Model Training:
- Train the Random Forest model using the implementation in the models folder.
Prediction:
- Use the utilities in the predict folder to make predictions for a specific set of matches.
Evaluation:
- Use the scripts and notebooks in the evaluation folder to assess the model's performance on historical data.
Dynamic Retraining:
- Periodically run the scripts in the dynamic_retraining folder to update the model with new data.
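
The train, predict, evaluate, and retrain steps above can be sketched as one walk-forward loop. This is a hedged illustration under stated assumptions, not the code in the dynamic_retraining folder: the function names walk_forward and precision, the initial and step parameters, and the MajorityClassModel stand-in are all hypothetical. The loop accepts any estimator that exposes fit and predict, and it assumes the rows are already in chronological order.

```python
from collections import Counter


class MajorityClassModel:
    """Hypothetical stand-in estimator: always predicts the most common
    training label. Swap in a real classifier with fit/predict."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def precision(y_true, y_pred, positive=1):
    """Share of predicted positives that were actually positive."""
    hits = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not hits:
        return None  # undefined when nothing was predicted positive
    return sum(1 for t in hits if t == positive) / len(hits)


def walk_forward(X, y, make_model, initial=4, step=2):
    """Retrain on every match seen so far, then score the next `step`
    matches. Rows must be sorted from oldest to newest."""
    scores = []
    for start in range(initial, len(X), step):
        # A fresh model per window: no state carries over between refits.
        model = make_model().fit(X[:start], y[:start])
        y_pred = model.predict(X[start:start + step])
        scores.append(precision(y[start:start + step], y_pred))
    return scores
```

With scikit-learn installed, passing make_model=lambda: RandomForestClassifier(n_estimators=50, random_state=1) should work in place of the stand-in, since that class follows the same fit and predict convention. Because each window is scored only on matches that come after its training data, the resulting precision values approximate how the model would have performed if it had been retrained during the season.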

Contributing

If you would like to contribute to this project, please follow the standard GitHub flow: fork the repository, create a branch, make your changes, and submit a pull request.
