r1cky0/Italian-energy-consumption-prediction

 
 

Italian Energy Consumption Prediction

Model Identification and Data Analysis course project at the University of Pavia
Developed in collaboration with: @simoneghiazzi, @riccardocrescenti, @chiarabertocchi and @lucacolombo97

General Overview

Goal: identification of an annual profile model for the long-term prediction of the Italian energy consumption time series

The provided dataset contains two years of Italian energy consumption data.
An initial inspection of the data shows a periodic pattern, both annual and weekly, so the model is built on Fourier series.
The first step is to make the series stationary in the mean by detrending it.

image
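
The detrending step can be illustrated with a minimal Python sketch. It assumes a linear trend fitted by least squares, which the text above does not specify, and the function names and sample series are hypothetical.

```python
def fit_linear_trend(y):
    """Least-squares line y ~ intercept + slope * t, with t = 0, 1, ..., len(y) - 1."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (yi - y_mean) for t, yi in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope


def detrend(y):
    """Return (detrended series, fitted trend). A linear trend is an assumption."""
    intercept, slope = fit_linear_trend(y)
    trend = [intercept + slope * t for t in range(len(y))]
    detrended = [yi - ti for yi, ti in zip(y, trend)]
    return detrended, trend


# Illustrative data only: a pure linear ramp, so the detrended series is ~0.
series = [10.0 + 0.5 * t for t in range(10)]
detrended, trend = detrend(series)
```

After subtraction the series has approximately zero mean, which is what lets the Fourier regressors capture only the periodic part.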

Model Development

For training and validation, the dataset is split as follows:

  • Training: energy consumption data of the first year
  • Validation: second year energy consumption data

Two main models were developed, one for each periodicity detected in the data:

  1. Weekly periodicity model: regressor matrix Phi_settimanale ("settimanale" is Italian for "weekly"), consisting of 6 harmonics of period 7 days

image

  2. Annual periodicity model: 12 annual models were developed, with up to 24 harmonics

image
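
As a rough illustration of how a Fourier regressor matrix such as Phi_settimanale can be built and fitted, here is a minimal Python sketch using ordinary least squares. The helper names, the number of harmonics in the example, and the synthetic data are assumptions and are not taken from the repository.

```python
import math


def fourier_row(t, period, n_harmonics):
    """Regressors [cos(2*pi*k*t/period), sin(2*pi*k*t/period)] for k = 1..n_harmonics."""
    row = []
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * math.pi * k * t / period
        row.extend([math.cos(angle), math.sin(angle)])
    return row


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(phi, y):
    """Ordinary least squares via the normal equations (Phi^T Phi) theta = Phi^T y."""
    n = len(phi[0])
    ptp = [[sum(row[i] * row[j] for row in phi) for j in range(n)] for i in range(n)]
    pty = [sum(row[i] * yi for row, yi in zip(phi, y)) for i in range(n)]
    return solve_linear(ptp, pty)


# Synthetic detrended series with a purely weekly pattern (illustrative only).
days = list(range(28))
y = [2.0 * math.cos(2 * math.pi * t / 7) - math.sin(2 * math.pi * 2 * t / 7) for t in days]
phi_weekly = [fourier_row(t, period=7, n_harmonics=3) for t in days]
theta = fit_least_squares(phi_weekly, y)
```

The same `fourier_row` helper with `period=365` would give the annual regressors; each candidate annual model then differs only in `n_harmonics`.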

Additive Model Validation

A new validation regressor matrix Phi was built for the weekly model from the days of the week in the validation data. Twelve final models were then formed, each the sum of the weekly validation model and one of the annual models identified in the training phase.
The model that best represents the data is chosen using the AIC and cross-validation tests.
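
To make the selection step concrete, the following Python sketch ranks candidate models by AIC. It assumes one common least-squares form of the criterion, N * ln(SSR / N) + 2 * n_params; the exact formula used in the project is not stated here, and the candidate values are hypothetical.

```python
import math


def aic(ssr, n_samples, n_params):
    """AIC in one common least-squares form: N * ln(SSR / N) + 2 * n_params."""
    return n_samples * math.log(ssr / n_samples) + 2 * n_params


def select_by_aic(candidates, n_samples):
    """candidates: list of (n_params, ssr). Returns (index of lowest AIC, all scores)."""
    scores = [aic(ssr, n_samples, n_params) for n_params, ssr in candidates]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical candidates: adding parameters lowers SSR, but with diminishing returns.
candidates = [(2, 500.0), (4, 300.0), (6, 299.0)]
best, scores = select_by_aic(candidates, n_samples=100)
```

The penalty term is what stops the largest model from winning automatically: the third candidate fits slightly better than the second but pays for two extra parameters.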

AIC Test:

image

Cross-Validation Test:

image

From the tests we see that the best annual model is model 10, consisting of 20 regressors.
For this model we calculated:

  • MSE = 3.836955832
  • RMSE = 1.958814904
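
For reference, these metrics follow from the validation residuals as in the minimal Python sketch below. The function name and the sample values are hypothetical.

```python
import math


def error_metrics(y_true, y_pred):
    """Return (SSR, MSE, RMSE) of the residuals y_true - y_pred."""
    residuals = [a - b for a, b in zip(y_true, y_pred)]
    ssr = sum(e * e for e in residuals)
    mse = ssr / len(residuals)
    return ssr, mse, math.sqrt(mse)


# Illustrative values only, not the project's data.
ssr, mse, rmse = error_metrics([30.0, 32.0, 28.0, 31.0], [29.0, 33.0, 30.0, 31.0])
```

The reported figures are consistent with this definition: the square root of 3.836955832 is approximately 1.958814904.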

Validation Data 3D Plot:

image

Final Model Surface:

image

The Holiday Problem

The error histogram shows that most errors are concentrated around zero. However, there is an "anomalous" zone between -6 and -10, which corresponds to the errors made on holidays.

image

As the graph shows, the validation epsilon (the prediction error) fluctuates most around the holidays, which are highlighted as follows:

  • Blue: Easter
  • Yellow: mid-August
  • Red: Christmas

image

The final model was then retrained on the data with the holiday periods removed (Christmas and the mid-August holidays, which fall on fixed dates) to improve the prediction of "normal" days. The mean of the data values over these holiday periods was then computed and added to the final model as a "correction index".
With this change, the validation metrics improve:

  • SSR = 1.186772448087091e+03
  • MSE = 3.251431364622168
  • RMSE = 1.803172583149535
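
The text does not spell out exactly how the "correction index" enters the model. The Python sketch below shows one possible reading, assumed here for illustration: the correction is the mean residual (data minus model) over the holiday days, added to the prediction on those days only. All names and values are hypothetical.

```python
def holiday_correction(y, y_model, holiday_days):
    """Mean residual (data minus model) over the given holiday day indices."""
    residuals = [y[d] - y_model[d] for d in holiday_days]
    return sum(residuals) / len(residuals)


def predict_with_correction(y_model, holiday_days, correction):
    """Add the correction to the model output on holiday days only."""
    holidays = set(holiday_days)
    return [v + correction if d in holidays else v for d, v in enumerate(y_model)]


# Illustrative data: consumption drops on days 3 and 4, which the model misses.
y_model = [30.0] * 8
y = [30.0, 30.0, 30.0, 22.0, 24.0, 30.0, 30.0, 30.0]
holiday_days = [3, 4]  # 0-based indices, an assumption of this sketch

correction = holiday_correction(y, y_model, holiday_days)
corrected = predict_with_correction(y_model, holiday_days, correction)
```

Because the fit itself excludes the holiday days, the periodic model is no longer pulled down by them, and the offset handles the holidays separately.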

Final Function

The final function takes two scalar inputs (day of the year, day of the week) and returns the predicted energy consumption. It consists of:

  • A method for handling null (missing) values
  • Detrending: estimation of the trend over the 2 years
  • Identification of the model on the 2 years of supplied data
  • Generation of the matrix containing all possible (day of the year, day of the week) combinations
  • Trend extension: the last value of the trend is extended and added to the forecast data

The forecast is then read from the matrix using the two input indices.
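
The lookup step can be sketched in Python as follows. The component functions `annual` and `weekly`, the constant `trend_last`, and the 1-based indexing are all assumptions made for illustration; they stand in for the identified models and the extended trend value.

```python
import math


def build_forecast_matrix(annual, weekly, trend_last, n_days=365, n_weekdays=7):
    """Matrix of forecasts for every (day of year, day of week) pair.

    annual(d) and weekly(w) are the periodic components for 1-based indices;
    trend_last is the last trend value, held constant over the forecast horizon.
    """
    return [
        [annual(d) + weekly(w) + trend_last for w in range(1, n_weekdays + 1)]
        for d in range(1, n_days + 1)
    ]


def forecast(matrix, day_of_year, day_of_week):
    """Read the forecast using 1-based indices."""
    return matrix[day_of_year - 1][day_of_week - 1]


# Hypothetical components standing in for the identified models.
def annual(d):
    return 2.0 * math.cos(2.0 * math.pi * d / 365)


def weekly(w):
    return -1.0 if w >= 6 else 0.5


trend_last = 30.0
matrix = build_forecast_matrix(annual, weekly, trend_last)
value = forecast(matrix, day_of_year=1, day_of_week=7)
```

Precomputing the matrix means each call reduces to an index lookup, at the cost of storing 365 x 7 values.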

Languages

  • MATLAB 100.0%