Germany's dependence on natural gas highlights the need for accurate forecasting methods. Despite advances, many traditional forecasting models lack reproducibility due to limited data access. This project addresses this gap by providing open-source, publicly available forecasts of Germany's natural gas consumption. Our focus is on using publicly available data, applying state-of-the-art models for time-efficient forecast implementation, and ensuring the continuous publication and reproducibility of these forecasts.
This repository is dedicated to the backend of this project, hosting the forecast pipeline and data analysis notebooks.
```sh
git clone git@github.com:f-linus/natural_gas_consumption_modelling.git
cd natural_gas_consumption_modelling
```
This project requires Python 3.9 or 3.10. To install Python, follow the instructions on the Python website.
pyenv is a convenient tool to manage multiple Python versions on your machine. To install pyenv, follow the instructions on the pyenv GitHub page. (There is also a fork for Windows available here)
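As a sketch, assuming pyenv is installed, a compatible interpreter could be set up for this repository as follows (3.10.13 is just one recent 3.10 patch release; any 3.9/3.10 release works):

```sh
# install a supported interpreter and pin it for this repository
pyenv install 3.10.13
pyenv local 3.10.13
```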
This project uses Poetry for dependency management. To install Poetry, follow the instructions on the Poetry website.
To install the project dependencies, run

```sh
poetry install
```

This will create a virtual environment and install all dependencies specified in `pyproject.toml`.
The environment can be activated by running

```sh
poetry shell
```

or, when using VS Code, by selecting the environment.
Installing the dependencies allows you to run all notebooks in the `notebooks` folder, which cover the data analysis part of the project. Running the full forecasting pipeline additionally requires a Google Cloud Storage backend.
The project uses Google Cloud Storage as a backend for the forecasting pipeline. To use the pipeline, you need to create a Google Cloud Storage bucket.

To create a Google Cloud Storage bucket, you need a Google Cloud Platform account. You can create one here.

First, create a new project or select an existing one. You can do this by following the instructions here.

To use the Google Cloud Storage API, you need to activate it. You can do this by following the instructions here.

To create a bucket, follow the instructions here. Name the bucket `natural-gas-consumption-modelling` and select a region close to you.
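Alternatively, both steps can be done from the command line with the `gcloud` CLI (installation is described below); a minimal sketch, with `europe-west3` as an assumed example region:

```sh
# enable the Cloud Storage API and create the bucket
gcloud services enable storage.googleapis.com
gcloud storage buckets create gs://natural-gas-consumption-modelling --location=europe-west3
```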
To use the storage backend from a local version of the pipeline, you need to connect to GCP. You can do this by using the `gcloud` command line tool. To install `gcloud`, follow the instructions here.

If not already done, initialise `gcloud` by running

```sh
gcloud init
```

and follow the instructions.
This allows all GCP Python dependencies to connect to GCP and use the storage backend.
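Note that the Python client libraries look for Application Default Credentials; if the pipeline cannot authenticate after `gcloud init`, the following usually resolves it:

```sh
# store Application Default Credentials for the Python client libraries
gcloud auth application-default login
```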
The repository encompasses both the notebooks, which cover data analysis and model discussion, and the full forecasting pipeline in the `src` folder.
The notebooks are located in the `notebooks` folder. Running them requires the dependencies to be installed as described in the Setup section; however, the notebooks do not require a GCP connection. They can be run natively in Jupyter or JupyterLab, or in VS Code.
To run the notebooks through Jupyter in the browser, run

```sh
cd notebooks
poetry shell
jupyter notebook
```

From the browser, notebooks in all subfolders (`data_overview`, `data_analysis` and `modelling`) can be run.
The forecasting pipeline is located in the `src` folder. It can be run locally or deployed to GCP. It requires the dependencies to be installed as described in the Setup section and a GCP connection as described in the GCP section.
The pipeline can be run locally by running

```sh
poetry shell
python -m src.daily_model_run
```
For the example frontend at https://linusfolkerts.com, the pipeline is deployed on GCP. The model runs daily as a Cloud Run Job, triggered through Cloud Scheduler, and stores its results in a Cloud Storage bucket.

In the following, we will go through the steps to deploy a new version of the pipeline to GCP.
To use Cloud Run, you need to activate it. You can do this by following the instructions here.

To use Cloud Build, you need to activate it. You can do this by following the instructions here.
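Both APIs can also be enabled via the `gcloud` CLI:

```sh
# enable the Cloud Run and Cloud Build APIs for the current project
gcloud services enable run.googleapis.com cloudbuild.googleapis.com
```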
To connect the repository to Cloud Build, follow the instructions here. There, select this repository (or your fork) and create a new Cloud Build trigger. This builds a new Docker image of the pipeline whenever a new commit is pushed to the repository.
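Equivalently, once the repository is connected, a trigger can be created from the command line. A sketch, assuming a `cloudbuild.yaml` at the repository root and `main` as the default branch (the trigger name is hypothetical):

```sh
# build on every push to main
gcloud builds triggers create github \
  --name=natural-gas-pipeline-build \
  --repo-owner=f-linus \
  --repo-name=natural_gas_consumption_modelling \
  --branch-pattern='^main$' \
  --build-config=cloudbuild.yaml
```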
To create a new Cloud Run Job, follow the instructions here. There, select the Docker image built in the previous step to create a Cloud Run Job that executes the whole forecasting pipeline and stores the results in the Cloud Storage bucket.
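The same can be done with the CLI; a sketch with hypothetical job and image names, assuming the image was pushed to the project's registry and the region matches the bucket:

```sh
# create the job from the image built by Cloud Build
gcloud run jobs create natural-gas-pipeline-job \
  --image=gcr.io/PROJECT_ID/natural-gas-pipeline \
  --region=europe-west3

# optional: run it once to verify it works
gcloud run jobs execute natural-gas-pipeline-job --region=europe-west3
```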
To trigger the Cloud Run Job daily, you can create a Cloud Scheduler job; follow the instructions here. There, select the Cloud Run Job created in the previous step and set the frequency to daily. For specific instructions on how to trigger the created job, see here.
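For reference, such a schedule can also be created with the CLI. The Scheduler job calls the Cloud Run Admin API over HTTP, authenticated with a service account that is allowed to run the job; all names below are placeholders:

```sh
# run the job every day at 05:00 (cron syntax)
gcloud scheduler jobs create http natural-gas-pipeline-daily \
  --location=europe-west3 \
  --schedule='0 5 * * *' \
  --http-method=POST \
  --uri='https://europe-west3-run.googleapis.com/apis/run.googleapis.com/v1/namespaces/PROJECT_ID/jobs/natural-gas-pipeline-job:run' \
  --oauth-service-account-email=SCHEDULER_SA@PROJECT_ID.iam.gserviceaccount.com
```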
The pipeline is now deployed and will run daily. Results are stored in the Cloud Storage bucket and can be accessed through the Cloud Storage console. To make them accessible to a frontend, you can make the bucket public; to do this, follow the instructions here.
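A one-liner to grant public read access to the bucket (note that this makes every object in it world-readable, so only do this for data meant to be public):

```sh
# allow anyone on the internet to read objects in the bucket
gsutil iam ch allUsers:objectViewer gs://natural-gas-consumption-modelling
```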
An example frontend is provided at https://linusfolkerts.com (GitHub).
Historical data is already partly included in the repository. Data is provided through the following sources:
Copernicus Climate Change Service (2020): Climate and energy indicators for Europe from 1979 to present derived from reanalysis. Copernicus Climate Change Service (C3S) Climate Data Store (CDS). 10.24381/cds.4bd77450 (Accessed on 02-Mar-2023)
https://cds.climate.copernicus.eu/cdsapp#!/dataset/sis-energy-derived-reanalysis
Open-Meteo: Historical Weather API
License: https://open-meteo.com/en/features#terms
Open-Meteo: Free Weather API
License: https://open-meteo.com/en/features#terms
Trading Hub Europe GmbH (2023): Imbalance prices. Format: CSV (Accessed on 03-Mar-2023)
https://www.tradinghub.eu/en-gb/Publications/Prices/Imbalance-Prices
Trading Hub Europe GmbH (2023): Archive of publications for the former GASPOOL market area. Prices for compensation energy. Format: CSV (Accessed on 03-Mar-2023)
https://www.tradinghub.eu/en-gb/Download/Archive-GASPOOL#1301100-prices-fees-and-charges
Trading Hub Europe GmbH (2023): Archive of publications for the former NCG market area. Imbalance prices according to GaBi Gas 2.0. Format: XML (Accessed on 03-Mar-2023)
https://www.tradinghub.eu/en-gb/Download/Archive-NetConnect-Germany#1306111-conversion
U.S. Energy Information Administration, Crude Oil Prices: Brent - Europe [DCOILBRENTEU], retrieved from FRED, Federal Reserve Bank of St. Louis; March 1, 2023.
https://fred.stlouisfed.org/series/DCOILBRENTEU
Ember (2023): European wholesale electricity price data. Wholesale day-ahead electricity price data for European countries, sourced from ENTSO-e and cleaned. Format: CSV (Accessed on 03-Mar-2023)
https://ember-climate.org/data-catalogue/european-wholesale-electricity-price-data/
European Energy Exchange AG (2023): EUA Emission Spot Primary Market Auction Report - History. Format: XLS/XLSX (Accessed on 04-Mar-2023)
https://www.eex.com/en/market-data/environmental-markets/eua-primary-auction-spot-download
GIE (Gas Infrastructure Europe): GIE AISBL 2022. Aggregated Gas Storage Inventory (AGSI): Germany. Format: CSV (Accessed on 03-Mar-2023)
https://agsi.gie.eu/data-overview/DE
Trading Hub Europe GmbH (2023): Publication of the aggregate consumption data: Aggregated consumption data. Format: CSV (Accessed on 04-Mar-2023)
https://www.tradinghub.eu/en-gb/Publications/Transparency/Aggregated-consumption-data
Trading Hub Europe GmbH (2023): Archive of publications for the former GASPOOL market area. Other: Aggregated Consumption Data Market Area GASPOOL (CSV File). Format: CSV (Accessed on 04-Mar-2023)
https://www.tradinghub.eu/en-gb/Download/Archive-GASPOOL#1301161-other
Trading Hub Europe GmbH (2023): Archive of publications for the former NCG market area. Other: Aggregated consumption data (CSV File). Format: CSV (Accessed on 04-Mar-2023)
https://www.tradinghub.eu/en-gb/Download/Archive-NetConnect-Germany#1306157-other