This presentation is about forecasting with sktime.
It discusses different types of forecasting techniques. Both classical and contemporary techniques can be used successfully in practice, but it is important to understand the tradeoffs of choosing one over the other.
sktime provides a unified interface to use a variety of forecasting methods in composable forecasting pipelines for different types of problems: univariate, multivariate, panel and hierarchical.
sktime is easily extensible by anyone, and interoperable with the pydata/numfocus stack.
This presentation has the following parts:
- a general introduction to time series and forecasting
- an overview of what sktime is, and how it makes using different forecasting algorithms easier
- common classical forecasting techniques, and how they are commonly used
- machine learning for forecasting, and common data processing steps to make them useful (reduction, pipelines)
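The reduction step named in the last item recasts forecasting as ordinary tabular regression: each row holds a window of past values, and the target is the value that follows. The sketch below shows the idea in plain Python. It is not sktime's implementation, and the helper name `sliding_windows` and the toy series are hypothetical.

```python
def sliding_windows(series, window_length):
    """Build (X, y) rows from a series.

    Each row of X holds `window_length` consecutive values,
    and the matching entry of y is the value that follows them.
    """
    X, y = [], []
    for start in range(len(series) - window_length):
        X.append(series[start:start + window_length])
        y.append(series[start + window_length])
    return X, y


# Toy series, made up for illustration.
series = [10, 12, 14, 16, 18, 20]
X, y = sliding_windows(series, window_length=3)

for features, target in zip(X, y):
    print(features, "->", target)
```

Any tabular regressor could then be trained on `X` and `y`. In sktime, this kind of wrapping is available through `make_reduction` in `sktime.forecasting.compose`.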
Also check out our notebooks from the half-day sktime introduction workshop at pydata Prague 2023!
The slides for this presentation are made from the notebook titled forecasting.ipynb.
There are several ways to run the accompanying notebook:
- Run the notebooks in the cloud on Binder - for this you don't have to install anything!
- Run the notebooks on your machine. Clone this repository, get conda, install the required packages (sktime, seaborn, jupyter) in an environment, and open the notebooks with that environment. For detailed instructions, see below. For troubleshooting, see sktime's more detailed installation instructions.
- Or, use python venv, and/or an editable install of this repo as a package. Instructions below.
Please let us know on the sktime discord if you have any issues during the conference, or join to ask for help anytime.
This presentation is about forecasting with classical and machine learning methods, and how they can be used successfully in different settings. It's also about how to implement different forecasting techniques with sktime, a unified interface for building end-to-end time series solutions. Forecasting is a domain that's undergone rapid improvements in both technique and theory, and its challenges make it a difficult domain to master.
What makes forecasting a unique problem for machine learning algorithms?
Under what circumstances would you expect an ML algorithm to outperform a traditional forecasting algorithm?
What are common challenges for successfully building forecasting pipelines with complicated datasets?
This presentation is designed to answer these questions in an engaging and informative style that will make it easy to properly implement forecasting solutions in a way that's powerful and effective.
sktime is not just a package, but also an active community which aims to be welcoming to new joiners.
We invite anyone to get involved as a developer, user, supporter (or any combination of these).
- Europython 2023 - General sktime introduction, half-day workshop
- PyCon Prague 2023 - Forecasting, Advanced Pipelines, Benchmarking
- Pydata Amsterdam 2023 - Probabilistic prediction, forecasting, evaluation
- ODSC Europe 2023 - Forecasting, Pipelines, and ML Engineering
- Pydata London 2023 - Time Series Classification, Regression, Distances & Kernels
- Pydata London 2022 - How to implement your own estimator in sktime
If you're interested in contributing to sktime, you can find out more about how to get involved here.
Any contributions are welcome, not just code!
To run the notebooks locally, you will need:
- a local repository clone
- a python environment with required packages installed
To clone the repository locally:
git clone https://github.com/sktime/sktime-presentation-pydata-nyc-2023.git
With conda:
- Create a conda environment:
conda create -y -n pydata_nyc_sktime python=3.9
- Install required packages:
conda install -y -n pydata_nyc_sktime pip sktime seaborn jupyter pmdarima statsmodels
- Activate your environment: `conda activate pydata_nyc_sktime`
- If using jupyter: make the environment available in jupyter:
python -m ipykernel install --user --name=pydata_nyc_sktime
With python venv:
- Create a python virtual environment:
python -m venv pydata_nyc_sktime
- Activate your environment:
source pydata_nyc_sktime/bin/activate
- Install the requirements:
pip install sktime seaborn jupyter pmdarima statsmodels
- If using jupyter: make the environment available in jupyter:
python -m ipykernel install --user --name=pydata_nyc_sktime
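After following either set of steps, a quick way to confirm the environment is usable is to check that the packages can be found. The script below is a small sketch for that purpose and is not part of the repository. The helper `missing_packages` is hypothetical, and `ipykernel` is checked in place of `jupyter` on the assumption that it is installed alongside it.

```python
from importlib.util import find_spec

# Top-level modules corresponding to the install commands above.
# Assumption: ipykernel stands in for the jupyter package.
REQUIRED = ["sktime", "seaborn", "ipykernel", "pmdarima", "statsmodels"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Run it with the `pydata_nyc_sktime` environment activated. Any package it reports as missing can be installed with the commands listed above.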