Jaunty Estimation of Hierarchical Time Series Clustering
python setup.py install
JET is a scikit-learn estimator that subclasses BaseEstimator and ClusterMixin. Its fit_predict function expects the time series to cluster as a list (List[np.ndarray]). The available time series distance measures are Shape Based Distance (SBD), Move Split Merge (MSM), and Dynamic Time Warping (DTW). These measures handle only univariate time series, so with the built-in measures JET handles only univariate time series, too.
import numpy as np
from jet import JET, JETMetric
from scipy.cluster.hierarchy import dendrogram
import matplotlib.pyplot as plt
# generate 100 random example time series with lengths from 30 to 49 (the upper bound of np.random.randint is exclusive)
list_of_time_series = [np.random.rand(np.random.randint(30, 50)) for _ in range(100)]
jet = JET(
    n_clusters=10,  # number of clusters to find: $c$ in paper
    n_pre_clusters=None,  # number of pre-clusters to find: $c_{pre}$ in paper; default is $3\sqrt{n}$ (3*np.sqrt(len(X))) if None is set
    n_jobs=1,  # number of parallel jobs
    verbose=False,  # output status messages
    metric=JETMetric.SHAPE_BASED_DISTANCE,  # distance metric for time series distances; options: SHAPE_BASED_DISTANCE, MSM, DTW, or custom
    c=700,  # cost parameter for MSM distance metric
)
# returns cluster label for each time series
labels = jet.fit_predict(list_of_time_series)
# plot the dendrogram
dendrogram(jet._ward_clustering._linkage_matrix)
plt.show()
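To give an intuition for what one of the built-in measures computes, here is a minimal textbook sketch of Dynamic Time Warping for two univariate series whose lengths may differ, as in the example above. It is not JET's own implementation, which may differ (for example in windowing or in the final normalization). The function name dtw_distance is hypothetical and used only for illustration.

```python
import numpy as np


def dtw_distance(x: np.ndarray, y: np.ndarray) -> float:
    """Textbook O(len(x) * len(y)) DTW with squared point-wise costs (illustrative only)."""
    n, m = len(x), len(y)
    # cost[i, j] holds the minimal accumulated cost of aligning x[:i] with y[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            # extend the cheapest of the three admissible predecessor alignments
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # taking the square root at the end is one common convention, assumed here
    return float(np.sqrt(cost[n, m]))


a = np.array([1.0, 2.0, 3.0])
b = np.array([1.0, 2.0, 2.0, 3.0])  # same shape as a, with one point repeated
print(dtw_distance(a, b))
```

Because DTW may align one point with several points of the other series, the repeated value in b adds no cost, which is why such a measure is a natural fit for series of unequal length.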
You can define your own distance measure function as shown below. (This also lets you cluster multivariate time series if you have a suitable measure!)
import numpy as np
from jet import JET, JETMetric
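def custom_distance_measure(x: np.ndarray, y: np.ndarray) -> float:
    # compare only the overlapping prefix so that series of different lengths work
    min_len = min(len(x), len(y))
    # Euclidean distance: reduce the squared differences to a single float
    distance = np.sqrt(np.sum(np.power(x[:min_len] - y[:min_len], 2)))
    return float(distance)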
jet = JET(
    n_clusters=10,
    metric=JETMetric(custom_distance_measure),
)
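For the multivariate case mentioned above, a custom measure could look like the following sketch. It assumes each time series is a 2D array of shape (length, n_channels) and that all series share the same number of channels. Both the function name multivariate_euclidean and this array layout are assumptions made for illustration, not something the library prescribes.

```python
import numpy as np


def multivariate_euclidean(x: np.ndarray, y: np.ndarray) -> float:
    """Euclidean distance over the overlapping time steps of two multivariate series.

    Assumes x and y have shape (length, n_channels) with equal n_channels.
    """
    # truncate both series to the shorter length along the time axis
    min_len = min(len(x), len(y))
    diff = x[:min_len] - y[:min_len]
    # sum squared differences over time steps and channels, then take the root
    return float(np.sqrt(np.sum(diff ** 2)))


x = np.zeros((5, 2))  # 5 time steps, 2 channels
y = np.ones((3, 2))   # 3 time steps, 2 channels
print(multivariate_euclidean(x, y))
```

Such a function would be wrapped in the same way as the univariate one, that is, passed as JETMetric(multivariate_euclidean) to the metric parameter. Truncating to the shorter series is a simplification; an elastic measure may be more appropriate when lengths differ substantially.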
Code for the experiments was created with Tidewater and is described in the README.
@article{wenig2024jet,
title={JET: Fast Estimation of Hierarchical Time Series Clustering},
author={Wenig, Phillip and H{\"o}fgen, Mathias and Papenbrock, Thorsten},
journal={Engineering Proceedings},
volume={68},
number={1},
pages={37},
year={2024},
publisher={MDPI}
}