This short code tutorial shows how to compute the MPDist-based pairwise distance matrix. The resulting matrix can be used with any clustering algorithm that accepts a custom (precomputed) distance matrix.
from matrixprofile.algorithms.hierarchical_clustering import pairwise_dist
import numpy as np
Class docstring:
Utility function to compute all pairwise distances between the timeseries
using MPDist.

Note
----
scipy.spatial.distance.pdist cannot be used because they do not allow for
jagged arrays, however their code was used as a reference in creating this
function.
https://github.com/scipy/scipy/blob/master/scipy/spatial/distance.py#L2039

Parameters
----------
X : array_like
    An array_like object containing time series to compute distances for.
window_size : int
    The window size to use in computing the MPDist.
threshold : float
    The threshold used to compute MPDist.
n_jobs : int
    Number of CPU cores to use during computation.

Returns
-------
Y : np.ndarray
    Returns a condensed distance matrix Y. For each :math:`i` and :math:`j`
    (where :math:`i<j<m`), where m is the number of original observations.
    The metric ``dist(u=X[i], v=X[j])`` is computed and stored in entry ``ij``.

Call docstring:
Call self as a function.
This function computes a condensed distance matrix for all time series of interest. Below is an example of computing the distance matrix on a handful of randomly generated time series.
# generate 5 random time series
data = []
size = 100
for _ in range(5):
    data.append(np.random.uniform(size=size))
window_size = 8
n_jobs = 4
distance_matrix = pairwise_dist(data, window_size=window_size, n_jobs=n_jobs)
array([1.2334854 , 1.13236744, 1.124416 , 1.17065294, 1.14144607, 1.2107359 , 1.08488366, 1.09598017, 0.98853814, 0.98214056])
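The ten values above are the condensed form for five series: one entry per pair (i, j) with i < j, ordered row by row, so n series give n(n-1)/2 entries. The sketch below illustrates that layout on series of unequal length. It uses a placeholder distance (the absolute difference of means), not MPDist, and `toy_dist` and `condensed_index` are hypothetical helper names that are not part of matrixprofile.

```python
from itertools import combinations

import numpy as np
from scipy.spatial.distance import squareform


def toy_dist(u, v):
    # Placeholder metric for illustration only; this is NOT MPDist.
    return abs(float(np.mean(u)) - float(np.mean(v)))


def condensed_index(i, j, n):
    # Position of pair (i, j), with i < j, in a condensed matrix for n items.
    return n * i - i * (i + 1) // 2 + (j - i - 1)


rng = np.random.default_rng(0)
# Jagged input: the series do not need to share a length.
series = [rng.uniform(size=length) for length in (80, 100, 120, 90, 110)]
n = len(series)

# Fill the condensed matrix pair by pair, in (0,1), (0,2), ..., (3,4) order.
condensed = np.array([toy_dist(series[i], series[j])
                      for i, j in combinations(range(n), 2)])

# The square form holds the same values, mirrored across a zero diagonal.
square = squareform(condensed)
```

With this ordering, the entry for series 1 and 2 sits at `condensed[condensed_index(1, 2, n)]`, which equals `square[1, 2]`.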
Converting to Square Form
Some clustering algorithms require the distance matrix to be square. In this case, we simply convert it.
from scipy.spatial.distance import squareform
square_distance_matrix = squareform(distance_matrix)
array([[0.        , 1.2334854 , 1.13236744, 1.124416  , 1.17065294],
       [1.2334854 , 0.        , 1.14144607, 1.2107359 , 1.08488366],
       [1.13236744, 1.14144607, 0.        , 1.09598017, 0.98853814],
       [1.124416  , 1.2107359 , 1.09598017, 0.        , 0.98214056],
       [1.17065294, 1.08488366, 0.98853814, 0.98214056, 0.        ]])
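As one example of passing this matrix to a clustering algorithm, the sketch below runs SciPy's agglomerative clustering on the condensed values printed earlier. Average linkage and a cut into two clusters are assumptions made for illustration, not choices the tutorial prescribes. Note that `linkage` takes the condensed form directly; the square form is for libraries that expect an n-by-n precomputed matrix.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Condensed MPDist values copied from the output above (5 series, 10 pairs).
distance_matrix = np.array([
    1.2334854, 1.13236744, 1.124416, 1.17065294, 1.14144607,
    1.2107359, 1.08488366, 1.09598017, 0.98853814, 0.98214056,
])

# Average linkage is an illustrative choice; other methods also accept
# a condensed distance matrix.
Z = linkage(distance_matrix, method="average")

# Cut the tree into at most two flat clusters (also an arbitrary choice).
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

Each entry of `labels` is the cluster id assigned to the time series at the same position in `data`.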