Clustering: Computing the Pairwise Distance Matrix
Learn how to compute an MPDist-based pairwise distance matrix for clustering.
This quick code tutorial demonstrates how to compute the MPDist-based pairwise distance matrix. The resulting matrix can be used with any clustering algorithm that accepts a custom (precomputed) distance matrix.
In [1]:
from matrixprofile.algorithms.hierarchical_clustering import pairwise_dist
import numpy as np
In [2]:
%pdoc pairwise_dist
This function computes a condensed distance matrix for all time series of interest, that is, a flat array holding one distance for each unordered pair of series. Below is an example of computing the distance matrix on a handful of randomly generated time series.
In [3]:
# generate 5 random time series, each with 100 samples
data = []
size = 100
for _ in range(5):
    data.append(np.random.uniform(size=size))
In [4]:
window_size = 8
n_jobs = 4
distance_matrix = pairwise_dist(data, window_size=window_size, n_jobs=n_jobs)
In [5]:
distance_matrix
Out[5]:
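The condensed form is a flat array with one entry per unordered pair of series, so 5 series give 5 * 4 / 2 = 10 distances. The sketch below uses made-up values (not MPDist output) and a hypothetical helper, condensed_index, to show how entries map to pairs. The ordering shown is the one SciPy's squareform uses.

```python
import numpy as np
from scipy.spatial.distance import squareform

# Made-up condensed distances for 4 items: n * (n - 1) / 2 = 6 values,
# ordered by pair: (0,1), (0,2), (0,3), (1,2), (1,3), (2,3).
condensed = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Expand to a symmetric 4 x 4 matrix with zeros on the diagonal.
square = squareform(condensed)


def condensed_index(i, j, n):
    """Hypothetical helper: position of pair (i, j), i < j, among n items."""
    return n * i - i * (i + 1) // 2 + (j - i - 1)


n = square.shape[0]
# The distance between items 1 and 3 is the same in both layouts.
pair_distance = condensed[condensed_index(1, 3, n)]
```

Looking up a pair through the index avoids building the full square matrix, which matters when the number of series is large.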
Converting to Square Form
Some clustering algorithms require the distance matrix to be square. In that case, convert the condensed matrix with SciPy's squareform.
In [6]:
from scipy.spatial.distance import squareform
In [7]:
square_distance_matrix = squareform(distance_matrix)
In [8]:
square_distance_matrix
Out[8]:
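As one possible next step, a condensed distance matrix can be passed directly to SciPy's hierarchical clustering routines. The sketch below is only an illustration: it uses made-up distances for four series rather than real MPDist output, and the choice of average linkage and of two clusters is an assumption, not a recommendation from this tutorial.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Made-up condensed distances for 4 series, ordered by pair:
# (0,1), (0,2), (0,3), (1,2), (1,3), (2,3).
# Series 0 and 1 are close to each other, as are series 2 and 3.
condensed = np.array([0.1, 0.9, 0.8, 0.85, 0.95, 0.2])

# linkage accepts a condensed distance matrix directly.
Z = linkage(condensed, method="average")

# Cut the tree so that at most two flat clusters remain.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

With real data, the array returned by pairwise_dist would take the place of the made-up values; the square form is only needed for algorithms that expect a full matrix.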