Computes the Matrix Profile and Profile Index for Univariate Time Series.
mpx( data, window_size, query = NULL, exclusion_zone = 0.5, idxs = TRUE, distance = c("euclidean", "pearson"), n_workers = 1, progress = TRUE )
data | Required. Any 1-dimensional series of numbers. |
---|---|
window_size | Required. An integer defining the rolling window size. |
query | Optional. Another 1-dimensional series of numbers for an AB-join similarity. Default is `NULL`. |
exclusion_zone | A numeric. Defines the size of the area around the rolling window that will be ignored to avoid trivial matches. Default is `0.5`. |
idxs | A logical. Specifies whether the computation will return the Profile Index. Defaults to `TRUE`. |
distance | A string. Currently accepts `"euclidean"` and `"pearson"`. Defaults to `"euclidean"`. |
n_workers | An integer. The number of threads used for the computation. Defaults to `1`. |
progress | A logical. If `TRUE` (the default), progress is reported during the computation. |
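The two `distance` options are closely related when windows are z-normalised, which is the usual convention for matrix profiles (this page does not state it explicitly, so treat it as an assumption). The base R sketch below checks the identity d = sqrt(2 * m * (1 - r)) on two random windows; `znorm()` is a hypothetical helper written for this illustration, not a function of the package.

```r
# Hypothetical helper (not part of the package): z-normalise a window
# using the population standard deviation.
znorm <- function(x) {
  (x - mean(x)) / sqrt(mean((x - mean(x))^2))
}

set.seed(1)
m <- 16            # window size
a <- rnorm(m)      # first window
b <- rnorm(m)      # second window

r <- cor(a, b)                            # Pearson correlation
d <- sqrt(sum((znorm(a) - znorm(b))^2))   # z-normalised Euclidean distance

# Under the z-normalisation assumption, both carry the same information:
# d = sqrt(2 * m * (1 - r))
all.equal(d, sqrt(2 * m * (1 - r)))
```

A correlation of 1 therefore corresponds to a distance of 0, and the largest possible distance, 2 * sqrt(m), corresponds to a correlation of -1.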
Returns a list with the Matrix Profile, Profile Index (if `idxs` is `TRUE`), and some information about the settings used to build it.
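A minimal usage sketch, assuming the package is installed under the name `matrixprofiler` (inferred from the `_matrixprofiler_mpx_rcpp` symbol mentioned on this page). The toy series is made up for illustration, and because the element names of the returned list are not given here, the result is only inspected with `str()`.

```r
set.seed(42)
# Toy series: a sine wave plus noise (made up for illustration)
data <- sin(seq(0, 20 * pi, length.out = 1000)) + rnorm(1000, sd = 0.1)

if (requireNamespace("matrixprofiler", quietly = TRUE)) {
  # Self-join with the documented defaults, progress output switched off
  res <- matrixprofiler::mpx(data, window_size = 50, progress = FALSE)
  # Show the top-level structure of the returned list
  str(res, max.level = 1)
} else {
  message("matrixprofiler is not installed; skipping the call")
}
```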
This algorithm was developed separately from the main Matrix Profile branch, which relies on the Fast Fourier Transform (FFT) in at least one part of the process. This algorithm does not use FFT and is several times faster. It also relies on Ogita's work to compute the mean and standard deviation (part of the process) with better precision.

About `progress`: using it as feedback for long computations is strongly recommended. It does add some (negligible) overhead, but knowing that your computer is still working is worth far more than the seconds you may lose in the final benchmark.

About `n_workers`: on Windows, this package uses TBB for multithreading, while on Linux and macOS it uses TinyThread++. This may or may not cause issues in the future, so be aware of possible slower processing due to different mutex implementations, or even unexpected crashes. The Windows version is usually more reliable.

The `data` and `query` parameters are internally converted to a single vector using `as.numeric()`, so bear in mind that a multidimensional matrix may not work as you expect, but most 1-dimensional data types will work normally.

If `query` is provided, it receives the same preprocessing as `data`; in addition, `exclusion_zone` is ignored and set to `0`. `data` and `query` do not need to have the same size, and they can be interchanged if both are provided. The difference will be in the returned object. AB-Join returns the Matrix Profile 'A' and 'B', i.e., the distance between a rolling window from query to data and from data to query.
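To make the join semantics concrete, here is a brute-force reference sketch in base R. It is not the package's algorithm (which is far faster), and `znorm()` and `nearest()` are hypothetical helpers written only for this illustration. It assumes z-normalised Euclidean distance and, for the self-join, that the exclusion zone is `round(window_size * exclusion_zone)` positions on each side; neither assumption is spelled out on this page.

```r
# Hypothetical helper: z-normalise a window (population standard deviation).
znorm <- function(x) {
  s <- sqrt(mean((x - mean(x))^2))
  if (s == 0) rep(0, length(x)) else (x - mean(x)) / s
}

# Hypothetical helper: for every window of `a`, find the distance to and the
# position of its nearest window in `b`. When `excl` is given, candidate
# windows within `excl` positions of the current one are skipped (self-join).
nearest <- function(a, b, w, excl = NULL) {
  a <- as.numeric(a)
  b <- as.numeric(b)
  na <- length(a) - w + 1
  nb <- length(b) - w + 1
  mp  <- rep(Inf, na)          # nearest-neighbour distances
  idx <- rep(NA_integer_, na)  # nearest-neighbour positions
  for (i in seq_len(na)) {
    wa <- znorm(a[i:(i + w - 1)])
    for (j in seq_len(nb)) {
      if (!is.null(excl) && abs(i - j) <= excl) next  # trivial match
      d <- sqrt(sum((wa - znorm(b[j:(j + w - 1)]))^2))
      if (d < mp[i]) {
        mp[i]  <- d
        idx[i] <- j
      }
    }
  }
  list(mp = mp, idx = idx)
}

set.seed(7)
data  <- cumsum(rnorm(120))   # toy series, made up for illustration
query <- cumsum(rnorm(60))    # a shorter second series
w <- 10

# Self-join: trivial matches around each window are excluded.
self <- nearest(data, data, w, excl = round(w * 0.5))

# AB-join: no exclusion zone, and one profile per direction.
ab <- nearest(data, query, w)   # each window of data against query
ba <- nearest(query, data, w)   # each window of query against data

length(self$mp)  # 111 windows in data
length(ba$mp)    # 51 windows in query

# as.numeric() flattens a matrix column by column, which is why a
# multidimensional input may not behave as expected.
flat <- as.numeric(matrix(1:6, nrow = 2, byrow = TRUE))
flat             # 1 4 2 5 3 6
```

The two AB-join profiles have different lengths because each one has an entry per window of the series it starts from, which is why swapping `data` and `query` changes the returned object.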