Metrics: Soft dynamic-time-warping divergence for FDataGrid #412

Open · wants to merge 27 commits into base: develop
Conversation

@Clej (Contributor) commented Jan 13, 2022

Content
I'm proposing the soft dynamic-time-warping (SDTW) divergence as a metric between FDataGrid objects. The domain dimension (time) must be one; the codomain dimension can be greater than one. I use it for my work, and noted that #363 asked for a classical DTW implementation.

This metric is a smooth version of DTW, which is defined by a min operator and depends on a pre-defined alignment cost (i.e. a discrepancy measure between all pairs (x(t_i), y(s_j))), e.g. the squared Euclidean cost, or roughly speaking a Gram matrix. In SDTW, the min operator is replaced by the soft-min (through a negative log-sum-exp-like function), whose smoothness is controlled by gamma. The SDTW divergence is positive, symmetric, and indiscernible (SDTW(X, Y) = 0 <=> X = Y), but does not satisfy the triangle inequality (this depends on the alignment cost); see "Differentiable Divergences Between Time Series", Blondel et al., AISTATS 2021. It can thus be used like a distance for clustering, k-NN classification, etc.
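
For intuition, here is a minimal sketch of the soft-min operator (plain NumPy/SciPy, illustrative only, not the code of this PR):

import numpy as np
from scipy.special import logsumexp

def softmin(values, gamma):
    # soft-min: -gamma * log(sum(exp(-values / gamma)))
    return -gamma * logsumexp(-np.asarray(values) / gamma)

softmin([1.0, 2.0, 3.0], gamma=0.01)  # ~1.0, close to the hard min
softmin([1.0, 2.0, 3.0], gamma=5.0)   # well below 1.0: smoother, biased downwards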

The main advantages of the SDTW divergence are:

  • it offers the possibility to compare multivariate functional samples with varying numbers of grid points (see the sketch just after this list),
  • it is more robust to temporal shifts than Lp distances, so the comparison is both temporal and geometric.
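
As a quick illustration of the first point, two curves discretized on grids of different sizes can be compared directly (toy data; the sdtwDivergence API is the one shown in the usage section below):

import numpy as np
from skfda import FDataGrid
from skfda.misc.metrics import sdtwDivergence

t1, t2 = np.linspace(0, 1, 100), np.linspace(0, 1, 73)
x = FDataGrid(np.sin(2 * np.pi * t1)[None], t1)  # 100 grid points
y = FDataGrid(np.sin(2 * np.pi * t2)[None], t2)  # 73 grid points

sdtwDivergence(gamma=1.0, cost=None)(x, y)  # no resampling needed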

Implementation
Since SDTW is defined through the soft-min operator, computing the SDTW divergence between two FDataGrid objects X and Y requires solving a minimization problem, which we do with a dynamic-programming recursion in O(mn), where m and n are the respective grid sizes of X and Y. I coded the recursion with Cython in sdtw_fast.pyx, and so added a setup.py in ~/misc/metrics to generate the .c extension.
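
For reference, a minimal pure-Python sketch of that recursion (the Cython code is the actual implementation; following Blondel et al., the divergence is then sdtw(X, Y) - (sdtw(X, X) + sdtw(Y, Y)) / 2):

import numpy as np

def _softmin3(a, b, c, gamma):
    # stabilized soft-min of three values
    m = min(a, b, c)
    s = (np.exp(-(a - m) / gamma)
         + np.exp(-(b - m) / gamma)
         + np.exp(-(c - m) / gamma))
    return m - gamma * np.log(s)

def sdtw_value(C, gamma):
    # O(m * n) recursion; C is the (m, n) alignment-cost matrix
    m, n = C.shape
    R = np.full((m + 1, n + 1), np.inf)
    R[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            R[i, j] = C[i - 1, j - 1] + _softmin3(
                R[i - 1, j], R[i, j - 1], R[i - 1, j - 1], gamma,
            )
    return R[m, n]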

Improvements

  • Tests: what do you suggest as unit test case(s)?
  • Examples within the documentation for sdtwDivergence, with different alignment costs.
  • PairwiseMetric: I tried to implement it but did not manage to understand what exactly to return in _pairwise_metric_optimization_sdtw.

Usage:

import numpy as np
from skfda.datasets import make_sinusoidal_process
from skfda.misc.metrics import sdtwDivergence
from sklearn.gaussian_process.kernels import DotProduct, RBF

n = 50
n_grid = 10**3
x_curves = make_sinusoidal_process(
    n_samples=n,
    n_features=n_grid,
    start=0.0,
    stop=50.0,
    period=30.0,
    phase_mean=3.0,
    amplitude_mean=5.0,
    amplitude_std=1.0,
    random_state=11
)

# default alignment cost is the half squared euclidean distance
sdtw_hsqe = sdtwDivergence(gamma=1.0, cost=None)
sdtw_hsqe(x_curves[0], x_curves[1])

# which is equivalent to passing the half squared Euclidean cost explicitly:
cost_hsqe = DotProduct(sigma_0=0)

def cost_lin(X, Y):
    # 0.5 * ||x||^2 + 0.5 * ||y||^2 - <x, y> = 0.5 * ||x - y||^2
    mat = -1 * cost_hsqe(X, Y)
    mat += 0.5 * cost_hsqe.diag(X)[:, np.newaxis]
    mat += 0.5 * cost_hsqe.diag(Y)
    return mat

sdtw_hsqe = sdtwDivergence(gamma=1.0, cost=cost_lin)
sdtw_hsqe(x_curves[0], x_curves[1])

# the cost can be a non-linear kernel:
rbf = RBF(length_scale=1.0)
def cost_rbf(X, Y):
    # rbf.diag(.) is identically 1, so this is 1 - rbf(X, Y)
    mat = -1 * rbf(X, Y)
    mat += 0.5 * rbf.diag(X)[:, np.newaxis]
    mat += 0.5 * rbf.diag(Y)
    return mat

sdtw_rbf = sdtwDivergence(gamma=1.0, cost=cost_rbf)
sdtw_rbf(x_curves[0], x_curves[1])

Any other suggestions are welcome.

codecov bot commented Feb 8, 2022

Codecov Report

Base: 85.53% // Head: 85.06% // Decreases project coverage by 0.47% ⚠️

Coverage data is based on head (f43c73b) compared to base (e69154b).
Patch coverage: 26.66% of modified lines in pull request are covered.

@@             Coverage Diff             @@
##           develop     #412      +/-   ##
===========================================
- Coverage    85.53%   85.06%   -0.47%     
===========================================
  Files          141      143       +2     
  Lines        11280    11370      +90     
===========================================
+ Hits          9648     9672      +24     
- Misses        1632     1698      +66     
Impacted Files Coverage Δ
skfda/misc/_math.py 84.37% <ø> (ø)
skfda/misc/metrics/_sdtw_distances.py 24.63% <24.63%> (ø)
skfda/misc/metrics/_sdtw_numba.py 33.33% <33.33%> (ø)


@Clej (Contributor, Author) commented Feb 8, 2022

I rewrote the code in Python and used @jit decorators with fastmath=True and cache=True. Computation times are comparable to those of my Cython version.

@Clej (Contributor, Author) commented Feb 8, 2022

What is missing is the test part: any suggestions? Would checking the expected properties of the divergence (symmetry, positivity, indiscernibility) be enough? @vnmabus

@vnmabus (Member) left a comment

I still have to check the algorithm.

For the tests, IMHO the best test is to compare the results with an existing implementation if there is any, for a particular dataset.
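
For concreteness, a minimal sketch of the property checks mentioned above (hypothetical test; comparing against an existing implementation on a fixed dataset, as suggested, would complement it):

import numpy as np
from skfda.datasets import make_sinusoidal_process
from skfda.misc.metrics import sdtwDivergence

def test_sdtw_divergence_properties():
    fd = make_sinusoidal_process(n_samples=2, n_features=100, random_state=0)
    x, y = fd[0], fd[1]
    sdtw = sdtwDivergence(gamma=1.0, cost=None)

    np.testing.assert_allclose(sdtw(x, y), sdtw(y, x))  # symmetry
    assert sdtw(x, y) > 0  # positivity for distinct samples
    np.testing.assert_allclose(sdtw(x, x), 0, atol=1e-10)  # indiscernibility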

from skfda.misc._math import half_sq_euclidean


def _check_shape_postive_cost_mat(cost, expected_shape, shape_only=False):
@vnmabus (Member):

Try to use type hints even in private functions, to catch possible errors.

" ({}, {})".format(n1, n2)
)

_check_shape_postive_cost_mat(
@vnmabus (Member):

I don't think we should always check these things. Maybe we can check in the tests for particular data.

@Clej (Contributor, Author):

I know it looks inappropriate here, but otherwise, when the cost matrix does not satisfy these properties, the output value of the algorithm does not make any sense. That's why I put a boolean check_cost (default: False) as input.

Should I leave it here, or is the user supposed to check it beforehand?

@vnmabus (Member):

I guess that if it is only done when check_cost is True, and it is False by default, it is not a problem.
By the way, the code can be made more compact (and better tested) by generating the costs if cost is a callable and then reusing the same code path as if cost were a sequence of ndarrays.
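
Something like this hypothetical normalization step (names are illustrative, not from the PR):

def _as_cost_matrices(cost, X, Y):
    # normalize the input so the rest of the code only sees ndarrays
    if callable(cost):
        return cost(X, Y), cost(X, X), cost(Y, Y)
    return cost  # assumed to already be a tuple of precomputed ndarrays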

@Clej (Contributor, Author):

OK, so only a callable for the input cost.
Actually, I could raise a warning instead of a ValueError; that leaves more flexibility to the user (but increases the risk of drawing wrong conclusions), while at least making the user aware of that risk. WDYT?

@vnmabus (Member):

I didn't mean to allow only a callable. I meant that after the callable is evaluated, the code path for a callable input is the same as for an ndarray, so both can be merged.
I am not sure about the warning/exception question. It is easier to handle an exception programmatically, and you can always call the function again without checking if that is what you want. Also, if the algorithm requires these preconditions, maybe it shouldn't continue. Is there any case in which you would want to call the algorithm while breaking the preconditions?

@Clej (Contributor, Author):

Sorry, I've been busy too.
I followed your suggestion; the main code path is now:
(1) pre-compute the cost matrices,
(2) check the properties of each cost matrix if check_cost is True,
(3) return the SDTW value.

@vnmabus (Member) left a comment

Sorry for taking so long to review this more thoroughly; I have been busy.

# and symmetry
if cost is not None:
if isinstance(cost, tuple):
if not len(cost) == 3:
@vnmabus (Member):

I think it is better (easier to understand) to have the positive condition in the if instead of the else.

@Clej (Contributor, Author):

ok

@vnmabus (Member) left a comment

I think this PR is taking a great shape!

Remember to also add some tests for this feature, preferably comparing with an existing implementation and/or known results.


# np.sum(X ** 2, axis=1)=inner_product(X, X)
cost_11 = 0.5 * inner_product(
arg1.data_matrix[0],
@vnmabus (Member):

Is there any reason for not passing just the FDataGrid (thus using the information of the grid points to compute the functional inner product, instead of the multivariate one)?

@Clej (Contributor, Author):

Taking into account the grid points could be an extension, yes, but the original implementation (not functional-based) does not consider the grid points (this is noted in the function header), so the function inputs only need the data_matrix of each FDataGrid.

@vnmabus (Member):

OK, checking the original paper again, I now see what this function is supposed to do.
So, it receives as input two functions $x: \mathbb{R} \to \mathbb{R}^p$, discretized at points $t_1, \ldots, t_n$, and $y: \mathbb{R} \to \mathbb{R}^p$, discretized at points $s_1, \ldots, s_m$, and computes the matrix

$$ \begin{pmatrix} d(x(t_1), y(s_1)) & \ldots & d(x(t_1), y(s_m)) \\ \vdots & & \vdots \\ d(x(t_n), y(s_1)) & \ldots & d(x(t_n), y(s_m)) \\ \end{pmatrix} $$

with $d$ a distance function $d: \mathbb{R}^p \times \mathbb{R}^p \to \mathbb{R}$, right?

If that is the case, I would state it explicitly in the documentation, and maybe check that there is only one observation per FDataGrid, just in case. Of course, then it makes no sense to consider the actual values of the grid points, unless you want to treat the output as a function of two variables.
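
In code, that matrix for the half squared Euclidean cost is simply (illustrative snippet; X and Y stand for the (n, p) and (m, p) arrays of discretized values):

import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
X, Y = rng.normal(size=(5, 2)), rng.normal(size=(7, 2))

cost_matrix = 0.5 * cdist(X, Y, metric="sqeuclidean")  # entry (i, j) = d(x(t_i), y(s_j))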

@Clej (Contributor, Author) left a comment

I think it's okay now, except for the tests (another PR?).

Comment on lines +361 to +368
# @pairwise_metric_optimization.register
# def _pairwise_metric_optimization_sdtw(
# metric: sdtwDivergence,
# elem1: FData,
# elem2: FData,
# ) -> NDArrayFloat:

# return metric(elem1, elem2)
@Clej (Contributor, Author):

For this part, I left the code commented out because I don't know how the decorator @pairwise_metric_optimization.register must be used. Is its goal to compute a similarity or Gram matrix in a distributed way?

@vnmabus (Member):

Its goal is to be able to provide a more efficient implementation of the pairwise distance matrix between two data sets. Given the datasets $X = (x_1, \ldots, x_n)$ and $Y = (y_1, \ldots y_m)$, it computes

$$ \begin{pmatrix} d(x_1, y_1) & \ldots & d(x_1, y_m) \\ \vdots & & \vdots \\ d(x_n, y_1) & \ldots & d(x_n, y_m) \\ \end{pmatrix} $$

Note that by default $d$ can compute paired distances, that is, given two FData $X' = (x'_1, \ldots, x'_n)$ and $Y' = (y'_1, \ldots, y'_n)$, both of size $n$, it computes

$$ d(X', Y') = \begin{pmatrix} d(x'_1, y'_1) \\ \vdots \\ d(x'_n, y'_n) \\ \end{pmatrix} $$

By default, what is done (if you do not implement this optimization) is to build the datagrids $X' = (x_1, \ldots, x_1, x_2, \ldots, x_2, \ldots, x_n, \ldots, x_n)$ and $Y' = (y_1, y_2, \ldots, y_m, y_1, y_2, \ldots, y_m, \ldots, y_1, y_2, \ldots, y_m)$ and compute their paired distance. This works, but performs badly and consumes lots of memory. Note: I want to try, alternatively, iterating over the $x_i$ and computing $d(x_i, Y)$. This would certainly reduce memory consumption and would probably be faster due to cache effects, even if it does an iteration in pure Python, but I did not have time to try.
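
Based on that description, a minimal (hypothetical) implementation of the hook, looping over samples instead of materializing the repeated datagrids, might look like this (names taken from the commented-out code above):

import numpy as np

@pairwise_metric_optimization.register
def _pairwise_metric_optimization_sdtw(
    metric: sdtwDivergence,
    elem1: FData,
    elem2: FData,
) -> NDArrayFloat:
    # fill the n x m matrix of pairwise divergences entry by entry
    distances = np.empty((elem1.n_samples, elem2.n_samples))
    for i in range(elem1.n_samples):
        for j in range(elem2.n_samples):
            distances[i, j] = metric(elem1[i], elem2[j])
    return distances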


@vnmabus (Member) left a comment

I would rather have some tests (comparing with known results) as part of this PR, to guarantee that it works.

If you also want to do an example for the documentation, that could be in a separate PR.



) -> None:
"""check compatible, one sample and one-dimensional"""

if (fdata1.dim_domain != fdata2.dim_domain):
@vnmabus (Member):

Maybe you can reuse here the new functions check_fdata_same_dimensions and check_fdata_dimensions in skfda.misc.validation.
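
A sketch of that suggestion (the helper names come from skfda.misc.validation as mentioned; the exact signatures shown here are an assumption, so check the module before relying on them):

from skfda.misc import validation

# hypothetical usage; verify argument names against skfda.misc.validation
validation.check_fdata_same_dimensions(fdata1, fdata2)
validation.check_fdata_dimensions(fdata1, dim_domain=1)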

)


class sdtwDivergence():
@vnmabus (Member):

Class names start with uppercase. Thus it would be SDTWDivergence.



@njit(cache=True)
def _sdtw(
@vnmabus (Member):

Do you know if this is the same (or a similar) dynamic programming algorithm used for Fisher-Rao registration? If it is, we could also use it there and remove the dependency on fdasrsf.
