
SVM Outlier detector #814

Merged · 34 commits · Jul 7, 2023
Conversation

@mauicv (Collaborator) commented Jun 12, 2023

What is this?

This PR adds the SVM outlier detector. Example notebook here.

Runtime comparisons:

Note that because I've allowed the user to set the optimization via a kwarg rather than inferring it from their choice of device, the user can run the sgd method on the GPU. In reality, this just means that the Nystroem approximation will be performed on the GPU but the SVM will be fit on the CPU.
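For illustration, a minimal usage sketch of the combinations described above (hypothetical values; the optimization and device kwargs are per this PR, import path assumed):

```python
from alibi_detect.od import SVM

# sgd: the SVM itself is fit on CPU via sklearn's SGDOneClassSVM;
# only the Nystroem approximation runs on the GPU.
sgd_detector = SVM(nu=0.1, optimization='sgd', device='gpu')

# gd: full-batch gradient descent in torch, running entirely on the GPU.
gd_detector = SVM(nu=0.1, optimization='gd', device='gpu')
```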

See this notebook. Here x_ref is a dataset sampled from a normal distribution in 512 dimensions and dsize is the size of the sampled dataset. There are four combinations of device = 'cpu'/'gpu' and optimization = 'gd'/'sgd'. We want to see gpu-gd as the lowest curve.

[Figure: device-optimization analysis, showing runtime for each device/optimization combination as dataset size grows]

TODO:

  • Remove sklearn backend
  • Implement SGD and GD variants
  • Add error checks for device-optimization combinations
  • Test device implementation in notebook
  • Verify correct tensors in backend implementation
  • Fix tests
  • Rewrite docstrings
  • Write tests for device/optimizer kwarg checks
  • Requested PR changes
  • Run GPU and CPU notebook and double check results
  • Final check

Notes:

  1. SVM is a kernel detector by default and uses GaussianRBF, which isn't torch-scriptable yet. Hence this PR will not implement torch-scriptable functionality for this detector.
  2. The GaussianRBF kernel is incompatible with the sklearn backend, which isn't really an issue because sklearn users can specify the kernel as a string, and this behaviour is exposed through the wrapper API. It will probably require a note in the docs though.
  3. Originally this was implemented with the option of two separate backends: the PyTorch backend performs well on GPUs and the sklearn backend performs well on CPUs. The benefit of this approach was that the sklearn backend does not depend on PyTorch; however, it came at the cost of the user having to use sklearn kernels, which are configured differently from our kernels. This means the initialization of each detector is quite different. Having a different API for each backend is likely to confuse users and would require lots of documentation to help them navigate it. The value-add of these detectors is primarily their GPU support; if users still want to use the CPU, the torch dependency shouldn't be a massive blocker. Instead, this was changed to have GdSVMDetector and SgdSVMDetector, both of which are torch backends but not selectable by the user through the backend kwarg. Instead, the user specifies an optimization kwarg. This is similar to the PCA linear and kernel backends (here and here), so it's not a massive break from the approach taken elsewhere.

@mauicv mauicv added the Type: New method New method proposals label Jun 12, 2023
@mauicv mauicv self-assigned this Jun 12, 2023
@mauicv mauicv changed the title Add inital implemnetation of svm SVM Outlier detector Jun 12, 2023
codecov bot commented Jun 12, 2023

Codecov Report

Merging #814 (7a64454) into master (5e69f4b) will increase coverage by 0.20%.
The diff coverage is 98.78%.


@@            Coverage Diff             @@
##           master     #814      +/-   ##
==========================================
+ Coverage   81.63%   81.84%   +0.20%     
==========================================
  Files         156      159       +3     
  Lines       10152    10317     +165     
==========================================
+ Hits         8288     8444     +156     
- Misses       1864     1873       +9     
Impacted Files Coverage Δ
alibi_detect/od/_svm.py 97.50% <97.50%> (ø)
alibi_detect/od/pytorch/svm.py 99.16% <99.16%> (ø)
alibi_detect/od/pytorch/__init__.py 100.00% <100.00%> (ø)
alibi_detect/utils/pytorch/losses.py 100.00% <100.00%> (ø)

... and 4 files with indirect coverage changes

@mauicv mauicv requested review from jklaise and ojcobb June 15, 2023 13:51
@jklaise (Member) left a comment

Nice one! A few comments, mostly to do with tidiness.



def hinge_loss(preds: torch.Tensor) -> torch.Tensor:
"L(pred) = max(0, 1-pred) summed over multiple preds"
Member:

nit: "summed" -> "averaged" ?

Contributor:

👍🏻 (checking Janis' requested changes whilst he is away)
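For context, a minimal runnable version of the loss under discussion might look like the following (a sketch; the actual implementation lives in alibi_detect/utils/pytorch/losses.py):

```python
import torch
import torch.nn.functional as F

def hinge_loss(preds: torch.Tensor) -> torch.Tensor:
    """L(pred) = max(0, 1 - pred), averaged over multiple preds."""
    return F.relu(1.0 - preds).mean()
```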

Comment on lines 28 to 32
backend: Literal['pytorch', 'sklearn'] = 'sklearn',
device: Optional[Union[Literal['cuda', 'gpu', 'cpu'], 'torch.device']] = None,
kernel: Union['torch.nn.Module', Literal['linear', 'poly', 'rbf', 'sigmoid']] = 'rbf',
sigma: Optional[float] = None,
kernel_params: Optional[Dict] = None,
Member:

Argument order is different from other detectors that take a kernel (e.g. PCA), please double check.

Member:

This should be one of the tasks before the public release - consistency of public APIs wrt args (order, naming, types etc.)

@mauicv (Collaborator, Author) Jun 27, 2023:

I've added an issue here; I'll leave it for now as it makes sense to do this all at once at the end.

Contributor:

👍🏻

directly through its primal formulation. The Nystroem approximation can optionally be used to speed up
training and inference by approximating the kernel's RKHS.

We provide two backends for the one class svm, one based on PyTorch and one based on scikit-learn. The
Member:

one class -> one-class

Contributor:

👍🏻


Parameters
----------
kernel
Member:

Order of args different again and some missing, see also above note.

Contributor:

👍🏻

Comment on lines 74 to 75
Additional parameters (keyword arguments) for kernel function passed as a dictionary. Only used for the
'``sklearn``' backend and the kernel is a custom function.
Member:

Confusing grammar and wording - not clear what's possible. (and ->or if?)

Collaborator Author:

Note: No longer relevant due to other changes...

Contributor:

👍🏻

Comment on lines 118 to 121
nu=fit_kwargs.get('nu', 0.5),
tol=fit_kwargs.get('tol', 1e-3),
max_iter=fit_kwargs.get('max_iter', 1000),
verbose=fit_kwargs.get('verbose', 0),
Member:

Slightly worried that defaults are defined in two places - here and in fit. Is there a better way of doing this?

Collaborator Author:

Good point, I've made an issue for this as I use the same pattern in the gmm detector.

Contributor:

👍🏻
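One way to avoid defining the defaults in two places (a sketch of the general pattern, not necessarily the approach taken in the linked issue):

```python
# Hypothetical: define the defaults once at module level, then merge
# user-supplied kwargs over them wherever they are needed.
FIT_DEFAULTS = {'nu': 0.5, 'tol': 1e-3, 'max_iter': 1000, 'verbose': 0}

def resolve_fit_kwargs(fit_kwargs: dict) -> dict:
    """Overlay user kwargs on the single source of defaults."""
    return {**FIT_DEFAULTS, **fit_kwargs}
```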

"""
self.check_fitted()
x = self.nystroem.transform(x)
return - self.gmm.score_samples(x)
Member:

nit: remove space after the minus sign (assuming it's not a bug to have a minus sign!)

Contributor:

👍🏻

kernel_params=self.kernel_params,
)
x_ref = self.nystroem.fit(x_ref).transform(x_ref)
self.gmm = SGDOneClassSVM(
Member:

Is the name gmm misleading?

Collaborator Author:

🤦 Good catch, removed!

Contributor:

👍🏻

if isinstance(self.kernel, str):
if self.kernel not in ['rbf']:
raise ValueError(
f'Currently only the rbf Kernel is supported for the SVM torch backend, got {self.kernel}.'
Member:

This is a bit misleading as the user can use a custom kernel.

Contributor:

👍🏻

self.n_components
)

X_nys = self.nystroem.fit(x_ref).transform(x_ref)
Member:

Noting that in sklearn backend the transformed data was called x_ref, although I think X_nys is clearer. Would suggest making the change in sklearn backend for consistency (and suggest sticking with lower-case convention, so x_nys).

Contributor:

👍🏻


The Support vector machine outlier detector fits a one-class SVM to the reference data.

Rather than the typical approach of optimizing the exact kernel OCSVM objective through a dual formulation,
Contributor:

I wonder if this might be a bit too much detail for the highest level docstring. However I do intend to iterate slightly on these docstrings when doing the documentation so happy to leave until then.

Collaborator Author:

Ok, I'll leave it for now.


Rather than the typical approach of optimizing the exact kernel OCSVM objective through a dual formulation,
here we instead map the data into the kernel's RKHS and then solve the linear optimization problem
directly through its primal formulation. The Nystroem approximation can optionally be used to speed up
Contributor:

The Nystroem approximation isn't optional
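For reference, the standard ν-OCSVM primal objective that is solved on the Nystroem features is (shown for context, in the usual notation; not quoted from the PR):

```math
\min_{w,\,\rho}\ \frac{1}{2}\lVert w\rVert^2 \;-\; \rho \;+\; \frac{1}{\nu n}\sum_{i=1}^{n}\max\bigl(0,\ \rho - \langle w, \phi(x_i)\rangle\bigr)
```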

Parameters
----------
kernel
Used to define similarity between data points. If using the pytorch backend, this can be either a string
Contributor:

The second sentence here doesn't make sense

n_components: Optional[int] = None,
backend: Literal['pytorch', 'sklearn'] = 'sklearn',
device: Optional[Union[Literal['cuda', 'gpu', 'cpu'], 'torch.device']] = None,
kernel: Union['torch.nn.Module', Literal['linear', 'poly', 'rbf', 'sigmoid']] = 'rbf',
Contributor:

The kernel definition here has become a little unwieldy and inconsistent with how we do things elsewhere. Also, although it is technically possible, users should not be specifying linear or polynomial kernels for this detector as the method relies on the kernel inducing an infinite dimensional RKHS.

I wonder if we instead encourage users to pass kernels in exactly the same way as they do for other detectors (so here we'd just have the kernel kwarg and not the sigma and kernel_params kwargs). We then always do our own Nystroem approximation to produce kernelised features and only rely on sklearn for the OCSVM part?

Contributor:

I think consistency across detector definitions is particularly important given we envisage the primary use-case of the detector is as a component within a large ensemble.

Collaborator Author:

See PR notes for update on solution taken
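Under the solution described in the PR notes, passing a kernel then looks the same as for other detectors. A minimal sketch (import paths assumed):

```python
from alibi_detect.od import SVM
from alibi_detect.utils.pytorch import GaussianRBF

# The kernel is passed as a torch module, as for other kernel detectors.
detector = SVM(nu=0.1, kernel=GaussianRBF(), optimization='gd')
```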

if self.kernel == 'rbf':
if self.sigma is not None:
sigma = torch.tensor(self.sigma, device=self.device)
self.kernel = GaussianRBF(sigma=sigma)
Contributor:

Here if self.sigma is None then sigma is not defined and sigma=sigma causes an error
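A minimal sketch of a fix, assuming GaussianRBF accepts sigma=None and infers it from the data in that case:

```python
if self.kernel == 'rbf':
    # Only build a tensor when sigma was supplied; otherwise pass None through.
    sigma = (
        torch.tensor(self.sigma, device=self.device)
        if self.sigma is not None
        else None
    )
    self.kernel = GaussianRBF(sigma=sigma)
```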

@mauicv mauicv requested review from ojcobb and ascillitoe June 29, 2023 11:19
backends = {
'pytorch': {
'sgd': SdgSVMTorch,
'gd': DgSVMTorch
Contributor:

Here the class names have the d and g the wrong way round. Also could we perhaps use "batch gradient descent" (so BgdSVMTorch) for the Pytorch variant? I think it makes the distinction clearer as just gradient descent is general enough that it could refer to both variants.

Contributor:

Agree with BGD.

def __init__(
self,
nu: float,
kernel: 'torch.nn.Module',
Contributor:

Maybe we could pass the gaussian RBF as a default here?

Parameters
----------
nu
The proportion of the training data that should be considered outliers. Note that this does not necessarily
Contributor:

Let's make clear this should be thought of as a regularisation parameter that affects how smooth the decision boundary will be.

Collaborator Author:

Ok, I've basically just added your comment on at the end:

The proportion of the training data that should be considered outliers. Note that this does not necessarily correspond to the false positive rate on test data, which is still defined when calling the infer_threshold method. nu should be thought of as a regularization parameter that affects how smooth the svm decision boundary is.

self.check_fitted()
x_nys = self.nystroem.transform(x)
x_nys = x_nys.cpu().numpy()
return self._to_tensor(-self.svm.score_samples(x_nys))
@ojcobb (Contributor) Jun 30, 2023:

Currently, due to differences between how we and sklearn set up our optimisations, scores returned via the sgd vs gd variants differ by a linear transformation. Let's fix the intercept at 1 and scale the coefficients accordingly so that we have consistency in this regard. (Note that our intercept variable is related to sklearn's offset variable as intercept = 1 - offset.)
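A sketch of what such a rescaling might look like, using sklearn's fitted coef_ and offset_ attributes and the intercept = 1 - offset relation above (hypothetical, not the merged code):

```python
# Dividing the decision function by the intercept fixes the intercept at 1,
# so sgd and gd scores agree up to this common parametrisation.
intercept = 1.0 - self.svm.offset_
coeffs = self.svm.coef_ / intercept
scores = -(x_nys @ coeffs.T - 1.0)  # decision values with intercept fixed at 1
```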

}


class SVM(BaseDetector, ThresholdMixin, FitMixin):
Contributor:

Can we pass None as the default for n_components?

"""
if (isinstance(device, str) and device in ('gpu', 'cuda')) or \
(isinstance(device, torch.device) and device.type == 'cuda'):
warnings.warn(('The `sgd` optimization option is best suited for CPU. If '
Contributor:

Can we modify this to convey that sgd will run on the CPU regardless, and that in this case only the Nystroem approximation will be done on the GPU?
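For example, the warning text could be reworded along these lines (a suggestion, not the merged wording):

```python
warnings.warn(
    "The 'sgd' optimization fits the SVM on the CPU regardless of the device "
    "set; only the Nystroem approximation will be performed on the GPU."
)
```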

@ojcobb (Contributor) commented Jun 30, 2023

Much cleaner detector definition and abstractions now - definitely a good shout to have the user specify the optimisation method themselves. Nice one!

kernel
Kernel function to use for outlier detection. Should be an instance of a subclass of `torch.nn.Module`.
n_components
Number of components in the Nystroem approximation By default uses all of them.
@ascillitoe (Contributor) Jul 3, 2023:

nit: missing `.` after "approximation".

from alibi_detect.base import (BaseDetector, FitMixin, ThresholdMixin,
outlier_prediction_dict)
from alibi_detect.exceptions import _catch_error as catch_error
from alibi_detect.od.pytorch import DgSVMTorch, SdgSVMTorch
Contributor:

Nitpick: a little atypical to have the abbreviation SVM in all caps, but SGD as Sgd. However, on the other hand, SGDSVM is a little hard to read... Maybe add a _, or just leave...

Collaborator Author:

Yeah, for some reason SGD_SVMTorch and SgdSvmTorch seems worse than SgdSVMTorch to me. I'm ambivalent though, @ojcobb, any preference?

The number of iterations over which the loss must decrease by `tol` in order for optimization to continue.
This is only used for the ``'gd'`` optimization..
verbose
Verbosity level during training. ``0`` is silent, ``1`` a progress bar or sklearn training output for the
Contributor:

verbose description is referring to sklearn implementation only. Unclear if verbose is still relevant when the torch backend is used. Would be good to make the description more generic, or be more descriptive as to when the verbose kwarg is relevant.

as a tuple of the form `(min_eta, max_eta)` and only used for the ``'gd'`` optimization.
n_step_sizes
The number of step sizes in the defined range to be tested for loss reduction. This many points are spaced
equidistantly along the range in log space. This is only used for the ``'gd'`` optimization.
@ascillitoe (Contributor) Jul 3, 2023:

maybe spaced equidistantly -> uniformly distributed?

Collaborator Author:

I've changed it to spaced evenly!
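For illustration, such a grid of step sizes can be built with np.logspace (a sketch; the count is hypothetical):

```python
import numpy as np

min_eta, max_eta = 1e-6, 1.0  # step_size_range as shown in this revision
n_step_sizes = 16             # hypothetical count
# n_step_sizes points spaced evenly in log space across the range.
etas = np.logspace(np.log10(min_eta), np.log10(max_eta), n_step_sizes)
```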

max_iter: int = 1000,
verbose: int = 0,
) -> Dict:
"""Fit the SVM detector.
Contributor:

If we are being more descriptive about what is being fit in SgdSVMTorch.fit method, is it worth being slightly more descriptive here? i.e. Fit the pytorch SVM detector or similar?


Returns
-------
Dictionary with fit results. The dictionary contains the following keys
@ascillitoe (Contributor) Jul 3, 2023:

keys -> keys: ?

@ascillitoe (Contributor) left a comment

I've attempted to review the changes requested by @jklaise, and it looks like they've all been addressed. Also agree the new hidden sklearn backend approach is far cleaner.

Except for a few nitpicks (and @ojcobb's comments), everything else looks good! Although only moderate confidence...

x_ref: np.ndarray,
tol: float = 1e-6,
max_iter: int = 1000,
step_size_range: Tuple[float, float] = (1e-6, 1.0),
Contributor:

Can we change default step_size_range to (1e-8, 1.0)?

Collaborator Author:

Sorry, missed this! Now fixed!

@ojcobb (Contributor) left a comment

LGTM!

@mauicv mauicv merged commit e3771c3 into SeldonIO:master Jul 7, 2023
11 checks passed