Feature LOF outlier detector #746

mauicv · 2023-02-21T14:41:51Z

What is this

Implementation of the LOF (local outlier factor) outlier detector. See this notebook for an example. Branches off gmm-od.

TODO:

ojcobb · 2023-05-26T12:47:48Z

alibi_detect/od/_lof.py

+        values then the score method uses the distance/kernel similarity to each of the specified `k` neighbors.
+        In the latter case, an `aggregator` must be specified to aggregate the scores.
+
+        Note that, in the multiple k case, a normalizer can be provided. If a normalizer is passed then it is fit in


I think we should be a little clearer as to what the normalizer and aggregator refer to, as it's not clear here or from the kwarg descriptions. I realise this applies to KNN too.

Agreed, I've opened an issue here. I'll include it in a final clean-up PR, I think.
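For readers of this thread, a minimal NumPy sketch of what "normalizer" and "aggregator" could mean in the multiple-`k` case. It assumes the detector produces one score column per value of `k`, that the normalizer is fit on reference scores, and that the aggregator collapses the columns into one score per instance. The helper names `fit_shift_and_scale`, `normalize` and `average_aggregate` are hypothetical and are not the alibi-detect API.

```python
import numpy as np


def fit_shift_and_scale(ref_scores):
    """Hypothetical helper: per-k mean and std of reference scores, shape (n_ref, n_ks)."""
    return ref_scores.mean(axis=0), ref_scores.std(axis=0)


def normalize(scores, mean, std, eps=1e-12):
    """Put each k-column on a comparable scale before aggregation."""
    return (scores - mean) / (std + eps)


def average_aggregate(scores):
    """Collapse the per-k columns into a single score per instance."""
    return scores.mean(axis=1)


# Two values of k whose raw scores live on very different scales (made-up numbers).
ref_scores = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
mean, std = fit_shift_and_scale(ref_scores)

test_scores = np.array([[2.0, 20.0], [5.0, 50.0]])
agg = average_aggregate(normalize(test_scores, mean, std))
```

Without the normalization step, the column with the larger raw scale would dominate the average, which is the usual motivation for fitting a normalizer before aggregating.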

ojcobb · 2023-05-26T12:49:34Z

alibi_detect/od/_lof.py

+    def score(self, x: np.ndarray) -> np.ndarray:
+        """Score `x` instances using the detector.
+
+        Computes the local outlier factor for each instance in `x`. If `k` is an array of values then the score for


Maybe worth just noting here that the outlier factor is the density of each instance in x relative to those of its neighbours in x_ref.
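To make the point above concrete, here is a rough NumPy sketch of a textbook local outlier factor computed for `x` against `x_ref`. It is an illustration under stated assumptions, not the code in this PR: it uses Euclidean distance and a single `k`, assumes `x_ref` has no duplicate points, and the names `pairwise_dist` and `lof_scores` are hypothetical. In this formulation the score is the mean local reachability density of an instance's neighbours in `x_ref` divided by the instance's own density, so a higher score means the instance sits in a sparser region than its neighbours.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distance matrix of shape (len(a), len(b))."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def lof_scores(x, x_ref, k):
    """Textbook-style LOF of each row of x relative to x_ref (sketch only)."""
    # Reference-to-reference distances, with self-matches excluded.
    d_ref = pairwise_dist(x_ref, x_ref)
    np.fill_diagonal(d_ref, np.inf)
    nn_ref = np.argsort(d_ref, axis=1)[:, :k]
    d_nn_ref = np.take_along_axis(d_ref, nn_ref, axis=1)
    k_dist = d_nn_ref[:, -1]  # distance to the k-th neighbour

    # Local reachability density of each reference point.
    reach_ref = np.maximum(d_nn_ref, k_dist[nn_ref])
    lrd_ref = 1.0 / reach_ref.mean(axis=1)

    # Same quantities for the query points; neighbours are taken from x_ref.
    d_x = pairwise_dist(x, x_ref)
    nn_x = np.argsort(d_x, axis=1)[:, :k]
    d_nn_x = np.take_along_axis(d_x, nn_x, axis=1)
    reach_x = np.maximum(d_nn_x, k_dist[nn_x])
    lrd_x = 1.0 / reach_x.mean(axis=1)

    # Ratio of the neighbours' density to the instance's own density.
    return lrd_ref[nn_x].mean(axis=1) / lrd_x


# Reference data: a 5x5 unit grid. One query inside the grid, one far away.
grid = np.arange(5.0)
x_ref = np.array([[i, j] for i in grid for j in grid])
x = np.array([[2.5, 2.5], [10.0, 10.0]])
scores = lof_scores(x, x_ref, k=3)
```

On this toy data the query inside the grid scores close to 1 and the distant query scores well above 1.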

ojcobb · 2023-05-26T13:57:39Z

alibi_detect/od/pytorch/lof.py

+        return mask
+
+    def _compute_K(self, x, y):
+        """Compute the distance/similarity matrix matrix between `x` and `y`."""


Could we remove "/similarity" here? A similarity matrix would have entries that increase with similarity, whereas this is the opposite.
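A small NumPy sketch of the distinction being drawn here, assuming Euclidean distance and a Gaussian RBF kernel as the similarity. The function names are hypothetical and the backend in this PR uses torch rather than NumPy. Distance entries grow as points move apart, while kernel similarity entries shrink.

```python
import numpy as np


def distance_matrix(x, y):
    """Euclidean distances: larger entries mean less similar points."""
    diff = x[:, None, :] - y[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def rbf_similarity(x, y, sigma=1.0):
    """Gaussian RBF kernel: larger entries mean more similar points."""
    d = distance_matrix(x, y)
    return np.exp(-(d ** 2) / (2.0 * sigma ** 2))


x = np.array([[0.0, 0.0]])
y = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])

D = distance_matrix(x, y)  # increases along the row as y moves away from x
S = rbf_similarity(x, y)   # decreases along the row as y moves away from x
```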

ojcobb · 2023-05-26T14:25:30Z

Ran through all the code and couldn't find anything worth commenting on! Just the nitpicks on docstrings above. Nice!

ascillitoe · 2023-05-30T10:18:36Z

couldn't find anything worth commenting on

Challenge accepted 😛

ascillitoe · 2023-05-30T10:50:18Z

alibi_detect/od/_lof.py

+
+        # set metadata
+        self.meta['detector_type'] = 'outlier'
+        self.meta['data_type'] = 'numeric'


Not isolated to this PR, but noting that we seem to be a little inconsistent across the new and old outlier detectors with respect to when data_type is hard-coded and when it is optionally set via a kwarg. For some, it is hard-coded to time-series (which makes sense), for some (e.g. the old Mahalanobis) it is set via kwarg, and for some it is hard-coded to numeric. Maybe worth opening an issue to review this more generally?

Already mentioned in #567 (comment), but highlighting here since we are setting data_type in new detectors too...

ascillitoe · 2023-05-30T11:01:57Z

alibi_detect/od/_lof.py

+
+        Returns
+        -------
+        Outlier scores. The shape of the scores is `(n_instances,)`. The higher the score, the more anomalous the \


Nitpick: In a few places outlying is used e.g.

alibi-detect/alibi_detect/od/_mahalanobis.py

Line 96 in 21ca540

the l2-norm of the projected data. The higher the score, the more outlying the instance.

whereas in the score docstring (and for _pca, _gmm, _knn, mahalanobis) anomalous is used. Worth picking one or the other?

ascillitoe · 2023-05-30T11:05:19Z

alibi_detect/od/tests/test__lof/test__lof_backend.py

+    assert torch.all(lof_torch(x) == torch.tensor([0, 0, 1]))
+
+
+@pytest.mark.skip(reason="Can't convert GaussianRBF to torch script due to torch script type constraints")


What is the intention behind including this test if it is skipped in all cases?

I wrote it then decided not to include this functionality in the first set of outliers, however, it should be implemented at some point which is why I've left it in but skipped it. I can take it out if you prefer?

see #810

ascillitoe

Few minor comments, otherwise LGTM!

mauicv added 30 commits January 3, 2023 18:09

Update default knn ensemble aggregator and normalizer values

09412c4

Add tests for aggregator and normalizer default values

65fbeac

Merge branch 'master' into feature/knn-outlier-detector

633636a

Remove Optional type from aggregator

4cd2eba

Change X -> x throughout

98d2ba5

Change anomaly -> outlier throughout

3d57b6d

Improve fpr description

bc01ddf

Update PValNormalizer docstring

47566b8

Add custom error types

bda15fd

Test pval and shift and scale normalizer output values

bacbfcc

Test aggregator output values

ea436b0

Remove unneeded NotImplemnentedErrors from ABC abstract methods

c3af856

Fix typos

351e5a7

Fix method typo

a190cc3

Fix docstrings for KNNTorch

0daa98f

Set api signatures to accept np.ndarray and not List types

8d2fa3b

Fix mypy error

af8ab43

Move to numpy logic from OutlierDetectorOutput dataclass to base class

89c4815

Refactor init knn logic

c909f4c

Refator str to aggregator and normalizer methods to backend

bc7a771

Align kNN output with other outlier detectors

dfd613d

Refactor backend.pytorch into pytorch module

3fb167b

Fix optional dependency tests

f28cb0f

Add backticks do docstrings

6ef8c62

Update docstrings

76b31d7

reword numpy to torch tensor in transform object docstrings

6fb600c

Update return type hints

cf6bba1

Add Mahalanobis detector

f8ae93d

Add hasattr check in _accumulator method

96f9be0

Merge branch 'feature/knn-outlier-detector' into feature/mahalanobis-od

fa4f944

mauicv added 10 commits May 16, 2023 17:19

Fix tests and typing

2af6641

Update tests

6409bb7

Update optional dep tests for LOFTorch

023b4c3

Fix issues in _lof

a88630a

Add comments to fit and score logic

c13572d

Remove shape comments

6000309

Fix tests

3f091ad

Update lof docstrings

d52c2f6

Update docstrings for lof backend

b7b4a66

Minor change

b8842bd

mauicv requested review from ojcobb and ascillitoe May 24, 2023 15:27

ojcobb reviewed May 26, 2023

ascillitoe reviewed May 30, 2023

ascillitoe reviewed May 30, 2023

ascillitoe approved these changes May 30, 2023

mauicv mentioned this pull request Jun 12, 2023

Normalizer and Aggregator Docstrings #805

Open

mauicv added 2 commits June 12, 2023 10:38

Update score docstring

5cec76d

Update _compute_K docstring

32de716

This was referenced Jun 12, 2023

Review data_type in meta #807

Open

Replace anomalous with outlying throughout #808

Open

ojcobb approved these changes Jun 12, 2023

Merge branch 'master' into feature/lof-od

7610baa

mauicv merged commit 5e69f4b into SeldonIO:master Jun 12, 2023
11 checks passed
