All scores are zero except the last, which is a NaN #45

NAThompson · 2023-11-10T19:00:34Z

To reproduce:

#!/usr/bin/env python3
import numpy
from deepod.models.time_series.dif import DeepIsolationForestTS

arr = numpy.empty(shape=(33, 741))
arr.fill(1.0)

# Also reproduces the issue:
# arr = numpy.random.rand(33, 741)

dif = DeepIsolationForestTS(device=None, seq_len=min(arr.shape[0], 100), max_samples=min(arr.shape[0], 256), hidden_dims=100)
dif.fit(arr)
scores = dif.decision_function(arr)
print(scores)

Output:

$VIRTUAL_ENV/lib/python3.9/site-packages/deepod/models/tabular/dif.py:256: RuntimeWarning: invalid value encountered in divide
  scores = 2 ** (-depth_sum / (len(clf.estimators_) * _average_path_length([clf.max_samples_])))
[00:00<00:00, 1088.64it/s]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0. nan]
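A note on the warning text: NumPy reports "invalid value encountered" for a division such as 0/0, whereas a nonzero numerator over zero gives "divide by zero" and an infinity. The sketch below uses NumPy only, not DeepOD. It assumes that both `depth_sum` and the average path length are zero when the expression on line 256 runs, which I have not verified against the library. The estimator count is a made-up value.

```python
import warnings
import numpy as np

# Assumed values, NOT read from DeepOD: both numerator and denominator zero.
depth_sum = np.array([0.0])
n_estimators = 6                    # hypothetical estimator count
avg_path_length = np.array([0.0])   # assumption: path length of 0

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Same shape as the expression quoted in the RuntimeWarning above.
    scores = 2 ** (-depth_sum / (n_estimators * avg_path_length))

print(scores)
print([str(w.message) for w in caught])
```

If that assumption holds, the NaN would come from the normalisation step rather than from the input data, which would fit the fact that constant and random inputs both reproduce it.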


NAThompson · 2023-11-15T18:06:03Z

Some more clues as to the source of this issue: the first seq_len - 1 scores are always zero (which should perhaps be the title of the issue):

#!/usr/bin/env python3
import numpy
from deepod.models.time_series.dif import DeepIsolationForestTS

arr = numpy.random.rand(256, 256)

dif = DeepIsolationForestTS(device='cpu', seq_len=10, max_samples=min(arr.shape[0], 256), hidden_dims=100)
dif.fit(arr)
scores = dif.decision_function(arr)
print(scores)
# Output:
[    0.             0.             0.             0.
     0.             0.             0.             0.
     0.         31026.44215442 30419.1558587  31492.80078777
 31453.71368052 31583.28407603 31244.13219935 31221.5422394
...

It would appear that these lines are responsible:

DeepOD/deepod/models/time_series/dif.py

Lines 157 to 158 in c2c7566

    
    padding = np.zeros(self.seq_len-1)
    scores = np.hstack((padding, scores))
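To see the effect of those two lines in isolation, here is a minimal NumPy-only sketch. It does not call DeepOD. It assumes the model scores stride-1 sliding windows, so `n_rows - seq_len + 1` scores come back before padding; that is my reading of the output, not something confirmed in the source. The helper names are hypothetical.

```python
import numpy as np

def pad_with_zeros(window_scores, seq_len):
    """Mimic the quoted lines: left-pad with seq_len - 1 zeros."""
    padding = np.zeros(seq_len - 1)
    return np.hstack((padding, window_scores))

def pad_with_nan(window_scores, seq_len):
    """Hypothetical alternative: mark unscored positions as NaN instead."""
    padding = np.full(seq_len - 1, np.nan)
    return np.hstack((padding, window_scores))

# First report: 33 rows with seq_len=33 leaves a single window (assumed stride 1).
n_rows, seq_len = 33, 33
n_windows = n_rows - seq_len + 1
window_scores = np.full(n_windows, np.nan)  # stand-in for the one NaN score

padded = pad_with_zeros(window_scores, seq_len)
print(padded)  # 32 zeros followed by the single window score

# Second report: 256 rows with seq_len=10 gives 9 leading zeros.
padded_second = pad_with_zeros(np.ones(256 - 10 + 1), 10)
print(padded_second[:12])
```

Under that assumption, the first script has exactly one real score and 32 padded zeros, which matches its output. Padding with NaN, as in `pad_with_nan`, is one possible way to distinguish "no score available" from "score of zero"; whether that suits the library's callers is for the maintainers to judge.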
