Skip to content

Commit f86d4ac

Browse files
committed
operations.pyx: Better estimate the error of NRI and NTI.
When comparing to the tolerance we allow when computing the NRI or the NTI, take into account the relative error coming from the 1/std term too. This is a crude approximation (error_std ~ diff between max and min of the last nmin std values), but reasonable(?). In the few tests I have done, it seems to be underestimating the real error though. Still, it could be useful approximation.
1 parent b297dea commit f86d4ac

File tree

1 file changed

+12
-5
lines changed

1 file changed

+12
-5
lines changed

ete4/core/operations.pyx

Lines changed: 12 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -901,6 +901,7 @@ def leaves_vs_random(tree, leaves, metric, tolerance=0.05, nmin=5, nmax=1000):
901901
all_leaves = list(tree.leaves())
902902
s = 0 # sum of the values
903903
s2 = 0 # sum of the squares (for the standard deviation)
904+
last_stds = [] # to estimate the error on the std
904905
for n in range(1, nmax+1):
905906
random_leaves = random.sample(all_leaves, len(leaves))
906907

@@ -909,14 +910,20 @@ def leaves_vs_random(tree, leaves, metric, tolerance=0.05, nmin=5, nmax=1000):
909910
s += x
910911
s2 += x*x
911912

912-
# Estimate the mean, the standard deviation, and total relative error
913-
# (assuming it comes mostly from the error on the mean, not from std).
913+
# Estimate the mean and the standard deviation.
914914
mean = s / n # mean
915915
std = sqrt(s2 / n - mean*mean) # standard deviation
916916

917-
error_mean = std / sqrt(n) # estimated absolute error of the mean
918-
error = error_mean / max(abs(mean - x0), 1e-12) # relative error
919-
# FIXME: If error(std) / std is big, we should add it!
917+
# Estimate the errors.
918+
error_mean = std / sqrt(n) # of the mean
919+
920+
last_stds.append(std)
921+
error_std = max(last_stds) - min(last_stds) # of the std (crude)
922+
if len(last_stds) > nmin:
923+
last_stds.pop(0)
924+
925+
error = (error_mean / max(abs(mean - x0), 1e-12) + # relative error
926+
error_std / max(std, 1e-12))
920927

921928
if n > nmin and error < tolerance:
922929
break

0 commit comments

Comments
 (0)