Fix median sampling error if only 1 distinct value (#377)
This commit addresses an exception that occurs during calculation of the
median sampling error if the data contains only 1 distinct value.

Error being addressed:
    The data appears to lie in a lower-dimensional subspace of the space
    in which it is expressed. This has resulted in a singular data
    covariance matrix, which cannot be treated using the algorithms
    implemented in `gaussian_kde`. Consider performing principle
    component analysis / dimensionality reduction and using
    `gaussian_kde` with the transformed data.
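
The failure described above can be reproduced in isolation. The following is a minimal sketch, assuming only NumPy and SciPy are installed; `kde_fails_on` is a hypothetical helper introduced for illustration and is not part of nannyml. It shows that a column with a single distinct value has zero variance, so `gaussian_kde` cannot build its kernel and raises `LinAlgError`.

```python
import numpy as np
from scipy.stats import gaussian_kde


def kde_fails_on(values) -> bool:
    """Return True if gaussian_kde raises LinAlgError for the given 1-D data.

    Hypothetical helper for illustration only.
    """
    try:
        gaussian_kde(np.asarray(values, dtype=float))
    except np.linalg.LinAlgError:
        # Zero variance gives a singular covariance matrix.
        return True
    return False


constant = [3.0] * 50                # only one distinct value
varied = np.linspace(0.0, 1.0, 50)   # non-zero variance

constant_fails = kde_fails_on(constant)
varied_fails = kde_fails_on(varied)
```

Here `constant_fails` is expected to be true and `varied_fails` false; the exact wording of the exception message may differ between SciPy versions.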
michael-nml authored Apr 19, 2024
1 parent d4f553f commit e98cab9
Showing 1 changed file with 8 additions and 2 deletions.
10 changes: 8 additions & 2 deletions nannyml/sampling_error/summary_stats.py
@@ -7,6 +7,7 @@

 import numpy as np
 import pandas as pd
+import warnings
 from scipy.stats import gaussian_kde, moment

 logger = getLogger(__name__)
@@ -80,8 +81,13 @@ def summary_stats_median_sampling_error_components(col: pd.Series) -> Tuple:
     (median, pdf(median): Tuple[np.ndarray]
     """
     median = col.median()
-    kernel = gaussian_kde(col)
-    fmedian = kernel.evaluate(median)[0]
+    try:
+        kernel = gaussian_kde(col)
+        fmedian = kernel.evaluate(median)[0]
+    except np.linalg.LinAlgError as ex:
+        logger.warning("Suppressing LinAlgError in summary_stats_median_sampling_error_components: %r", ex)
+        warnings.warn(f"Suppressing LinAlgError in summary_stats_median_sampling_error_components: {ex}")
+        fmedian = np.inf
     return (median, fmedian)
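
A standalone sketch of the patched behaviour may help show why `np.inf` is a workable fallback. It assumes the density at the median is later plugged into the textbook asymptotic formula for the standard error of a sample median, 1 / (2 * f(m) * sqrt(n)); whether nannyml combines the components in exactly this way is an assumption here. The names `median_components` and `median_standard_error` are hypothetical stand-ins, not nannyml functions.

```python
import warnings

import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde


def median_components(col: pd.Series):
    """Hypothetical stand-in for the patched function: (median, pdf(median))."""
    median = col.median()
    try:
        fmedian = gaussian_kde(col).evaluate(median)[0]
    except np.linalg.LinAlgError as ex:
        # Single distinct value: treat the density at the median as infinite.
        warnings.warn(f"Suppressing LinAlgError: {ex}")
        fmedian = np.inf
    return median, fmedian


def median_standard_error(components, n: int) -> float:
    """Asymptotic standard error of the median (assumed formula, for illustration)."""
    _, fmedian = components
    return 1.0 / (2.0 * fmedian * np.sqrt(n))


with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # keep the demo output quiet
    constant_components = median_components(pd.Series([5.0] * 100))

constant_se = median_standard_error(constant_components, n=100)
varied_components = median_components(pd.Series(np.linspace(0.0, 1.0, 100)))
```

Under that assumed formula, an infinite density yields a standard error of zero, which matches the intuition that a constant column has no sampling variability in its median.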

