Online (single-pass) algorithms for statistical measures, based on the Julia package OnlineStats.jl by Josh Day. Useful for streaming and big data.
This package is available on NuGet and supports .NET Standard 2.0.

Statistic | Description |
---|---|
`Mean` | The univariate mean |
`Sum` | The overall sum of `double` observations |
`Sum<T>` | The overall sum of any type that implements `IAdditionOperators` and `IAdditiveIdentity` |
`Variance` | The univariate variance |
`Extrema` | The min and max of `double` observations |
`Extrema<T>` | The min and max of any type that implements `IComparable<T>` |
`Moments` | The mean, variance, skewness, and (excess) kurtosis |
`EmpiricalCdf` | Approximate order statistics (quantiles) |
`CountMap<T>` | Counts for each unique value |
`Histogram` | A histogram with specified bin edges |

Distribution | Description |
---|---|
`Normal` | The univariate mean and variance |
`Beta` | The number of successes and failures |

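To make "online" concrete, here is a minimal, self-contained sketch of a single-pass mean and variance using Welford's update, with a pairwise merge of two partial results. `WelfordSketch` is a hypothetical illustration type and is not part of this package; the package's own `Mean` and `Variance` may be implemented differently.

```csharp
using System;

// Two partial results built independently, then combined without revisiting the data.
var left = new WelfordSketch();
var right = new WelfordSketch();
foreach (var x in new[] { 2.0, 4.0, 4.0, 4.0 }) left.Fit(x);
foreach (var x in new[] { 5.0, 5.0, 7.0, 9.0 }) right.Fit(x);
left.Merge(right);
Console.WriteLine($"n={left.Count}, mean={left.Mean}, variance={left.Variance}");

// Hypothetical illustration type; not part of the RunningStatistics package.
public sealed class WelfordSketch
{
    private double _m2; // running sum of squared deviations from the current mean

    public long Count { get; private set; }
    public double Mean { get; private set; }

    // Sample variance; NaN until at least two observations have been seen.
    public double Variance => Count > 1 ? _m2 / (Count - 1) : double.NaN;

    // Update the state with one observation in O(1) time and memory.
    public void Fit(double x)
    {
        Count++;
        var delta = x - Mean;
        Mean += delta / Count;
        _m2 += delta * (x - Mean);
    }

    // Combine another partial result into this one (pairwise update).
    public void Merge(WelfordSketch other)
    {
        if (other.Count == 0) return;
        var total = Count + other.Count;
        var delta = other.Mean - Mean;
        _m2 += other._m2 + delta * delta * Count * other.Count / total;
        Mean += delta * other.Count / total;
        Count = total;
    }
}
```

Each observation is seen once and then discarded, so memory use stays constant regardless of how much data is streamed through.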
All running statistics implement the following interfaces:

    public interface IRunningStatistic
    {
        public long Nobs { get; }
        public void Reset();
        public IRunningStatistic CloneEmpty();
        public IRunningStatistic Clone();
        public void UnsafeMerge(IRunningStatistic other);
    }

    public interface IRunningStatistic<TObs> : IRunningStatistic
    {
        public void Fit(TObs value);
        public void Fit(TObs value, long count);
        public void Fit(IEnumerable<TObs> values);
        public void Fit(IEnumerable<KeyValuePair<TObs, long>> keyValuePairs);
        public new IRunningStatistic<TObs> CloneEmpty();
        public new IRunningStatistic<TObs> Clone();
        public void UnsafeMerge(IRunningStatistic<TObs> other);
    }

    public interface IRunningStatistic<TObs, TSelf> : IRunningStatistic<TObs>
        where TSelf : IRunningStatistic<TObs, TSelf>
    {
        public new TSelf CloneEmpty();
        public new TSelf Clone();
        public void Merge(TSelf other);
    }

The API is layered as three interfaces of increasing specificity. `IRunningStatistic` is the most generic and allows mixed running statistics to be stored in one collection. `IRunningStatistic<TObs>` adds the type of observation that the running statistic can fit. Finally, `IRunningStatistic<TObs, TSelf>` adds the concrete type of the running statistic, which allows precisely typed cloning and merging.

The abstract class `RunningStatisticBase` implements the `IRunningStatistic<TObs, TSelf>` interface and provides default implementations where possible.
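
As a rough sketch of how the three layers might be used together, the snippet below keeps two different statistics in one collection and fits them through the shared `IRunningStatistic<double>` interface. It assumes that `Mean` and `EmpiricalCdf` both implement `IRunningStatistic<double>` and that `Clone` on a concrete `Mean` returns a `Mean`; the member names come from the interface listing above, but the exact behavior of each call is not verified here.

```csharp
using System;
using System.Collections.Generic;
using RunningStatistics;

// Assumption: both types accept double observations and so share this interface.
var stats = new List<IRunningStatistic<double>> { new Mean(), new EmpiricalCdf() };

foreach (var stat in stats)
{
    stat.Fit(new[] { 1.0, 2.0, 3.0 }); // Fit(IEnumerable<TObs> values)
    stat.Fit(4.0, 2);                  // Fit(TObs value, long count)
    Console.WriteLine($"{stat.GetType().Name}: {stat.Nobs} observations");
}

// The non-generic layer knows nothing about the observation type,
// but still exposes Nobs, Reset, and the untyped clone methods.
IRunningStatistic untyped = stats[0];
untyped.Reset();

// The most specific layer keeps the concrete type through cloning and merging.
var mean = new Mean();
mean.Fit(1.0);
var copy = mean.Clone(); // assumed to be typed as Mean, so Merge(Mean) is available
copy.Merge(mean);
```

Working against the least specific interface that still has the members you need keeps collection code general, while the typed `Merge` catches mismatched statistic types at compile time.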

See the `RunningStatistics.Examples` project for more examples.
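
    using System;
    using RunningStatistics;

    var mean1 = new Mean();
    var mean2 = new Mean();
    var ecdf = new EmpiricalCdf();

    // Feed 1000 uniform random observations to each statistic, one at a time.
    var rng = new Random();
    for (var i = 0; i < 1000; i++)
    {
        var x = rng.NextDouble();
        mean1.Fit(x);
        mean2.Fit(2 * x);
        ecdf.Fit(x);
    }

    // Fold mean2's observations into mean1.
    mean1.Merge(mean2);

    // Approximate first quartile of the observed values.
    var q1 = ecdf.Quantile(0.25);
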
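Merging is most useful when data arrives in separate batches or partitions. The following sketch, which assumes only the `CloneEmpty`, `Fit`, `Merge`, and `Nobs` members listed in the interfaces above, fits each batch into its own empty clone and then folds it into a running total. The `chunks` list is hypothetical stand-in data.

```csharp
using System;
using System.Collections.Generic;
using RunningStatistics;

// Hypothetical data source: each inner array stands in for one batch or partition.
var chunks = new List<double[]>
{
    new[] { 1.0, 2.0, 3.0 },
    new[] { 4.0, 5.0 },
    new[] { 6.0 },
};

var total = new Mean();
foreach (var chunk in chunks)
{
    // Same statistic type as total, but with no observations yet.
    var chunkMean = total.CloneEmpty();
    chunkMean.Fit(chunk);

    // Combine the per-chunk result into the overall statistic.
    total.Merge(chunkMean);
}

Console.WriteLine($"{total.Nobs} observations: {total}");
```

Because each chunk is fitted independently, the loop body could in principle run on separate threads or machines, with only the final `Merge` calls needing to be serialized.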

If you are inheriting from `RunningStatisticBase`, the biggest change is the removal of the `GetStatsString` method. Override the `ToString` method instead.

Before:

    protected override string GetStatsString() => $"μ={Mean}, σ²={Variance}";

After:

    public override string ToString() => base.ToString() + $" | μ={Mean}, σ²={Variance}";