DataCollocations.jl provides non-parametric data collocation functionality for smoothing timeseries data and estimating derivatives.
DataCollocations.jl offers two distinct methodologies for data collocation, each optimized for different data characteristics:
Kernel smoothing, a robust regression-based approach for handling noisy measurements:
- Multiple kernel functions: Epanechnikov, Uniform, Triangular, Gaussian, Quartic, Triweight, Tricube, Cosine, Logistic, Sigmoid, Silverman
- Automatic bandwidth selection: Chooses a bandwidth when none is supplied, balancing the bias-variance tradeoff
- Noise robustness: Designed to handle measurement noise and outliers
- Regression splines: Smoothed fit that doesn't necessarily pass through data points
- Best for: Experimental data with significant measurement noise
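
The idea behind kernel smoothing can be sketched in plain Julia as a kernel-weighted local linear fit. This is an illustration under stated assumptions, not DataCollocations.jl's actual implementation: the helper names `epanechnikov` and `local_linear`, the bandwidth `h = 0.5`, and the sine term standing in for measurement noise are all hypothetical.

```julia
# Epanechnikov kernel: nonzero only for |x| <= 1
epanechnikov(x) = abs(x) <= 1 ? 0.75 * (1 - x^2) : 0.0

# Kernel-weighted local linear fit around t0 with bandwidth h.
# Returns (smoothed value at t0, slope estimate at t0).
function local_linear(t, y, t0, h)
    w = epanechnikov.((t .- t0) ./ h)     # weights from scaled distances to t0
    X = hcat(ones(length(t)), t .- t0)    # design matrix with columns [1, t - t0]
    A = X' * (w .* X)                     # weighted normal equations: X' W X
    b = X' * (w .* y)                     # right-hand side: X' W y
    β = A \ b
    return β[1], β[2]
end

t = collect(0.0:0.1:10.0)
y = exp.(-0.1 .* t) .+ 0.02 .* sin.(40 .* t)   # deterministic stand-in for noise

fit = [local_linear(t, y, t0, 0.5) for t0 in t]
u_smooth = first.(fit)   # smoothed values; need not pass through the samples
du_est = last.(fit)      # derivative estimates from the local slopes
```

Because the fit is a weighted regression rather than an interpolation, the smoothed curve averages over neighbouring samples. A larger `h` smooths more aggressively at the cost of more bias.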

DataInterpolations.jl integration, an exact interpolation approach for high-quality data:
- Standard interpolation methods: CubicSpline, QuadraticInterpolation, BSpline, Akima, etc.
- Exact fitting: Interpolation curves pass exactly through data points
- Minimal noise assumption: Assumes data points are accurate measurements
- High efficiency: Fast computation for clean, well-sampled data
- Best for: Simulation data or high-precision measurements with minimal noise
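
For contrast with smoothing, the following sketch uses DataInterpolations.jl directly to show what "exact fitting" means. It is independent of `collocate_data`; the sample function, grid spacing, and variable names are assumptions chosen for illustration.

```julia
using DataInterpolations

t = collect(0.0:0.5:10.0)
u = exp.(-0.1 .* t)          # clean samples of a known function

# DataInterpolations constructors take (values, times)
itp = CubicSpline(u, t)

# The spline reproduces the samples at the original time points
max_knot_error = maximum(abs.(itp.(t) .- u))

# Value and first derivative at a point between samples
u_mid = itp(0.25)
du_mid = DataInterpolations.derivative(itp, 0.25)
```

Since the curve passes through every sample, any noise in the data is carried into the derivative estimate, which is why this approach suits clean data.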

Choose a method based on the characteristics of your data:

| Data Characteristics | Recommended Method | Reason |
|---|---|---|
| Experimental measurements with noise | Kernel smoothing | Robust to noise, provides smoothed estimates |
| Simulation results | DataInterpolations | Exact, efficient, preserves accuracy |
| Sparse, clean data | DataInterpolations (CubicSpline) | Exact interpolation between points |
| Dense, noisy data | Kernel smoothing (Epanechnikov) | Optimal noise handling |
| Very noisy data | NoiseRobustDifferentiation.jl | Specialized for heavy noise |

Key features:

- Multiple kernel functions for data smoothing
- Automatic bandwidth selection
- Support for DataInterpolations.jl integration
- Derivative estimation from noisy data
- Efficient implementation with pre-allocated arrays
Since this package is not yet registered, install it directly from GitHub:

```julia
using Pkg
Pkg.add(url="https://github.com/SciML/DataCollocations.jl")
```

Once registered in the General registry:

```julia
using Pkg
Pkg.add("DataCollocations")
```

A minimal example that smooths an ODE solution and estimates its derivatives:

```julia
using DataCollocations
using OrdinaryDiffEq

# Generate some sample data
f(u, p, t) = p .* u
prob = ODEProblem(f, [1.0], (0.0, 10.0), [-0.1])
t = collect(0.0:0.1:10.0)
data = Array(solve(prob, Tsit5(); saveat=t))

# Perform collocation to estimate derivatives and smooth data
u′, u = collocate_data(data, t, TriangularKernel(), 0.1)

# u′ contains the estimated derivatives
# u contains the smoothed data
```

DataCollocations.jl supports multiple kernel functions for noisy data:

Bounded Support Kernels (support on [-1, 1]):

- `EpanechnikovKernel()`
- `UniformKernel()`
- `TriangularKernel()` (default)
- `QuarticKernel()`
- `TriweightKernel()`
- `TricubeKernel()`
- `CosineKernel()`

Unbounded Support Kernels:

- `GaussianKernel()`
- `LogisticKernel()`
- `SigmoidKernel()`
- `SilvermanKernel()`

With DataInterpolations.jl loaded, you can use interpolation methods:

```julia
using DataInterpolations

# Use interpolation to generate data at intermediate timepoints
tpoints_sample = 0.05:0.1:9.95
u′, u = collocate_data(data, t, tpoints_sample, LinearInterpolation)
```

`collocate_data` has two call forms, one for kernel smoothing and one for interpolation:

```julia
u′, u = collocate_data(data, tpoints, kernel=TriangularKernel(), bandwidth=nothing)
u′, u = collocate_data(data, tpoints, tpoints_sample, interp, args...)
```

Arguments:

- `data`: Matrix where each column is a snapshot of the timeseries
- `tpoints`: Time points corresponding to data columns
- `kernel`: Kernel function for smoothing (default: `TriangularKernel()`)
- `bandwidth`: Smoothing bandwidth (auto-selected if `nothing`)
- `tpoints_sample`: Sample points for interpolation method
- `interp`: Interpolation method from DataInterpolations.jl

Returns:

- `u′`: Estimated derivatives
- `u`: Smoothed data

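
As a usage sketch of the kernel form, the example below passes noisy data and a non-default kernel, leaving the bandwidth to be auto-selected. It relies only on the call form and argument descriptions listed above; the synthetic signal, the noise level, the random seed, and the assumption that the outputs have the same shape as `data` are illustrative rather than taken from the package documentation.

```julia
using DataCollocations
using Random

Random.seed!(1234)                       # reproducible synthetic noise
t = collect(0.0:0.1:10.0)
clean = exp.(-0.1 .* t)

# One state variable, one column per time point
data = reshape(clean .+ 0.01 .* randn(length(t)), 1, :)

# Kernel given explicitly; bandwidth omitted, so it is auto-selected
u′, u = collocate_data(data, t, EpanechnikovKernel())
```

Passing an explicit fourth argument, as in the earlier example with `0.1`, overrides the automatic choice.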
Contributions are welcome! Please see the contributing guidelines for more information.

Related packages:

- DiffEqFlux.jl - Neural differential equations
- DataInterpolations.jl - Interpolation methods
- OrdinaryDiffEq.jl - ODE solvers
- NoiseRobustDifferentiation.jl - Specialized library for estimating derivatives from very noisy data
If you use DataCollocations.jl in your research, please cite the collocation methodology paper:

```bibtex
@article{roesch2021collocation,
  title={Collocation based training of neural ordinary differential equations},
  author={Roesch, Elisabeth and Rackauckas, Christopher and Stumpf, Michael P. H.},
  journal={Statistical Applications in Genetics and Molecular Biology},
  volume={20},
  number={2},
  pages={37--49},
  year={2021},
  publisher={De Gruyter},
  doi={10.1515/sagmb-2020-0025},
  url={https://doi.org/10.1515/sagmb-2020-0025}
}
```