Fix stochastic bugs when sampling using StrictlyIncreasing constraints (#167)

* These changes will be breaking, so up minor version

* Add quantiles and sampling arguments to constraint

* Introduce the more generic `sequence_exists`

* More generic methods

* Update tests

* Update Project.toml

* Rename MidpointOutwards to RandPtOutwards

* Update tests
kahaaga authored Apr 21, 2021
1 parent 7fac470 commit 8cc28f8
Showing 15 changed files with 561 additions and 746 deletions.
3 changes: 2 additions & 1 deletion Project.toml
@@ -2,7 +2,8 @@ name = "UncertainData"
uuid = "dcd9ba68-c27b-5cea-ae21-829cd07325bf"
authors = ["Kristian Agasøster Haaga <[email protected]>"]
repo = "https://github.com/kahaaga/UncertainData.jl.git"
-version = "0.13.1"
+version = "0.14.0"
+

[deps]
Bootstrap = "e28b5b4c-05e8-5b66-bc03-6f0c0a0a06e0"
10 changes: 10 additions & 0 deletions docs/src/changelog.md
@@ -1,6 +1,16 @@

# Changelog

+## UncertainData.jl v0.14
+
+### Breaking changes
+
+- `sequence_exists` replaces `strictly_increasing_sequence_exists`/`strictly_decreasing_sequence_exists`.
+
+### Bug fixes
+
+- Fixed a bug that could occasionally occur for certain types of data when resampling with the `StrictlyIncreasing`/`StrictlyDecreasing` sequential constraints.
+
## UncertainData.jl v0.10.4

### Documentation
18 changes: 5 additions & 13 deletions docs/src/sampling_constraints/sequential_constraints.md
@@ -16,21 +16,13 @@ StrictlyIncreasing
StrictlyDecreasing
```

-## Checking for increasing/decreasing sequences
+## Existence of sequences

-There are a few built-in functions to check if your dataset allows the application of
-certain [sequential sampling constraints](available_constraints). These functions will check
-whether a valid sequence through your collection of uncertain values exists, so that you
-can know beforehand whether a particular resampling scheme is possible to apply to your data.
-
-### Strictly increasing
+`sequence_exists` will check whether a valid sequence through your collection of
+uncertain values exists, so that you can know beforehand whether a particular
+sequential sampling constraint is possible to apply to your data.

 ```@docs
-strictly_increasing_sequence_exists
+sequence_exists
 ```

-### Strictly decreasing
-
-```@docs
-strictly_decreasing_sequence_exists
-```
7 changes: 4 additions & 3 deletions src/resampling/Resampling.jl
@@ -8,7 +8,8 @@ using Reexport
import ..UVAL_COLLECTION_TYPES
import ..UncertainDatasets:
AbstractUncertainValueDataset,
-    UncertainIndexValueDataset
+    UncertainIndexValueDataset,
+    AbstractUncertainDataset
import ..UncertainValues:
UncertainValue,
AbstractUncertainValue
@@ -79,8 +80,8 @@ using Reexport
# Ordered resampling
#########################################
include("ordered_resampling/resample_sequential.jl")
-include("ordered_resampling/resample_uncertaindataset_strictlyincreasing.jl")
-include("ordered_resampling/resample_uncertaindataset_strictlydecreasing.jl")
+include("ordered_resampling/strictlyincreasing.jl")
+include("ordered_resampling/strictlydecreasing.jl")

#########################################
# Resampling with interpolation
105 changes: 76 additions & 29 deletions src/resampling/ordered_resampling/resample_sequential.jl
@@ -1,41 +1,88 @@
 import ..SamplingConstraints:
     SequentialSamplingConstraint,
     OrderedSamplingAlgorithm
 import ..UncertainDatasets:
     AbstractUncertainValueDataset

-"""
-    resample(udata::AbstractUncertainValueDataset,
-        sequential_constraint::SequentialSamplingConstraint;
-        quantiles = [0.0001, 0.9999])
-
-Resample a dataset by imposing a sequential sampling constraint.
-
-Before drawing the realization, all furnishing distributions are truncated to the provided
-`quantiles` range. This is to avoid problems in case some distributions have infinite
-support.
-"""
-resample(udata::AbstractUncertainValueDataset,
-    sequential_constraint::SequentialSamplingConstraint;
-    quantiles = [0.0001, 0.9999])
-
-"""
-    resample(udata::AbstractUncertainValueDataset,
-        constraint::Union{SamplingConstraint, Vector{SamplingConstraint}},
-        sequential_constraint::SequentialSamplingConstraint;
-        quantiles = [0.0001, 0.9999])
-
-Resample a dataset by first imposing regular sampling constraints on the furnishing
-distributions, then applying a sequential sampling constraint.
-
-Before drawing the realization, all furnishing distributions are truncated to the provided
-`quantiles` range. This is to avoid problems in case some distributions have infinite
-support.
-"""
-resample(udata::AbstractUncertainValueDataset,
-    constraint::Union{SamplingConstraint, Vector{SamplingConstraint}},
-    sequential_constraint::SequentialSamplingConstraint;
-    quantiles = [0.0001, 0.9999])
+const AUD = AbstractUncertainDataset
+
+const SC = Union{SamplingConstraint, Vector{S}} where S <: SamplingConstraint
+const SEQ = SequentialSamplingConstraint{O} where O <: OrderedSamplingAlgorithm
+
+const XD = Union{AbstractUncertainDataset, Vector{<:AbstractUncertainValue}}
+
+"""
+    resample(x, ssc::SequentialSamplingConstraint)
+    resample(x, ssc::SequentialSamplingConstraint, c::Union{SamplingConstraint, Vector{SamplingConstraint}})
+
+Sample `x` element-wise such that the samples obey the sequential constraints given by `ssc`.
+Alternatively, apply constraint(s) `c` to `x` *before* sequential sampling is performed.
+A check is performed before sampling to ensure that such a sequence exists.
+Before the check is performed, the distributions in `x` are truncated element-wise
+to the quantiles provided by `c` to ensure they have finite supports.
+
+If `x` is an uncertain index-value dataset, then the sequential constraint is only applied to
+the indices.
+
+    resample!(s, x, ssc::SequentialSamplingConstraint, lqs, uqs)
+
+The same as above, but store the sampled values in a pre-allocated vector `s`, where
+`length(x) == length(s)`. This avoids excessive memory allocations during repeated
+resampling. This requires pre-computing the element-wise lower and upper quantiles
+`lqs` and `uqs` for the initial truncation step.
+
+This method *does not* check for the existence of a strictly increasing sequence in `x`.
+To check that, use [`sequence_exists`](@ref).
+
+See also: [`sequence_exists`](@ref), [`StrictlyIncreasing`](@ref),
+[`StrictlyDecreasing`](@ref), [`StartToEnd`](@ref).
+
+## Examples
+
+```julia
+N = 100
+t = [UncertainValue(Normal, i, 2) for i in 1:N];
+resample(t, StrictlyIncreasing(StartToEnd()))
+```
+
+```julia
+N = 100
+t = [UncertainValue(Normal, i, 2) for i in 1:N];
+
+# Verify that an increasing sequence through `t` exists
+c = StrictlyIncreasing(StartToEnd())
+exists, lqs, uqs = sequence_exists(t, c)
+
+# Pre-allocate sample vector
+s = zeros(Float64, N)
+
+if exists
+    for i = 1:100
+        # Pass the pre-computed quantiles, as required by the signature above
+        resample!(s, t, c, lqs, uqs)
+        # Do something with s
+        # ...
+    end
+end
+```
+"""
+function resample(x::XD, ssc::SEQ)
+    exists, lqs, uqs = sequence_exists(x, ssc)
+    exists || error("Sequence does not exist")
+    _draw(x, ssc, lqs, uqs)
+end
+
+function resample!(s, x::XD, ssc::SEQ, lqs, uqs)
+    _draw!(s, x, ssc, lqs, uqs)
+end
+
+resample(udata::XD, ssc::SEQ, c::Union{SC, AbstractVector{<:SC}}) =
+    resample(constrain(udata, c), ssc)
+
+
+resample(udata::UncertainIndexValueDataset, ssc::SEQ) =
+    resample(udata.indices, ssc), resample(udata.values)
+
+resample(udata::UncertainIndexValueDataset, ssc::SEQ, c::SC) =
+    resample(constrain(udata.indices, c), ssc), resample(constrain(udata.values, c))
+
+resample(udata::UncertainIndexValueDataset, ssc::SEQ, ic::SC, vc::SC) =
+    resample(constrain(udata.indices, ic), ssc), resample(constrain(udata.values, vc))