- HP.Choice
- HP.PChoice
- HP.Uniform
- HP.QuantUniform
- HP.Normal
- HP.QuantNormal
- HP.LogNormal
- HP.QuantLogNormal
- HP.LogQuantNormal
- HP.LogUniform
- HP.QuantLogUniform
- HP.LogQuantUniform
Choice(
label::Symbol,
options::Array{T,1} where T,
) -> TreeParzen.HP.Choice
Randomly choose which option will be extracted, and also supply the list of options. The elements of options can themselves be nested stochastic expressions; in this case, the stochastic choices that only appear in some of the options become conditional parameters. Initially each choice is given an equal weight. To favour some choices over others, see PChoice.
Example:
example_space = Dict(
:example => HP.Choice(:example, [1.0, 0.9, 0.8, 0.7]),
)
Example of conditional parameters:
example_space_conditional = Dict(
:example => HP.Choice(:example, [
(:case1, HP.Uniform(:param1, 0.0, 1.0)),
(:case2, HP.Uniform(:param2, -10.0, 10.0)),
]),
)
:param1 and :param2 are examples of conditional parameters: each features in the returned sample only for a particular value of :example. If :example selects the first option, then :param1 is used but not :param2; if :example selects the second option, then :param2 is used but not :param1.
Example with nested arrays of different lengths:
example_space_nested = Dict(
:example => HP.Choice(:example, [
[HP.Normal(:d0_c0, 0.0, 5.0)],
[HP.Normal(:d1_c0, 0.0, 5.0), HP.Normal(:d1_c1, 0.0, 5.0)],
[
HP.Normal(:d2_c0, 0.0, 5.0), HP.Normal(:d2_c1, 0.0, 5.0),
HP.Normal(:d2_c2, 0.0, 5.0),
],
]),
)
Note that all labels (the symbol given as the first parameter to all the HP.* functions) must be unique. These labels identify the parts of the space that the optimiser learns from over iterations.
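For instance, labels must remain distinct even across branches of a choice that are never sampled together. A hypothetical sketch (the names :model, :depth and :lr are illustrative, not from the package):

```julia
# Hypothetical space: every label (:model, :depth, :lr) is unique, even
# though :depth and :lr live in different branches of the choice.
example_space_unique = Dict(
    :model => HP.Choice(:model, [
        (:tree, HP.QuantUniform(:depth, 1.0, 10.0, 1.0)),
        (:net, HP.LogUniform(:lr, log(1e-4), log(1e-1))),
    ]),
)
```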
PChoice(
label::Symbol,
probability_options::Array{TreeParzen.HP.Prob,1},
) -> TreeParzen.HP.PChoice
Choose from a list of options with weighted probabilities
struct Prob
    probability::Float64
    option::Any
end
State the weighted probability of an option, for use with PChoice.
- label: Label
- probability_options: Array of Prob objects
Example:
example_space = Dict(
:example => HP.PChoice(
:example,
[
Prob(0.1, 0),
Prob(0.2, 1),
Prob(0.7, 2),
]
)
)
Note that the Prob probability weights must sum to 1.
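Why the weights must sum to 1 can be pictured with a minimal standalone sketch of weighted sampling (an illustration, not TreeParzen's implementation): the weights partition the unit interval, so each option is drawn with frequency equal to its weight.

```julia
using Random, Statistics

# Minimal weighted choice: walk the cumulative weights until the uniform
# draw r falls inside an option's slice of the unit interval.
function weighted_choice(probs, options)
    r = rand()
    c = 0.0
    for (p, o) in zip(probs, options)
        c += p
        r < c && return o
    end
    return options[end]  # guard against floating-point shortfall
end

Random.seed!(1)
draws = [weighted_choice([0.1, 0.2, 0.7], [0, 1, 2]) for _ in 1:10_000]
# Option 2 should appear roughly 70% of the time.
@assert isapprox(mean(draws .== 2), 0.7; atol=0.05)
```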
To look at the distribution we can do the following:
trials = [ask(example_space) for i in 1:1000]
samples = getindex.(getproperty.(trials, :hyperparams), :example)
vals = sort(unique(samples))
counts = sum(samples .== vals'; dims=1)
probs = dropdims(counts'/sum(counts), dims=2)
Gadfly.plot(x=vals, y=probs, Gadfly.Geom.hair, Gadfly.Geom.point, Gadfly.Scale.y_continuous(minvalue=0.0), Gadfly.Guide.xticks(ticks=vals), Gadfly.Guide.yticks(ticks=Float64.(0:0.1:1)))
Uniform(
label::Symbol,
low::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
high::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
) -> TreeParzen.HP.Uniform
Returns a value distributed uniformly between low and high. When optimising, this variable is constrained to a two-sided interval.
example_space = Dict(
:example => HP.Uniform(:example, 0.0, 1.0),
)
where label is the parameter and the returned value is uniformly distributed between low at 0.0 and high at 1.0.
To look at the distribution we can do the following:
trials = [ask(example_space) for i in 1:1000]
samples = getindex.(getproperty.(trials, :hyperparams), :example)
Gadfly.plot(x=samples, Gadfly.Stat.density(bandwidth=0.05), Gadfly.Geom.polygon(fill=true, preserve_order=true))
N.B. the distribution looks like it has tails beyond 0 and 1 due to the use of kernel density estimates, but the sampled values are in fact contained within the specified range, as can be verified:
@show(minimum(samples));
@show(maximum(samples));
minimum(samples) = 0.0013577594712426144
maximum(samples) = 0.9985292682892326
QuantUniform(
label::Symbol,
low::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
high::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
q::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
) -> TreeParzen.HP.QuantUniform
Returns a value uniformly between low and high, with a quantisation. When optimising, this variable is constrained to a two-sided interval.
example_space = Dict(
:example => HP.QuantUniform(:example, 0.0, 10.0, 2.0),
)
where label is the parameter and the returned value is uniformly distributed between low at 0.0 and high at 10.0, with the quantisation q set at 2.0. Valid sampled values would be 0.0, 2.0, 4.0, 6.0, 8.0 and 10.0.
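The quantisation can be illustrated standalone. Assuming the conventional round-to-nearest-multiple rule, round(x / q) * q (an assumption about the implementation, consistent with the values above), every draw from Uniform(0, 10) with q = 2.0 snaps to one of those six values:

```julia
# Round-to-nearest-multiple quantisation (assumed to match the package's
# behaviour): x is snapped to the nearest multiple of q.
quantise(x, q) = round(x / q) * q

q = 2.0
@assert quantise(0.9, q) == 0.0
@assert quantise(1.1, q) == 2.0
@assert quantise(9.3, q) == 10.0
# Any draw from Uniform(0, 10) lands in {0.0, 2.0, ..., 10.0}:
@assert all(quantise(10rand(), q) in 0.0:2.0:10.0 for _ in 1:1000)
```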
To look at the distribution we can do the following:
trials = [ask(example_space) for i in 1:1000]
samples = getindex.(getproperty.(trials, :hyperparams), :example)
vals = sort(unique(samples))
counts = sum(samples .== vals'; dims=1)
probs = dropdims(counts'/sum(counts), dims=2)
Gadfly.plot(x=vals, y=probs, Gadfly.Geom.hair, Gadfly.Geom.point, Gadfly.Scale.y_continuous(minvalue=0.0), Gadfly.Guide.xticks(ticks=vals))
N.B. the falloff at the distribution tails: this is due to the choice of 2.0 for the quantisation, which roughly halves the number of samples that map to each of the two extreme values.
Normal(
label::Symbol,
mu::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
sigma::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
) -> TreeParzen.HP.Normal
Returns a real value that's normally distributed with mean mu and standard deviation sigma. When optimising, this is an unconstrained variable.
example_space = Dict(
:example => HP.Normal(:example, 4.0, 5.0),
)
To look at the distribution we can do the following:
trials = [ask(example_space) for i in 1:1000]
samples = getindex.(getproperty.(trials, :hyperparams), :example)
Gadfly.plot(x=samples, Gadfly.Stat.density(bandwidth=1), Gadfly.Geom.polygon(fill=true, preserve_order=true))
QuantNormal(
label::Symbol,
mu::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
sigma::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
q::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
) -> TreeParzen.HP.QuantNormal
Returns a real value that's normally distributed with mean mu and standard deviation sigma, with a quantisation. When optimising, this is an unconstrained variable.
example_space = Dict(
:example => HP.QuantNormal(:example, 2., 0.5, 1.0),
)
In this example, the values are sampled normally first and then quantised in steps of 1.0, so one only observes 1.0, 2.0, 3.0, etc., centred around 2.0.
To look at the distribution we can do the following:
trials = [ask(example_space) for i in 1:1000]
samples = getindex.(getproperty.(trials, :hyperparams), :example)
vals = sort(unique(samples))
counts = sum(samples .== vals'; dims=1)
probs = dropdims(counts'/sum(counts), dims=2)
Gadfly.plot(x=vals, y=probs, Gadfly.Geom.hair, Gadfly.Geom.point, Gadfly.Scale.y_continuous(minvalue=0.0), Gadfly.Guide.xticks(ticks=vals))
N.B. due to rounding, the observed values do not follow an exactly normal distribution, particularly when sigma is much smaller than the quantisation.
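The distortion is easy to demonstrate standalone (assuming round-to-nearest quantisation, as elsewhere in these notes): with sigma = 0.2 and q = 1.0, almost every draw collapses onto the mean.

```julia
using Random, Statistics
Random.seed!(0)

quantise(x, q) = round(x / q) * q

# sigma (0.2) is much smaller than q (1.0): nearly all quantised draws
# collapse onto the multiple of q nearest the mean, 2.0.
draws = quantise.(2.0 .+ 0.2 .* randn(10_000), 1.0)
@assert mean(draws .== 2.0) > 0.95
```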
LogNormal(
label::Symbol,
mu::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
sigma::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
) -> TreeParzen.HP.LogNormal
Returns a value drawn according to exp(normal(mu, sigma)) so that the logarithm of the sampled value is normally distributed. When optimising, this variable is constrained to be positive.
example_space = Dict(
:example => HP.LogNormal(:example, log(3.0), 1.0),
)
In this example, the log normal distribution will be centred around 3. The distribution is not truncated.
To look at the distribution we can do the following:
trials = [ask(example_space) for i in 1:1000]
samples = getindex.(getproperty.(trials, :hyperparams), :example)
Gadfly.plot(x=samples, Gadfly.Stat.density(bandwidth=0.5), Gadfly.Geom.polygon(fill=true, preserve_order=true), Gadfly.Scale.x_log10, Gadfly.Guide.xlabel("x (log)"))
Note that Gadfly density estimates appear to be wrong in log-scale. Because the mean is hard to read off a log-scale plot, let's inspect it directly:
@show(sum(samples)/length(samples));
sum(samples) / length(samples) = 5.099150419827341
However, kernel density estimates incorrectly suggest the distribution contains density below 0; the true minimum is positive:
@show(minimum(samples));
minimum(samples) = 0.1251765366832742
LogQuantNormal(
label::Symbol,
mu::Union{Float64, TreeParzen.Types.AbstractDelayed},
sigma::Union{Float64, TreeParzen.Types.AbstractDelayed},
q::Union{Float64, TreeParzen.Types.AbstractDelayed},
) -> TreeParzen.HP.LogQuantNormal
Returns a value drawn according to exp(normal(mu, sigma)), with a quantisation, so that the logarithm of the sampled value is normally distributed. When optimising, this variable is constrained to be positive.
example_space = Dict(
:example => HP.LogQuantNormal(:example, log(1e-3), 0.5*log(10), log(sqrt(10))),
)
In this example, the log normal distribution will be centred around 1e-3, with a multiplicative standard deviation of sqrt(10) and a quantisation step at each sqrt(10) multiplier. The distribution is not truncated, so the distinct values are the powers of sqrt(10); in practice we don't observe values beyond 3 to 4 standard deviations from the mean (given the number of samples).
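The log-space quantisation can be sketched standalone, assuming round(x / q) * q is applied to the normal draw before exponentiation (an assumption about the implementation): the result is always an integer power of sqrt(10).

```julia
quantise(x, q) = round(x / q) * q

q = log(sqrt(10))             # quantisation step, in log-space
logdraw = log(1e-3) + 0.3     # a hypothetical normal draw in log-space
value = exp(quantise(logdraw, q))
# The draw snaps back to the nearest power of sqrt(10), here 1e-3 itself:
@assert value ≈ 1e-3
```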
To look at the distribution we can do the following:
trials = [ask(example_space) for i in 1:1000]
samples = getindex.(getproperty.(trials, :hyperparams), :example)
vals = sort(unique(samples))
counts = sum(samples .== vals'; dims=1)
probs = dropdims(counts'/sum(counts), dims=2)
Gadfly.plot(x=vals, y=probs, Gadfly.Geom.hair, Gadfly.Geom.point, Gadfly.Scale.y_continuous(minvalue=0.0), Gadfly.Scale.x_log10, Gadfly.Guide.xlabel("x (log)"))
QuantLogNormal(
label::Symbol,
mu::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
sigma::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
q::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
) -> TreeParzen.HP.QuantLogNormal
Returns a value drawn according to exp(normal(mu, sigma)), with a quantisation, so that the logarithm of the sampled value is normally distributed. When optimising, this variable is constrained to be positive.
example_space = Dict(
:example => HP.QuantLogNormal(:example, log(3.0), 0.5, 2.0),
)
In this example, the log normal distribution will be centred around 3. The distribution is not truncated. The values will be quantised to multiples of 2, i.e. 2.0, 4.0, 6.0, etc.
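In contrast to LogQuantNormal, the quantisation here appears to be applied after exponentiation (an assumption, consistent with the multiples-of-2 values above), so values land on multiples of q rather than on powers of a base:

```julia
quantise(x, q) = round(x / q) * q

# Exponentiate the normal draw first, then quantise in linear space.
value = quantise(exp(log(3.0) + 0.4), 2.0)   # exp(...) ≈ 4.48
@assert value == 4.0
```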
To look at the distribution we can do the following:
trials = [ask(example_space) for i in 1:1000]
samples = getindex.(getproperty.(trials, :hyperparams), :example)
vals = sort(unique(samples))
counts = sum(samples .== vals'; dims=1)
probs = dropdims(counts'/sum(counts), dims=2)
Gadfly.plot(x=vals, y=probs, Gadfly.Geom.hair, Gadfly.Geom.point, Gadfly.Scale.y_continuous(minvalue=0.0), Gadfly.Guide.xticks(ticks=vals))
LogUniform(
label::Symbol,
low::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
high::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
) -> TreeParzen.HP.LogUniform
Returns a value drawn according to exp(uniform(low, high)) such that the logarithm of the return value is uniformly distributed. When optimising, samples are constrained to the interval [exp(low), exp(high)], which is positive.
example_space = Dict(
:example => HP.LogUniform(:example, log(1.0), log(5.0)),
)
To look at the distribution we can do the following:
trials = [ask(example_space) for i in 1:1000]
samples = getindex.(getproperty.(trials, :hyperparams), :example)
Gadfly.plot(x=samples, Gadfly.Stat.density(bandwidth=0.25), Gadfly.Geom.polygon(fill=true, preserve_order=true))
N.B. the distribution looks like it has tails beyond 1 and 5 due to the use of kernel density estimates, but the sampled values are in fact contained within the specified range, as can be verified:
@show(minimum(samples));
@show(maximum(samples));
minimum(samples) = 1.0009891079479816
maximum(samples) = 4.996578568933641
QuantLogUniform(
label::Symbol,
low::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
high::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
q::Union{Float64, TreeParzen.Delayed.AbstractDelayed},
) -> TreeParzen.HP.QuantLogUniform
Returns a value drawn according to exp(uniform(low, high)), with a quantisation q, such that the logarithm of the value is uniformly distributed.
Suitable for a discrete variable with respect to which the objective is "smooth" and gets smoother with the size of the value, but which should be bounded both above and below.
example_space = Dict(
:example => HP.QuantLogUniform(:example, log(1.0), log(5.0), 1.0),
)
In this example, the distribution will be log-uniform sampled from the range 1-5, with quantisation in steps of 1.
To look at the distribution we can do the following:
trials = [ask(example_space) for i in 1:1000]
samples = getindex.(getproperty.(trials, :hyperparams), :example)
vals = sort(unique(samples))
counts = sum(samples .== vals'; dims=1)
probs = dropdims(counts'/sum(counts), dims=2)
Gadfly.plot(x=vals, y=probs, Gadfly.Geom.hair, Gadfly.Geom.point, Gadfly.Scale.y_continuous(minvalue=0.0), Gadfly.Guide.xticks(ticks=vals))
N.B. due to quantisation, values at the extreme ends of the distribution receive fewer samples than they would under the continuous counterpart.
LogQuantUniform(
label::Symbol,
low::Union{Float64, TreeParzen.Types.AbstractDelayed},
high::Union{Float64, TreeParzen.Types.AbstractDelayed},
q::Union{Float64, TreeParzen.Types.AbstractDelayed},
) -> TreeParzen.HP.LogQuantUniform
Returns a value drawn according to exp(uniform(low, high)), quantised in log-space, such that the logarithm of the return value is uniformly distributed. The value is constrained to be positive.
Suitable for searching logarithmically through a space while keeping the number of candidates bounded, e.g. searching a learning rate through 1e-6 to 1e-1.
example_space = Dict(
:example => HP.LogQuantUniform(:example, log(1e-5), log(1), log(10)),
)
In this example, the distribution will be log-uniform sampled in the range 1e-5 to 1, with discrete points occurring at every factor of 10 increase (the log(10) argument for q).
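A standalone sketch of the same construction (quantise the uniform log-space draw, then exponentiate; assumed to mirror LogQuantUniform) shows the candidate set is exactly the powers of 10 from 1e-5 to 1:

```julia
quantise(x, q) = round(x / q) * q

low, high, q = log(1e-5), log(1.0), log(10)
draws = low .+ rand(5_000) .* (high - low)   # uniform in log-space
candidates = sort(unique(exp.(quantise.(draws, q))))
# Every candidate is a power of 10 between 1e-5 and 1:
@assert all(c -> any(isapprox(c, 10.0^k) for k in -5:0), candidates)
@assert length(candidates) <= 6
```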
To look at the distribution we can do the following:
trials = [ask(example_space) for i in 1:1000]
samples = getindex.(getproperty.(trials, :hyperparams), :example)
vals = sort(unique(samples))
counts = sum(samples .== vals'; dims=1)
probs = dropdims(counts'/sum(counts), dims=2)
Gadfly.plot(x=vals, y=probs, Gadfly.Geom.hair, Gadfly.Geom.point, Gadfly.Scale.y_continuous(minvalue=0.0), Gadfly.Scale.x_log10, Gadfly.Guide.xlabel("x (log)"))
N.B. due to quantisation, values at the extreme ends of the distribution receive fewer samples than they would under the continuous counterpart.