Prior Injection

HyperMapper allows users to inject their knowledge into the optimization process. This is done via a prior distribution that describes where in the input space the user expects to find "good" function values. During optimization, HyperMapper combines these priors with its probabilistic model to yield a posterior over "good" function values. The prior guides optimization towards "good" regions of the space and away from "bad" regions, speeding up convergence. Even if the prior is wrong, HyperMapper is robust and can still find well-performing configurations.

For more details on how HyperMapper leverages prior knowledge during optimization, we refer to our paper, πBO: Augmenting Acquisition Functions with User Beliefs for Bayesian Optimization.

@inproceedings{
hvarfner2022pibo,
title={{{PiBO}: Augmenting Acquisition Functions with User Beliefs for Bayesian Optimization}},
author={Carl Hvarfner and Danny Stoll and Artur Souza and Luigi Nardi and Marius Lindauer and Frank Hutter},
booktitle={International Conference on Learning Representations},
year={2022},
@comment={https://openreview.net/forum?id=MMAeCXIa89}
}

πBO's open-source code is released as part of HyperMapper.

Usage example

To use HyperMapper's prior injection, users must provide a prior for each input parameter in the JSON scenario file. HyperMapper automatically leverages prior information whenever priors are provided. For example, we can define priors for the CurrinExp function with:

{
    "application_name": "currinexp",
    "optimization_objectives": ["Value"],
    "optimization_iterations": 20,
    "input_parameters" : {
        "x1": {
            "parameter_type" : "ordinal",
            "values" : [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1],
            "prior" : [0.3, 0.15, 0.15, 0.1, 0.075, 0.075, 0.05, 0.025, 0.025, 0.025, 0.025]
        },
        "x2": {
            "parameter_type" : "ordinal",
            "values" : [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1],
            "prior" : [0.025, 0.025, 0.025, 0.025, 0.05, 0.075, 0.075, 0.1, 0.15, 0.15, 0.3]
        }
    }
}

Here, each prior is a list that gives, for each ordinal value in order, the probability of that value being "good". Since the CurrinExp function has its optimum at X = [0, 1], the prior for x1 places the most probability on 0 and the prior for x2 places the most probability on 1.
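Before launching a run, it can help to confirm that each list-valued prior has exactly one entry per value and sums to 1. The sketch below is not part of HyperMapper: `check_discrete_prior` is a hypothetical helper, and whether HyperMapper itself normalizes or rejects such lists is not stated here.

```python
import math

def check_discrete_prior(name, param):
    """Return a list of problems found in a list-valued prior (hypothetical helper)."""
    problems = []
    values, prior = param["values"], param["prior"]
    # One probability is expected per listed value.
    if len(values) != len(prior):
        problems.append(f"{name}: {len(values)} values but {len(prior)} prior entries")
    if any(p < 0 for p in prior):
        problems.append(f"{name}: negative probability")
    # Assumption: the list is meant to be a normalized probability distribution.
    if not math.isclose(sum(prior), 1.0, abs_tol=1e-9):
        problems.append(f"{name}: prior sums to {sum(prior):.4f}, not 1")
    return problems

grid = [round(i * 0.1, 1) for i in range(11)]  # 0, 0.1, ..., 1
prior = [0.3, 0.15, 0.15, 0.1, 0.075, 0.075, 0.05, 0.025, 0.025, 0.025, 0.025]

for name, param in {
    "x1": {"values": grid, "prior": prior},
    "x2": {"values": grid, "prior": prior[::-1]},
}.items():
    for problem in check_discrete_prior(name, param):
        print(problem)
```

The same check applies to the list-valued priors of integer and categorical parameters.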

Prior types

These are the prior interfaces for each parameter type.

Real Parameters

The prior can be defined as:

A named distribution over the parameter interval. Supported distributions are gaussian, decay, exponential, and uniform. All four are truncated to the parameter interval and implemented with a Beta distribution.

"x1": {
    "parameter_type" : "real",
    "values" : [-5, 10],
    "prior" : "gaussian"
}

A list with the probability density for different points in the space. Users need only provide the probability densities and HyperMapper will automatically determine the corresponding point for each probability density. HyperMapper will interpolate the values provided to approximate the underlying continuous probability distribution.

"x1": {
    "parameter_type" : "real",
    "values" : [-5, 10],
    "prior" : [0.03, 0.035, 0.04, 0.045, 0.05, 0.055, 0.05, 0.045, 0.04, 0.035, 0.03]
}
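To make the interpolation concrete, here is a minimal sketch of how a density list could be turned into a continuous function. It assumes the densities sit on an evenly spaced grid that includes both interval endpoints and that the interpolation is linear; neither detail is confirmed here, and `interpolate_density` is a hypothetical name, not a HyperMapper function.

```python
def interpolate_density(densities, lower, upper, x):
    """Linearly interpolate densities assumed to lie on an even grid over [lower, upper]."""
    if not lower <= x <= upper:
        return 0.0  # no prior mass outside the parameter interval
    n = len(densities)
    step = (upper - lower) / (n - 1)   # spacing between assumed grid points
    pos = (x - lower) / step           # fractional index of x on the grid
    i = min(int(pos), n - 2)           # left neighbour, clamped at the last segment
    frac = pos - i
    return densities[i] * (1 - frac) + densities[i + 1] * frac

densities = [0.03, 0.035, 0.04, 0.045, 0.05, 0.055, 0.05, 0.045, 0.04, 0.035, 0.03]
for x in (-5, -4.25, 2.5, 10):
    print(x, interpolate_density(densities, -5, 10, x))
```

With 11 densities over [-5, 10], the assumed grid points are 1.5 apart, so the peak density of 0.055 lands at x = 2.5.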

A file containing previously explored configurations. HyperMapper will use these configurations to estimate a prior distribution, using a quantile of the best configurations and a Gaussian Kernel Density Estimator. In this case, the user must provide the file as an additional field in the JSON. The file must be in .csv format and contain the input parameters and objective values, identical to HyperMapper's output file. Optionally, the user can define the quantile used to choose the best configurations (see the "Prior-guided Optimization Hyperparameters" section below).

"prior_estimation_file": "prior_data.csv",
"x1": {
    "parameter_type" : "real",
    "values" : [-5, 10],
    "prior" : "estimate"
}
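The idea behind the estimated prior can be sketched in a few lines: keep the best fraction of the explored configurations and place a Gaussian kernel on each. This is an illustration only, not HyperMapper's implementation. The function name, the `quantile` and `bandwidth` arguments, the fixed bandwidth, the sample data, and the assumption that lower objective values are better are all hypothetical.

```python
import csv
import io
import math

def estimate_prior(rows, param, objective, quantile=0.5, bandwidth=1.0):
    """Build a Gaussian KDE over `param` from the best `quantile` fraction of rows."""
    # Assumption: the objective is minimized, so the best rows have the lowest values.
    ranked = sorted(rows, key=lambda r: float(r[objective]))
    n_best = max(1, math.ceil(quantile * len(ranked)))
    centers = [float(r[param]) for r in ranked[:n_best]]

    def density(x):
        # Average of Gaussian kernels centred on the retained configurations.
        norm = len(centers) * bandwidth * math.sqrt(2 * math.pi)
        return sum(math.exp(-0.5 * ((x - c) / bandwidth) ** 2) for c in centers) / norm

    return density

# Made-up data in the shape of a results file: one input column, one objective column.
csv_text = "x1,Value\n-4.0,9.0\n1.0,0.5\n2.0,0.3\n8.0,7.0\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

density = estimate_prior(rows, "x1", "Value", quantile=0.5)
print(density(1.5), density(8.0))  # more prior mass near the two best configurations
```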

A custom Gaussian distribution, with a chosen mean and standard deviation. Users must provide the mean of the Gaussian for each parameter. Optionally, users can also provide the standard deviation. If the standard deviation is not provided, HyperMapper will set it to half of the parameter's range. Users may also define a Gaussian mixture by providing multiple means, and optionally multiple standard deviations, for each parameter.

"x1": {
    "parameter_type" : "real",
    "values" : [-5, 10],
    "prior" : "custom_gaussian",
    "custom_gaussian_prior_means": [3.1415],
    "custom_gaussian_prior_stds": [1]
}
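As a rough picture of what a custom Gaussian or Gaussian mixture prior looks like, the sketch below evaluates such a density over the parameter interval. It uses the default described above (standard deviation equal to half the range when none is given). The equal weighting of mixture components and the lack of renormalization after truncation are simplifying assumptions, and `mixture_density` is a hypothetical name rather than a HyperMapper function.

```python
import math

def mixture_density(x, lower, upper, means, stds=None):
    """Unnormalized truncated Gaussian-mixture density on [lower, upper]."""
    if stds is None:
        # Default from the text: half of the parameter's range.
        stds = [(upper - lower) / 2] * len(means)
    if not lower <= x <= upper:
        return 0.0  # truncated outside the parameter interval
    total = 0.0
    for mean, std in zip(means, stds):
        total += math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))
    return total / len(means)  # assumption: components are equally weighted

# Single Gaussian matching the example above, then a two-component mixture.
print(mixture_density(3.1415, -5, 10, [3.1415], [1]))
print(mixture_density(0.0, -5, 10, [-2, 6], [1, 2]))
```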

Integer Parameters

The prior can be defined as:

An ordered list of probabilities, with the probability of each integer in the interval being good.

"P1": {
    "parameter_type" : "integer",
    "values": [1, 4],
    "prior": [0.4,0.3,0.2,0.1]
},

A named distribution over the parameter interval. As for real parameters, the supported distributions are gaussian, decay, exponential, and uniform. All four are truncated to the parameter interval and implemented with a Beta distribution.

"x1": {
    "parameter_type" : "integer",
    "values" : [-5, 10],
    "prior" : "gaussian"
}

Ordinal/Categorical Parameters

The prior is a list of probabilities, with the probability of each ordinal/categorical value being "good".

"x1": {
    "parameter_type" : "ordinal",
    "values" : [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1],
    "prior" : [0.3, 0.15, 0.15, 0.1, 0.075, 0.075, 0.05, 0.025, 0.025, 0.025, 0.025]
},