While a number of production-ready libraries already exist, such as:
- MOE
- Spearmint
- hyperopt
- GPyOpt

the pieces are simple enough to assemble ourselves. Simply put, a Bayesian optimizer requires:
**A Gaussian process** (`sklearn.gaussian_process` as `gp`)
- From which we will predict the posterior distribution of the target function (the error function of our models).
- Because a Gaussian distribution is completely defined by its mean and variance, a Gaussian process is likewise completely defined by its mean function and covariance function.
- Until math has another revolution and we discover that we knew nothing all along (which I seem to find a lot), we can treat this as a pretty safe assumption (GP ~ mean function & covariance function).

A GP is a popular probability model because it induces a posterior distribution over the loss function that is analytically tractable. This allows us to update our beliefs about what the loss function looks like after we have computed the loss for a new set of hyperparameters.
https://thuijskens.github.io/2016/12/29/bayesian-optimisation/
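As a minimal sketch of the Gaussian process piece with `sklearn.gaussian_process` (the Matérn kernel and the toy observations are assumptions for illustration, not anything prescribed here):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# A few observed (hyperparameter, loss) pairs -- placeholder values for illustration.
X_obs = np.array([[0.1], [0.4], [0.9]])
y_obs = np.array([0.52, 0.31, 0.47])

# Fit the GP; it is fully specified by its mean and covariance (kernel) functions.
gp_model = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6, normalize_y=True)
gp_model.fit(X_obs, y_obs)

# Posterior mean and standard deviation of the loss over candidate points.
X_cand = np.linspace(0, 1, 100).reshape(-1, 1)
mu, sigma = gp_model.predict(X_cand, return_std=True)
```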
**An acquisition function**
- Which decides at which point of our target function we want to sample next.
- A number of well-documented acquisition functions exist, listed below:
- **Probability of Improvement** (PI)
  - Looks where the function's **improvement is most likely**.
  - Can lead to odd behavior because it only considers whether a point is likely to beat the current minimum, not the magnitude of the possible improvement.
- **Expected Improvement (EI)** (also written MEI? should confirm)
  - Looks where the function **may improve the most**, aka *maximal expected utility*.
  - EI(x) = 𝔼[max{0, f(x) − f(x̂)}], where f(x̂) is the current best observed value (a code sketch of EI follows this list).
  - Crowd favorite.
- **Entropy search**
  - Improves the function by **minimizing the uncertainty** of any predicted optimum.
- **Upper Confidence Bound** (UCB)
  - Samples where the posterior's upper confidence bound (mean plus a multiple of the standard deviation) is highest.
  - Exploits uncertainty by looking where the upper bound is still poorly pinned down.
- **Maximum probability of improvement** (MPI)
- **PMAX**
- **IEMAX**
- **GP-Hedge**
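Since the notebook settles on Expected Improvement just below, here is a hedged sketch of EI in closed form. Note the formula above is written in the maximization convention; for minimizing an error the improvement term flips to f(x̂) − f(x). The `xi` exploration bonus and the helper name `expected_improvement` are my own choices, not from the sources above.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(X_cand, gp_model, y_best, xi=0.01):
    """Expected Improvement for a minimization problem.

    X_cand   : candidate points, shape (n, d)
    gp_model : a fitted sklearn GaussianProcessRegressor
    y_best   : lowest loss observed so far, f(x_hat)
    xi       : small exploration bonus (an assumption, not from the text)
    """
    mu, sigma = gp_model.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-9)          # avoid division by zero
    improvement = y_best - mu - xi           # expected gap below the incumbent
    z = improvement / sigma
    return improvement * norm.cdf(z) + sigma * norm.pdf(z)
```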
Following the crowd, this notebook will use the Expected Improvement function, for reasons I may revisit this notebook to explain.
With these two parts our program should:
- Given observed values of the target function f(x), update the posterior expectation of f using the Gaussian process.
- Find new_x that maximises EI: new_x = argmax EI(x).
- Compute the value of f(new_x).
- Update the posterior expectation of f.
- Repeat this process for n_iterations.
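Putting the two parts together, a minimal loop along the lines of the steps above might look like the sketch below. It reuses the `expected_improvement` helper sketched earlier; the kernel, the dense candidate grid, and the name `bayesian_optimize` are assumptions of this sketch, not a fixed recipe.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def bayesian_optimize(f, bounds, n_init=5, n_iterations=15, seed=0):
    """Minimal 1-D Bayesian optimization loop (minimization)."""
    rng = np.random.default_rng(seed)

    # Generate the priors: a handful of random evaluations of f.
    X = rng.uniform(bounds[0], bounds[1], size=(n_init, 1))
    y = np.array([f(x[0]) for x in X])

    gp_model = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6, normalize_y=True)
    X_cand = np.linspace(bounds[0], bounds[1], 1000).reshape(-1, 1)

    for _ in range(n_iterations):
        # 1. Update the posterior expectation of f with the observed values.
        gp_model.fit(X, y)
        # 2. new_x = argmax EI(x) over the candidate grid.
        ei = expected_improvement(X_cand, gp_model, y_best=y.min())
        new_x = X_cand[np.argmax(ei)]
        # 3. Compute f(new_x) and fold it back into the observations.
        X = np.vstack([X, [new_x]])
        y = np.append(y, f(new_x[0]))

    best = np.argmin(y)
    return X[best, 0], y[best]
```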
We will ultimately be looking at hyperparameters, treating the score or error as a function of those parameters. Here we start by treating an example function as an optimization problem: finding the global minimum (of the error).
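To make that concrete, here is a hypothetical 1-D error function together with 15-point grid and random searches, the same budget the plots below use; the function itself is made up for illustration, and the real example in the figures may differ. It reuses `bayesian_optimize` from the sketch above.

```python
import numpy as np

# A hypothetical error surface with a local and a global minimum.
def example_loss(x):
    return np.sin(3 * x) + 0.3 * (x - 1.5) ** 2

bounds = (0.0, 4.0)

# Grid search: 15 evenly spaced parameter values.
grid_x = np.linspace(bounds[0], bounds[1], 15)
grid_best = grid_x[np.argmin(example_loss(grid_x))]

# Random search: 15 uniformly drawn parameter values.
rng = np.random.default_rng(42)
rand_x = rng.uniform(bounds[0], bounds[1], size=15)
rand_best = rand_x[np.argmin(example_loss(rand_x))]

# Bayesian optimization: 5 prior points + 10 acquisitions = the same 15 evaluations.
bo_best, bo_loss = bayesian_optimize(example_loss, bounds, n_init=5, n_iterations=10)
```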
Using 15 points, finding minima.

Grid search misses the true global minimum, as the true minimum lies outside its discrete grid points. Random search also misses the true global minimum, but could have done better than grid search.
Generating priors (initial 5).

Updating the process with each acquisition. Note that the red line maps to the acquisition function in the bottom panel and indicates where to look next.

The Bayesian optimizer finds the global optimum, confirms it is global, and homes in on the true minimum.
- Grid search would search the parameter space symmetrically and systematically.
  - thorough, inefficient; uniformity between samples may miss details
  - weak in higher-dimensional spaces
- Random search would search the parameter space randomly.
  - efficient, less thorough, reliant on sufficient iterations
  - stronger in higher-dimensional spaces

Neither learns from previously selected points in the parameter space.

- Bayesian optimization, however, does learn from previous points, and works effectively as the dimensionality of the space grows.
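For real hyperparameters the same contrast shows up in the APIs: scikit-learn ships grid and random search directly, while Bayesian search needs a separate tool such as the fmfn/BayesianOptimization package linked at the end. A small sketch of the first two, assuming an SVC on the iris data with placeholder parameter ranges:

```python
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVC

X_iris, y_iris = load_iris(return_X_y=True)

# Grid search: every combination, symmetric and systematic.
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.01, 0.1, 1]}
grid = GridSearchCV(SVC(), param_grid, cv=5).fit(X_iris, y_iris)

# Random search: a fixed budget of random draws from the same space.
param_dists = {"C": loguniform(1e-1, 1e2), "gamma": loguniform(1e-2, 1e0)}
rand = RandomizedSearchCV(SVC(), param_dists, n_iter=12, cv=5, random_state=0).fit(X_iris, y_iris)

print(grid.best_params_, rand.best_params_)
```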
Again, by treating the score or error as a function of the parameters, we pose an optimization problem for an example function, this time finding the global maximum (of the score). In this particular example there are two maxima, which would otherwise lead grid and random search into confusion.
Using 25 points, finding maxima.
Generating priors (initial 5).

Updating the process with each acquisition. Note the point on the acquisition map, which indicates where to look next.

Bayesian (20 points): at this point the assumed function is fairly close to the true one.
Again...
- Grid search would search the parameter space symmetrically and systematically.
  - thorough, inefficient; uniformity between samples may miss details
  - weak in higher-dimensional spaces
- Random search would search the parameter space randomly.
  - efficient, less thorough, reliant on sufficient iterations
  - stronger in higher-dimensional spaces

Neither learns from previously selected points in the parameter space.

- Bayesian optimization, however, does learn from previous points, and works effectively as the dimensionality of the space grows.
https://towardsdatascience.com/shallow-understanding-on-bayesian-optimization-324b6c1f7083
https://thuijskens.github.io/2016/12/29/bayesian-optimisation/
https://github.com/fmfn/BayesianOptimization
https://sheffieldml.github.io/GPyOpt/