
PosDef error for RBF kernel #402

Open
mjyshin opened this issue Jul 3, 2024 · 3 comments
Labels: question (Further information is requested)

Comments

@mjyshin

mjyshin commented Jul 3, 2024

Hello, I am not sure if this is just on my machine, but I can't seem to sample from prior RBF GPs or posteriors trained on them. For instance, something as simple as

using AbstractGPs, Plots

x = range(0,5,30)
f = GP(SEKernel())
plot(rand(f(x), 10))

results in the error

PosDefException: matrix is not positive definite; Cholesky factorization failed.

Stacktrace:
 [1] checkpositivedefinite
   @ ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/LinearAlgebra/src/factorization.jl:67 [inlined]
 [2] cholesky!(A::Symmetric{Float64, Matrix{Float64}}, ::NoPivot; check::Bool)
   @ LinearAlgebra ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/LinearAlgebra/src/cholesky.jl:269
 [3] cholesky! (repeats 2 times)
   @ ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/LinearAlgebra/src/cholesky.jl:267 [inlined]
 [4] cholesky(A::Symmetric{Float64, Matrix{Float64}}, ::NoPivot; check::Bool)
   @ LinearAlgebra ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/LinearAlgebra/src/cholesky.jl:401
 [5] cholesky (repeats 2 times)
   @ ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/LinearAlgebra/src/cholesky.jl:401 [inlined]
 [6] rand(rng::Random._GLOBAL_RNG, f::AbstractGPs.FiniteGP{GP{ZeroMean{Float64}, SqExponentialKernel{Distances.Euclidean}}, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Diagonal{Float64, FillArrays.Fill{Float64, 1, Tuple{Base.OneTo{Int64}}}}}, N::Int64)
   @ AbstractGPs ~/.julia/packages/AbstractGPs/IOYUf/src/finite_gp_projection.jl:235
 [7] rand(f::AbstractGPs.FiniteGP{GP{ZeroMean{Float64}, SqExponentialKernel{Distances.Euclidean}}, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Diagonal{Float64, FillArrays.Fill{Float64, 1, Tuple{Base.OneTo{Int64}}}}}, N::Int64)
   @ AbstractGPs ~/.julia/packages/AbstractGPs/IOYUf/src/finite_gp_projection.jl:238
 [8] top-level scope
   @ In[128]:3

This happens unless I reduce the number of inputs to something smaller, like range(0,5,20).

Other kernels, e.g.,

x = range(0,5,30)
f = GP(Matern32Kernel())
plot(rand(f(x), 10))

run without any issue. Thank you in advance for your help.

@willtebbutt
Member

Hi @mjyshin, thanks for opening this issue!

The issue you're encountering is to be expected, and will be found in any GP package. It's a consequence of unavoidable numerical error due to finite precision arithmetic, and the eigenvalues of the kernel matrix you are working with being very close to zero.
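You can see this directly by checking the smallest eigenvalue of the kernel matrix (a rough sketch; kernelmatrix is provided by KernelFunctions, which AbstractGPs re-exports):

using AbstractGPs, LinearAlgebra

x = range(0, 5, 30)
K = kernelmatrix(SEKernel(), x)  # 30×30 covariance matrix over the inputs

# The smallest eigenvalues sit at (or, after rounding error, just below)
# zero, which is why the Cholesky factorisation fails.
minimum(eigvals(Symmetric(K)))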

The solution is to increase the observation-noise variance of the model slightly, something like:

x = range(0,5,30)
f = GP(SEKernel())
plot(rand(f(x, 1e-12), 10))

You'll find that this happens less with less smooth kernels, such as the Matern32Kernel, because the dependence between nearby inputs decays more sharply: the samples produced by a GP with a Matern-3/2 kernel are "rougher" than those produced by the SEKernel.
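
As a quick illustration (again just a sketch, not something you need in practice), you can compare the conditioning of the two kernel matrices:

using AbstractGPs, LinearAlgebra

x = range(0, 5, 30)
# The SE kernel matrix is far closer to singular than the Matern-3/2 one,
# so its Cholesky factorisation is the one that breaks down.
cond(kernelmatrix(SEKernel(), x)), cond(kernelmatrix(Matern32Kernel(), x))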

Hope this helps!

willtebbutt added the question (Further information is requested) label Jul 3, 2024
@mjyshin
Author

mjyshin commented Jul 3, 2024

Thank you for your answer; this indeed fixed the issue. But if it's an expected numerical problem, couldn't the 1e-12 noise variance be included by default when GP(kernel) is instantiated, and be overridden whenever a second argument is given?

@willtebbutt
Member

So we actually do do this -- see here. The problem is that different kernels and different sets of inputs require different values of this jitter, so if you're training kernel hyperparameters you may need to change the value during training as the kernel changes. What the correct thing to do in general is still an open question, tbh.
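
In the meantime, one workaround (just a sketch; with_jitter and build_model are made-up names, not part of the package) is to pass an explicit noise/jitter value wherever you project the GP onto inputs, so it stays under your control while the kernel hyperparameters change:

using AbstractGPs

# Hypothetical helper: always attach an explicit observation-noise / jitter
# term instead of relying on the package default.
with_jitter(f, x; noise=1e-12) = f(x, noise)

# Hypothetical model constructor whose kernel changes during training.
build_model(θ) = GP(θ.scale * SEKernel())

θ = (scale = 1.0,)
fx = with_jitter(build_model(θ), range(0, 5, 30))
rand(fx, 10)  # sampling succeeds because the jitter is supplied explicitly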
