Skip to content

Conversation

@mlubin
Copy link
Member

@mlubin mlubin commented Nov 19, 2025

Rewrites the hessian sparsity detection code to be aware that, for some nonlinear operations, not all combinations of operands generate hessian terms. For example: x*y (no diagonal terms) and x/y (no (x,x) term).

Fixes #2527
CC @amontoison

@mlubin mlubin marked this pull request as ready for review November 19, 2025 21:29
@mlubin
Copy link
Member Author

mlubin commented Nov 19, 2025

Following #2527 (comment):

using NLPModels, NLPModelsJuMP, JuMP, SparseArrays

nlp = Model()

@variable(nlp,  x[i = 1:10])
@constraint(nlp, sum(x[i] for i = 1:10) / x[1]  == 0)
@objective(nlp, Min, x[1]^4)

nlp2 = MathOptNLPModel(nlp)
rows, cols = hess_structure(nlp2)
nnz_nlp = length(rows)
vals = ones(Int, nnz_nlp)
H = sparse(rows, cols, vals, 10, 10)

Before:

10×10 SparseMatrixCSC{Int64, Int64} with 55 stored entries:
 2  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 1  1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 1  1  1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 1  1  1  1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 1  1  1  1  1  ⋅  ⋅  ⋅  ⋅  ⋅
 1  1  1  1  1  1  ⋅  ⋅  ⋅  ⋅
 1  1  1  1  1  1  1  ⋅  ⋅  ⋅
 1  1  1  1  1  1  1  1  ⋅  ⋅
 1  1  1  1  1  1  1  1  1  ⋅
 1  1  1  1  1  1  1  1  1  1

After:

10×10 SparseMatrixCSC{Int64, Int64} with 19 stored entries:
 2  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 1  1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 1  ⋅  1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 1  ⋅  ⋅  1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 1  ⋅  ⋅  ⋅  1  ⋅  ⋅  ⋅  ⋅  ⋅
 1  ⋅  ⋅  ⋅  ⋅  1  ⋅  ⋅  ⋅  ⋅
 1  ⋅  ⋅  ⋅  ⋅  ⋅  1  ⋅  ⋅  ⋅
 1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  1  ⋅  ⋅
 1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  1  ⋅
 1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  1

@odow
Copy link
Member

odow commented Nov 19, 2025

@mlubin mlubin requested a review from odow November 20, 2025 01:52
@mlubin
Copy link
Member Author

mlubin commented Nov 20, 2025

@gdalle do you have any pointers for how we can rerun the comparisons in adrhill/SparseConnectivityTracer.jl#83? This change should reduce/eliminate discrepancies for the structural non-zeros off the diagonal.

@gdalle
Copy link

gdalle commented Nov 20, 2025

I think everything should be in those two files:

https://github.com/adrhill/SparseConnectivityTracer.jl/blob/main/benchmark/SparseConnectivityTracerBenchmarks/src/nlpmodels.jl

https://github.com/adrhill/SparseConnectivityTracer.jl/blob/main/test/nlpmodels.jl

Ping @adrhill, nice to know SCT is being used as reference for JuMP tests 😉

@mlubin
Copy link
Member Author

mlubin commented Nov 21, 2025

I ran on all the PureJuMP instances in OptimizationModels. Sampled at a random point, all hessians agreed up to numerical tolerances. Only two instances had a different number of nonzeros in the hessian:

  • hs114 reduced from 23 to 21
  • britgas reduced from 1415 to 1111

Based on adrhill/SparseConnectivityTracer.jl#83 (comment) it seems like there might still be room for improvement in a future PR (specific small examples would be useful, I'm unlikely to dig in myself).

@amontoison
Copy link
Contributor

amontoison commented Nov 21, 2025

I ran all the comparisons and here are the results:

┌ Warning: Inconsistencies were detected
┌ Warning: Inconsistency for Jacobian of hs117: SCT (75 nz)  JuMP (62 nz)
┌ Warning: Inconsistency for Jacobian of lincon: SCT (19 nz)  JuMP (17 nz) 
┌ Warning: Inconsistency for Hessian of argauss: SCT (8 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of biggs5: SCT (9 nz)  JuMP (12 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of biggs6: SCT (9 nz)  JuMP (12 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of britgas: SCT (1087 nz)  JuMP (1111 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of chain: SCT (75 nz)  JuMP (100 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of channel: SCT (696 nz)  JuMP (672 nz) 
┌ Warning: Inconsistency for Hessian of dixmaane: SCT (493 nz)  JuMP (297 nz) 
┌ Warning: Inconsistency for Hessian of dixmaani: SCT (493 nz)  JuMP (297 nz) 
┌ Warning: Inconsistency for Hessian of dixmaanm: SCT (493 nz)  JuMP (297 nz) 
┌ Warning: Inconsistency for Hessian of hs114: SCT (19 nz)  JuMP (21 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs119: SCT (256 nz)  JuMP (76 nz) 
┌ Warning: Inconsistency for Hessian of hs250: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs251: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs36: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs37: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs40: SCT (15 nz)  JuMP (16 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs41: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs45: SCT (20 nz)  JuMP (25 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs56: SCT (10 nz)  JuMP (13 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs68: SCT (9 nz)  JuMP (10 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs69: SCT (9 nz)  JuMP (10 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs87: SCT (9 nz)  JuMP (11 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs93: SCT (34 nz)  JuMP (36 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of polygon1: SCT (550 nz)  JuMP (600 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of polygon2: SCT (350 nz)  JuMP (400 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of robotarm: SCT (252 nz)  JuMP (276 nz) [diagonal difference only]

We have a few small problems with less than 20 nnz in the Jacobian / Hessian of the Lagrangian.

@gdalle
Copy link

gdalle commented Nov 21, 2025

@adrhill do you know why there are some cases where the SCT pattern is a superset of the JuMP pattern?

@amontoison
Copy link
Contributor

It is because some ADNLPProblems are badly implemented and we force some structural zeros to be non-zeros: JuliaSmoothOptimizers/OptimizationProblems.jl#397

@mlubin
Copy link
Member Author

mlubin commented Nov 21, 2025

Oh nice, looks like there's no evidence of more to optimize on the JuMP side except the diagonal terms.

@gdalle
Copy link

gdalle commented Nov 21, 2025

@amontoison if you have the code and environment ready, could you re-run the comparison for TracerLocalSparsityDetector() instead of TracerSparsityDetector() on the SCT side? It might help us diagnose the remaining mismatches.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

[Nonlinear] sparsity pattern of Hessian with :(x * y)

5 participants