Description
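I was trying to use cvxpylayers to get derivatives of the solution to the problem below (essentially softmax written as a constrained optimization problem) and ran into a bug.

$$\max_{x} \; v x_1 + b x_2 - \lambda \sum_i x_i \log x_i \quad \text{s.t.} \quad \sum_i x_i = 1$$

where $\lambda$ is the smoothing coefficient (smooth_coeff in the code below; 0.01 in this example).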
I also reported this at cvxgrp/cvxpylayers#145, where it was suggested I post here too, since the bug may be in diffcp.
Here's my best attempt at a MWE that only uses diffcp. I ran the cvxpylayers code in a debugger, pulled out the problem parameters, and hardcoded them (please let me know if I'm using the library wrong!). With v=0.6, b=0.58 and a smoothing coefficient of 0.01, the derivative of x1 with respect to b should be approximately -10.4994, but diffcp returns 0.01412.
import numpy as np
import diffcp
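import scipy.sparse  # import the submodule explicitly; plain `import scipy` does not load it on older SciPy versions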
# program is maximize_x v*x1 + b*x2 - smooth_coeff*sum(x_i log x_i) s.t. sum(x) == 1
# we care about derivative of solution wrt b
# the analytic solution is just x = softmax(v/smooth_coeff, b/smooth_coeff)
a = scipy.sparse.csc_matrix(np.array([[ 1.,  1.,  0.,  0.],
                                      [-1.,  0.,  0.,  0.],
                                      [ 0., -1.,  0.,  0.],
                                      [ 0.,  0., -1.,  0.],
                                      [-1.,  0.,  0.,  0.],
                                      [ 0.,  0.,  0.,  0.],
                                      [ 0.,  0.,  0., -1.],
                                      [ 0., -1.,  0.,  0.],
                                      [ 0.,  0.,  0.,  0.]]))
bb = np.array([1., 0., 0., 0., 0., 1., 0., 0., 1.])
# second coordinate here corresponds to negative of parameter b
# last two coordinates correspond to smoothing param of 0.01
c = np.array([-0.6 , -0.58 , -0.01, -0.01])
cone_dict = {'l': 2, 'q': [], 'ep': 2, 's': [], 'p': [], 'z': 1}
kwargs = {'verbose': False, 'eps_abs': 1e-05, 'eps_rel': 1e-05}
solve_method = 'SCS'
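# pass solve_method explicitly so the variable defined above is actually used
x, y, s, D, DT = diffcp.solve_and_derivative(a, bb, c, cone_dict, solve_method=solve_method, **kwargs)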
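# adjoint derivative: perturb only x1, with zero perturbation in the dual variable y and the slack s
zeros = np.zeros_like(bb)
dx = np.array([1.0, 0.0, 0.0, 0.0])
dA, db, dc = DT(dx, zeros, zeros)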
# derivative wrt parameter b_val should be -dc[1]
print('calculated derivative wrt b parameter from problem (= -dc[1]) is', -dc[1])
print('deriv of softmax(v/smooth,b/smooth) wrt b parameter from problem is approx -10.4994')
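For reference, here is a small check of the expected value that does not touch diffcp at all. It is only a sketch, assuming the closed-form solution x = softmax(v/smooth_coeff, b/smooth_coeff) stated in the comments above; the helper names (softmax, x1_of_b) are mine and not part of any library. It compares the analytic derivative of x1 with respect to b against a central finite difference.

```python
import numpy as np

def softmax(z):
    # subtract the max for numerical stability; the result is unchanged
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def x1_of_b(b, v=0.6, smooth=0.01):
    # first coordinate of the assumed closed-form solution
    return softmax(np.array([v, b]) / smooth)[0]

v, b, smooth = 0.6, 0.58, 0.01
x = softmax(np.array([v, b]) / smooth)

# for a two-element softmax, d x1 / d b = -x1 * x2 / smooth
analytic = -x[0] * x[1] / smooth

# central finite difference as an independent check
h = 1e-6
numeric = (x1_of_b(b + h) - x1_of_b(b - h)) / (2 * h)

print('analytic d x1 / d b:', analytic)
print('finite-difference d x1 / d b:', numeric)
```

Both values should agree with the -10.4994 quoted above, which is what I would expect -dc[1] to match.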