I have reported an equivalent issue on cvxpylayers (cvxgrp/cvxpylayers#135), but since cvxpylayers uses diffcp, I wanted to report it here as well.
I have built an even simpler example that exhibits inaccurate gradient estimation:
Let's suppose we want to maximize $J(x) = x + \lambda (\log(1-x) + \log(1+x))$, where $\lambda \in \mathbb{R}^+$.
Setting $J'(x) = 1 - \frac{2\lambda x}{1-x^2} = 0$ gives $x^2 + 2\lambda x - 1 = 0$, so the solution is $x^* = -\lambda + \sqrt{\lambda^2+1}$,
and $\frac{dx^*}{d\lambda} = -1 + \frac{\lambda}{\sqrt{\lambda^2+1}}$.
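As a quick numerical sanity check of this closed form (not part of the original report; the names `lam`, `x_star`, and `J_prime` below are just illustrative), one can verify the first-order condition and the sensitivity with plain NumPy:

```python
import numpy as np

lam = 1.0

# closed-form maximizer of J(x) = x + lam*(log(1-x) + log(1+x)) and its sensitivity
x_star = lambda l: -l + np.sqrt(l**2 + 1)
dx_dlam = -1 + lam / np.sqrt(lam**2 + 1)

# the first-order condition J'(x) = 1 - 2*lam*x/(1 - x**2) should vanish at x_star
J_prime = 1 - 2 * lam * x_star(lam) / (1 - x_star(lam)**2)
print(J_prime)        # ~0, so x_star is indeed the maximizer

# central finite difference of x_star with respect to lam
h = 1e-6
fd = (x_star(lam + h) - x_star(lam - h)) / (2 * h)
print(dx_dlam, fd)    # both ~ -0.29289 at lam = 1
```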
Now, let's calculate the sensitivity of $x^{*}$ with respect to $\lambda$, at $\lambda=1$ and $\delta\lambda=10^{-6}$, with diffcp:
import numpy as np
import cvxpy as cp
import diffcp.cone_program as cone_prog
import diffcp.utils as utils

# variables
x_ = cp.Variable()

# parameters
_lam = cp.Parameter(1, nonneg=True)
_lam.value = np.ones(1)

# objective
objective = cp.Maximize(x_ + _lam * cp.log(1 + x_) + _lam * cp.log(1 - x_))
problem = cp.Problem(objective)

A, b, c, cone_dims = utils.scs_data_from_cvxpy_problem(problem)
x, y, s, D, DT = cone_prog.solve_and_derivative(A,
                                                b,
                                                c,
                                                cone_dims,
                                                solve_method="ECOS",
                                                mode="dense",
                                                feastol=1e-10,
                                                abstol=1e-10,
                                                reltol=1e-10)

dlam = 1e-6
dA = utils.get_random_like(A, lambda n: np.zeros(n))
db = np.zeros(b.size)
dc = np.zeros(c.size)
dc[1] = dc[1] - dlam  # the minus sign stems from the fact that c = [-1., -1., -1.]
dc[2] = dc[2] - dlam  # the minus sign stems from the fact that c = [-1., -1., -1.]

dx, dy, ds = D(dA, db, dc)

x_pert, y_pert, s_pert, _, _ = cone_prog.solve_and_derivative(A + dA,
                                                              b + db,
                                                              c + dc,
                                                              cone_dims,
                                                              solve_method="ECOS",
                                                              mode="dense",
                                                              feastol=1e-10,
                                                              abstol=1e-10,
                                                              reltol=1e-10)

print(f"x_pert-x = {x_pert - x}")
print(f"      dx = {dx}")

analytical = -1 + _lam.value / np.sqrt(_lam.value**2 + 1)
print(f"analytical = {analytical * dlam}")
Here we see that the finite-difference approximation of $\frac{dx^*}{d\lambda}\,\delta\lambda$ (-2.930e-7) is much closer to the analytical value (-2.923e-7) than the automatic-differentiation result returned by diffcp (-2.612e-7), which is very surprising, especially for such a simple (unconstrained) problem.
Do you have an idea why AD performs so poorly on this example?