Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/issue 2401 alg solver adjoint #2421

Merged

Conversation

jgaeb
Copy link
Contributor

@jgaeb jgaeb commented Mar 12, 2021

Summary

Fixes Issue #2401. (See the discourse thread here.)

This pull request changes the implementation of auto diff in the Powell and Newton solvers to more efficiently compute cotangents by replacing matrix inversion with a smaller number of matrix solves.

(Parallel changes were made for the fixed point solver, but those will be put into a different PR.)

The new solution method was benchmarked against the old solution method across a variety of problem sizes. (See this comment.)

Tests

Additional tests have been added for the variadic interfaces, and make_unsafe_chainable_ptr().

Side Effects

In the course of making those changes, variadic interfaces (algebra_solver_powell_impl and algebra_solver_newton_impl) were added for both solvers.

Release notes

Updated Powell and Newton solvers to use an adjoint method to propagate derivatives in reverse mode. Should result in modest speed-up. Added variadic interfaces (algebra_solver_powell_impl and algebra_solver_newton_impl).

Checklist

  • Math issue Algebraic Solver Differentiation Speedup #2401

  • Copyright holder: Johann D. Gaebler

    The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to the license the submitted work under the following licenses:
    - Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
    - Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass, (make test-headers)
    • dependencies checks pass, (make test-math-dependencies)
    • docs build, (make doxygen)
    • code passes the built in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

Copy link
Member

@bbbales2 bbbales2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a quick scan and left a few comments. I'll need to look closer to see if there's any more room for cleanups/whatever. I think 1st just getting the algo running without segfaults and try a few benchmarks before we worry too much about that.

stan/math/rev/functor/algebra_solver_powell.hpp Outdated Show resolved Hide resolved
stan/math/rev/functor/algebra_solver_powell.hpp Outdated Show resolved Hide resolved
stan/math/rev/functor/algebra_solver_powell.hpp Outdated Show resolved Hide resolved
stan/math/rev/functor/algebra_solver_newton.hpp Outdated Show resolved Hide resolved
stan/math/rev/functor/algebra_solver_powell.hpp Outdated Show resolved Hide resolved
stan/math/rev/functor/algebra_solver_powell.hpp Outdated Show resolved Hide resolved
stan/math/rev/functor/algebra_solver_powell.hpp Outdated Show resolved Hide resolved
stan/math/rev/functor/algebra_solver_powell.hpp Outdated Show resolved Hide resolved
@bbbales2
Copy link
Member

@jgaeb yoyo, I was looking at the algebra_solver to retool some of the internals for #2384

Basically the change that needs to happen is make this support variadic arguments in the same way that reduce_sum and the ode solvers currently do.

I don't think we'll expose this to the Stan language initially -- just add all the internals.

In terms of the algebra solver this would mean that the parameters can be a mix of anything (the solution would still be a vector).

Since the internal solvers really only need gradients with respect to the solutions vector, I think this won't be too complicated (we don't need to wire all the variadic stuff into Kinsol).

Since you are working directly on this stuff right now, want to meet and pair program this at some point? I don't want to refactor algebra_solver stuff and leave you out of the loop :D.

@jgaeb
Copy link
Contributor Author

jgaeb commented Mar 18, 2021

Hi @bbbales2 — that would be great! I'll have a lot of free time next week, if that would work for you.

(Also, this week is finals week, so I've been a little overwhelmed, but I'm hoping to get to all of your and @SteveBronder's comments next week / starting on Friday!)

@bbbales2
Copy link
Member

Next week is good. I'm free anytime other than Monday at 3PM and Thursday at 12PM (all eastern time zone). Probly plan for like a 2 hour block? We can probably hammer out quite a bit of stuff in that amount of time.

Don't worry too much about fixing things here it they don't make sense. I can explain those things in person probably pretty easily.

@jgaeb
Copy link
Contributor Author

jgaeb commented Mar 20, 2021

Cool! I sent you calendar invites for a couple of times to meet. Feel free just to cancel whichever one is less convenient. If neither of those is convenient for whatever reason, just let me know and I can try and suggest some other blocks that hopefully work a little better.

@jgaeb
Copy link
Contributor Author

jgaeb commented Mar 23, 2021

I tried to make most of the changes @bbbales2 and @SteveBronder asked for. A couple I didn't yet just because I'm still a little confused, but hopefully once I get a chance to talk with @bbbales2 tomorrow, I'll have a better sense of what's going on and be able to make those changes.

Previously some of the tests weren't passing because I was passing y_val instead of y_eval to the algebra_solver_vari. I guess this results in the varis actually being copied or something like that? I tried to read through the documentation to try to understand the difference, but I'm still having a little trouble wrapping my head around what y_val (versus y_eval) is supposed to be.

Now, the only tests that aren't passing are the degenerate_eq_tests. I'm guessing the issue is just that for the degenerate_eq_tests, Jf_x_ is ill-conditioned and so the particular matrix solve method I'm using is a little unstable. I'll take a closer look to try to understand exactly what the issue is and switch to a different method that's more stable but hopefully not too much slower.

(Also, I didn't realize that the code was being automatically reformatted, so I didn't pull before making these changes. I've added a couple commits to deal with that. When we actually get to the point of merging this pull request, I'm assuming I can just clean up the git history? Or should I plan on something else? Please let me know if I should be, e.g., making these changes in a scratch branch and then only merging and pushing periodically.)

@bbbales2
Copy link
Member

I'm assuming I can just clean up the git history?

Don't worry about it. Our git history is super messy. The autoformatter does its thing and then you gotta remember to pull or you'll have to do a local merge. The only big downside is the annoyance of the merge.

@bbbales2
Copy link
Member

@jgaeb want to do another call sometime this week and finish off the variadic stuff?

@jgaeb
Copy link
Contributor Author

jgaeb commented Aug 11, 2021

@SteveBronder — it looks like the expression tests are still failing for the algebra solver, with the following error:

test/expressions/tests6_test.cpp:22843: Failure
Expected: (matrix2_expr9_counter) <= (int21), actual: 4 vs 1

I'll see if I can figure out what's going on locally, but I won't push any changes so you can see the error for yourself.

@jgaeb
Copy link
Contributor Author

jgaeb commented Aug 11, 2021

Ok, I think I tracked down the bug that was causing the expression tests to fail! I've got my fingers crossed that it builds this time 🤞

@jgaeb
Copy link
Contributor Author

jgaeb commented Aug 12, 2021

@SteveBronder So, the expression tests ran fine, but there was a problem with the stan integration tests. In particular, two of the models involving the algebra solver failed to compile because of the following error:

non-const lvalue reference to type 'Eigen::HybridNonLinearSolver<stan::math::hybrj_functor_solver<const (lambda at /mnt/nvme1n1/workspace/Stan_downstream_tests/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/rev/functor/algebra_solver_powell.hpp:138:18) &>, double>::FVectorType' (aka 'Matrix<double, Dynamic, 1>') cannot bind to a value of unrelated type 'Eigen::Map<Eigen::Matrix<double, -1, 1, 0, -1, 1>, 0, Eigen::Stride<0, 0> >'
  solver.solve(x);

I think the way to solve this is to ensure that the x getting passed to the solver is the right plain type, so I switched to getting the x values in the body of the solver (edit: Powell solver) to

plain_type_t<decltype(to_ref(value_of(x)))> x_ref = to_ref(value_of(x));

from

auto& x_ref = to_ref(value_of(x));

It still seems to compile OK on my end, but the typename is pretty nasty, so I'm guessing there's a better way to do it. I just wanted to check that this is the idiomatic way of converting something (here the Eigen::Map<...>) to its corresponding plain type.

@SteveBronder
Copy link
Collaborator

Hey sorry for the delay was finishing up a few little things, I will look at this tmrw morning!

@SteveBronder
Copy link
Collaborator

I think you might just need

auto&& x_ref = to_ref(value_of(x));

or just a

const auto& x_ref = to_ref(value_of(x));

if your not modifying x

I know that looks really odd but the below does a good job of explaining it

https://www.fluentcpp.com/2021/04/02/what-auto-means/

@jgaeb
Copy link
Contributor Author

jgaeb commented Aug 12, 2021

@SteveBronder — Unfortunately, the integration tests are still failing for sort of a weird reason. The algebraic functor generated in stan/src/test/test-models/good/algebra_solver_good.stan has the following signature:

struct algebra_system_functor__ {
template <typename T0__, typename T1__, typename T2__>
Eigen::Matrix<stan::promote_args_t<stan::value_type_t<T0__>, stan::value_type_t<T1__>,
T2__>, -1, 1>
operator()(const T0__& x, const T1__& y, const std::vector<T2__>& dat,
           const std::vector<int>& dat_int, std::ostream* pstream__)  const 
{

This gets passed around as f inside the body of the variadic algebraic solvers, where we take all the arguments that are supposed to be passed onto f and store them in the arena so that they can be called on the reverse pass. The problem is that when we do something like

  auto f_wrt_x = [&args_vals_tuple, &f, msgs](const auto& x) {
    return apply(
        [&x, &f, msgs](const auto&... args) { return f(x, msgs, args...); },
        args_vals_tuple);
  };

all the args in args_vals_tuple are the wrong type (e.g., vectors have the arena_allocator instead of the standard allocator) and so substitution fails. (Edit: Specifically, I think dat and dat_int are the problem, since the function signature expects the standard allocator.)

Do you have any thoughts about the right way to deal with this? There must be a simpler way to deal with it than changing how the arguments are templated in the Stan-generated C++...

(On a separate note, using auto&& instead of the plain type unfortunately doesn't seem to solve the error I was posting about last night.)

…ainable pointer for the arguments passed to the user defined function (UDF).
@SteveBronder
Copy link
Collaborator

@jgaeb I think I sorted it out. We had to hard copy x before passing it to the function that accepted it by reference (and not constant reference). imo I think that's fine since this an internal function. For the tuple of arena types I used make_chainable_ptr() on the arena_args_tuple and did not call to_arena() on each individual element. What should happen there is that the memory arena during cleanup will call the deleter for the tuple, which should then call the deleter for all of the elements in the tuple. I can/should/will make a PR in stanc3 to let std::vector types take in an allocator template parameter but imo this is a good solution for now. Let's see if the tests pass and if not then I'll tinker with this more tmrw

@jgaeb
Copy link
Contributor Author

jgaeb commented Aug 13, 2021

Something weird seems to be happening with Jenkins... the tests have run and then aborted in the middle like ten times since I last checked in yesterday. Any idea what's going on?

@SteveBronder
Copy link
Collaborator

Not sure, I tried restarting them but yeah something weird seems to be happening on jenkins. Let me restart it one more time to see what's going on

@stan-buildbot
Copy link
Contributor


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 3.06 2.93 1.04 4.27% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.97 -2.68% slower
eight_schools/eight_schools.stan 0.11 0.11 1.01 1.19% faster
gp_regr/gp_regr.stan 0.16 0.16 0.99 -1.02% slower
irt_2pl/irt_2pl.stan 5.89 5.82 1.01 1.11% faster
performance.compilation 88.32 87.39 1.01 1.06% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 8.53 8.63 0.99 -1.19% slower
pkpd/one_comp_mm_elim_abs.stan 30.87 30.66 1.01 0.66% faster
sir/sir.stan 129.5 128.45 1.01 0.81% faster
gp_regr/gen_gp_data.stan 0.04 0.04 1.01 0.64% faster
low_dim_gauss_mix/low_dim_gauss_mix.stan 2.97 2.99 0.99 -0.72% slower
pkpd/sim_one_comp_mm_elim_abs.stan 0.4 0.38 1.04 3.93% faster
arK/arK.stan 1.87 1.86 1.01 0.66% faster
arma/arma.stan 0.82 0.82 1.0 0.16% faster
garch/garch.stan 0.53 0.53 0.99 -1.14% slower
Mean result: 1.00550560724

Jenkins Console Log
Blue Ocean
Commit hash: 09ca12d


Machine information ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@stan-buildbot
Copy link
Contributor


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 2.96 3.03 0.98 -2.3% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 1.02 1.66% faster
eight_schools/eight_schools.stan 0.1 0.11 0.98 -2.33% slower
gp_regr/gp_regr.stan 0.16 0.16 0.99 -1.22% slower
irt_2pl/irt_2pl.stan 5.87 5.87 1.0 0.07% faster
performance.compilation 88.4 87.4 1.01 1.13% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 8.59 8.6 1.0 -0.05% slower
pkpd/one_comp_mm_elim_abs.stan 30.84 30.2 1.02 2.09% faster
sir/sir.stan 129.2 126.59 1.02 2.02% faster
gp_regr/gen_gp_data.stan 0.03 0.04 0.97 -2.7% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 2.98 2.96 1.01 0.59% faster
pkpd/sim_one_comp_mm_elim_abs.stan 0.38 0.39 0.99 -0.52% slower
arK/arK.stan 1.88 2.54 0.74 -35.39% slower
arma/arma.stan 0.83 0.91 0.91 -10.31% slower
garch/garch.stan 0.53 0.6 0.88 -14.23% slower
Mean result: 0.96722915298

Jenkins Console Log
Blue Ocean
Commit hash: 09ca12d


Machine information ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@jgaeb
Copy link
Contributor Author

jgaeb commented Aug 14, 2021

@SteveBronder—Holy cow! It built!

@SteveBronder
Copy link
Collaborator

Alright! Lemme give this one more lookover tmrw then I think we'll be good!

Copy link
Collaborator

@SteveBronder SteveBronder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alrighty lgtm!!!

@jgaeb
Copy link
Contributor Author

jgaeb commented Aug 19, 2021

Awesome! Thanks for sticking with this, @SteveBronder, @bbbales2, and @charlesm93 (along with everybody else who helped out along the way). I really appreciate all the work you put in helping me through this, and realize it would have been much faster for somebody else to implement. I learned a ton diving into this, and am looking forward to continuing to contribute (hopefully in ways that require less of your time!) in the future!

@rok-cesnovar
Copy link
Member

@jgaeb thanks and congrats on your first merged PR to the Stan Math repository! 🎉 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants