Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/issue 202 vectorize all #276

Merged
merged 36 commits into from
Jul 21, 2016
Merged

Conversation

rayleigh
Copy link
Contributor

@rayleigh rayleigh commented Apr 1, 2016

Submisison Checklist

  • Run unit tests: ./runTests.py test/unit
  • Run cpplint: make cpplint
  • Declare copyright holder and open-source license: see below

Summary:

Vectorization of unary functions and tests of these vectorizations

Intended Effect:

This will add vectorized unary functions and their tests.

How to Verify:

Run unit tests

Side Effects:

Updated the existing testing vectorizing framework to skip tests if the expected value and test value are both NaN. The example function foo_fun now returns NaN for 0.

Added the appropriate apply_scalar_unary function to stan/math/prim/mat.hpp, stan/math/rev/mat.hpp, and stan/math/fwd/mat.hpp. All tests in stan/math/mix/scal/fun and test/unit/math/mix/scal/prob/normal_test.cpp were updated to include stan/math/mix/mat.hpp so that they wouldn't error out.

Documentation:

I have documented the code of the vectorized functions, but I don't think it's fully user-facing yet.

Reviewer Suggestions:

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company):
Rayleigh

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

@bob-carpenter
Copy link
Contributor

That submissions checklist is supposed to be the pre-submission checklist!

@bob-carpenter
Copy link
Contributor

And it needs the model tests. The tests aren't even complete yet. Maybe you want to remove this pull request until it's complete.

@rayleigh rayleigh closed this Apr 1, 2016
@syclik
Copy link
Member

syclik commented Apr 1, 2016

The model tests will need to be in the Stan library.

(Bob, this is related to how we deal with a new feature across repos.)

On Fri, Apr 1, 2016 at 12:59 PM, Bob Carpenter [email protected]
wrote:

And it needs the model tests. The tests aren't even complete yet. Maybe
you want to remove this pull request until it's complete.


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub
#276 (comment)

@rayleigh
Copy link
Contributor Author

rayleigh commented Apr 1, 2016

Okay. I'm closing the pull request and reopen it when it has everything.

@rayleigh rayleigh reopened this Apr 4, 2016
@rayleigh
Copy link
Contributor Author

rayleigh commented Apr 4, 2016

I ran all unit tests and make cpplint. There's one build/include error from the make cpplint report, but when I looked through an old thread on the Stan-dev discussion group, it sounded like Jenkins could help me better understand the report.

@bob-carpenter
Copy link
Contributor

Jenkins won't help you with cpplint (unless it shows up in
a different compiler).

What is the error? If it's the standard libraries at the
end one, just move the built-in library includes to the end.

  • Bob

On Apr 3, 2016, at 10:15 PM, rayleigh [email protected] wrote:

I ran all unit tests and make cpplint. There's one build/include error from the make cpplint report, but when I looked through an old thread on the Stan-dev discussion group, it sounded like Jenkins could help me better understand the report.


You are receiving this because you commented.
Reply to this email directly or view it on GitHub

@rayleigh
Copy link
Contributor Author

rayleigh commented Apr 4, 2016

Jenkins actually helped. I accidentally included stan/math/prim/mat/vectorize/apply_scalar_unary.hpp twice in log1p_exp.hpp instead of including it and the non-vectorized version.

@syclik
Copy link
Member

syclik commented Apr 4, 2016

It looks like the current problem is with a compiler error:
http://d1m1s1b1.stat.columbia.edu:8080/job/Math%20Pull%20Request%20-%20Tests%20-%20Header/257/warnings18Result/new/

You need to include cmath in order for that to be fixed.

On Sun, Apr 3, 2016 at 11:39 PM, rayleigh [email protected] wrote:

Jenkins actually helped. I accidentally included
stan/math/prim/mat/vectorize/apply_scalar_unary.hpp twice in log1p_exp.hpp
instead of including it and the non-vectorized version.


You are receiving this because you commented.
Reply to this email directly or view it on GitHub
#276 (comment)

@bob-carpenter
Copy link
Contributor

Is this ready to merge or if not, what's left to test?

@rayleigh
Copy link
Contributor Author

rayleigh commented Jun 5, 2016

I made the changes that you suggested. I want to note that I moved the function cgrad from the rev arr version of util.hpp to the rev scal version because test/unit/math/mix/scal/fun/log_mix_test.cpp uses cgrad.


/**
* This is the structure for testing mock function acos (defined in the
* testing framework). See README.txt for more instructions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rayleigh we need to remove README.txt, along with all the references to the README

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I got a chance to work on it today and made those changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does anyone still need me to code review this again?
I'm traveling and pretty booked this week.

  • Bob

On Jun 27, 2016, at 8:24 AM, rayleigh [email protected] wrote:

In test/unit/math/mix/mat/fun/acos_test.cpp:

@@ -0,0 +1,97 @@
+#include <stan/math/mix/mat.hpp>
+#include <gtest/gtest.h>
+#include <test/unit/math/prim/mat/vectorize/prim_scalar_unary_test.hpp>
+#include <test/unit/math/rev/mat/vectorize/rev_scalar_unary_test.hpp>
+#include <test/unit/math/fwd/mat/vectorize/fwd_scalar_unary_test.hpp>
+#include <test/unit/math/mix/mat/vectorize/mix_scalar_unary_test.hpp>
+#include <stan/math/prim/mat/fun/acos.hpp>
+#include <test/unit/math/prim/mat/vectorize/vector_builder.hpp>
+
+/**

  • * This is the structure for testing mock function acos (defined in the
  • * testing framework). See README.txt for more instructions.

Okay. I got a chance to work on it today and made those changes.


You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or mute the thread.

@rayleigh
Copy link
Contributor Author

rayleigh commented Jul 5, 2016

I know that I haven't been pushing this, but is there anything else that I need to do? I know that there are a few future issues, but I've changed the text, updated the rev arr util.hpp, and marked tests where both expected and test values are NaN as success.

@syclik
Copy link
Member

syclik commented Jul 5, 2016

One thing I haven't seen yet is how this performs for the univariate functions pre and post. Unless I've missed it. If so, can you repost? I'm actually concerned about that.

On Jul 5, 2016, at 6:51 PM, rayleigh [email protected] wrote:

I know that I haven't been pushing this, but is there anything else that I need to do? I know that there are a few future issues, but I've changed the text, updated the rev arr util.hpp, and marked tests where both expected and test values are NaN as success.


You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or mute the thread.

@rayleigh
Copy link
Contributor Author

rayleigh commented Jul 5, 2016

Daniel, what part of the univariate function performance are you concerned about?

@syclik
Copy link
Member

syclik commented Jul 5, 2016

Wall time. I just want to make sure we're not going to take a significant performance hit for the existing univariate functions.

On Jul 5, 2016, at 7:31 PM, rayleigh [email protected] wrote:

Daniel, what part of the univariate function performance are you concerned about?


You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or mute the thread.

@rayleigh
Copy link
Contributor Author

rayleigh commented Jul 6, 2016

I see. I didn't measure that because I wasn't aware that I needed to, but looking at foo_test, it takes 447 ms to run the whole tests. 437 ms is from running the mix expect_values test.

However, the vectorize code is essentially putting the univariate function within a for loop. There is no optimization within the vectorize code itself because I believe this project is to make the univariate functions easier to use when writing Stan code.

@bob-carpenter
Copy link
Contributor

Everything should get compiled away through the templates
and it's the autodiff time that's the killer, not the
function computes, but it can't hurt to test.

Do I need to review more code at this point?

  • Bob

On Jul 5, 2016, at 4:54 PM, Daniel Lee [email protected] wrote:

Wall time. I just want to make sure we're not going to take a significant performance hit for the existing univariate functions.

On Jul 5, 2016, at 7:31 PM, rayleigh [email protected] wrote:

Daniel, what part of the univariate function performance are you concerned about?


You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or mute the thread.


You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or mute the thread.

@rayleigh
Copy link
Contributor Author

rayleigh commented Jul 6, 2016

If test/unit/math/rev/scal/fun/util.hpp, test/unit/math/rev/arr/fun/util.hpp, and test/unit/math/prim/mat/vectorize/expect_val_eq.hpp have been reviewed, then there really haven't been any other code changes since I've pushed all the easy to vectorize univariate functions up.

If we want to test performance, what scenarios should I test these functions in? For instance, if my function is foo, would I need to run the following tests before and after vectorization:

foo(int x)
foo(double x)
foo(rev x)
foo(fwd x)
foo(mix x)

@bob-carpenter
Copy link
Contributor

If you want to be thorough, look at log() applied to

T in {var, double} for

  • T (scalar)
  • std::vector (array)
  • Eigen::Matrix<T, Dynamic, 1> (vector)
  • Eigen::Matrix<T, Dynamic, Dynamic> (matrix)

We didn't have any higher-order vectorization than that and
have never run speed tests on forward-mode.

It doesn't need to be elaborate or checked in.

Also, did all the non-C++ files get removed?

  • Bob

On Jul 5, 2016, at 5:35 PM, rayleigh [email protected] wrote:

If test/unit/math/rev/scal/fun/util.hpp, test/unit/math/rev/arr/fun/util.hpp, and test/unit/math/prim/mat/vectorize/expect_val_eq.hpp have been reviewed, then there really haven't been any other code changes since I've pushed all the easy to vectorize univariate functions up.

If we want to test performance, what scenarios should I test these functions in? For instance, if my function is foo, would I need to run the following tests before and after vectorization:

foo(int x)
foo(double x)
foo(rev x)
foo(fwd x)
foo(mix x)


You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or mute the thread.

@rayleigh
Copy link
Contributor Author

rayleigh commented Jul 6, 2016

I believe that all the non-C++ files got removed.

As for speed tests, I didn't suggest any vectorizations because for many of the univariate functions, there wasn't a vectorized version. So, what would I use to get the pre-vectorization results?

@syclik
Copy link
Member

syclik commented Jul 6, 2016

I just want to make sure the univariate functions for double and var are
the about the same speed as without these changes. I know that everything
should be compiled away, but I'll believe it when I see it.

I'd start simple:

  • time one function a whole bunch of times before and after
  • time a couple of Stan programs that use one of the functions before and
    after. Set seeds to known values and make sure everything is identical.

If those show no increase in speed, I'm happy to believe that the compilers
actually did what they're supposed to. If there's more than a 2% time
increase, I'd need to sit and think about it.

On Tue, Jul 5, 2016 at 9:10 PM, rayleigh [email protected] wrote:

I believe that all the non-C++ files got removed.

As for speed tests, I didn't suggest any vectorizations because for many
of the univariate functions, there wasn't a vectorized version. So, what
would I use to get the pre-vectorization results?


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#276 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AAZ_F14puqos3_4lVMABt7KXiFayUJpRks5qSwCigaJpZM4H92K0
.

@bob-carpenter
Copy link
Contributor

On Jul 5, 2016, at 5:08 PM, rayleigh [email protected] wrote:

I see. I didn't measure that because I wasn't aware that I needed to, but looking at foo_test, it takes 447 ms to run the whole tests. 437 ms is from running the mix expect_values test.

However, the vectorize code is essentially putting the univariate function within a for loop. There is no optimization within the vectorize code itself because I believe this project is to make the univariate functions easier to use when writing Stan code.

Daniel just wants to make sure that things aren't
slower now than they used to be. They weren't previously
optimized, either.

So just run a Stan program in the develop branch and
compare to your branch (CmdStan is easiest for that if
you check out the Stan branches as submodules) for each
of the previously vectorized functions (not many).

  • Bob

P.S. I'm off to Ann Arbor next Thursday! Side trip from
visiting my parents.

@bob-carpenter
Copy link
Contributor

Yes, I was suggesting just doing the log() function.
I hope I sent it! const references to following
argument types at some scale (say 100 or 1000
elements in the list).

T
vector
Matrix<T, -1, 1>
Matrix<T, 1, -1>
matrix<T, -1, -1>

for T in {double, var}

  • Bob

On Jul 5, 2016, at 9:49 PM, Daniel Lee [email protected] wrote:

I just want to make sure the univariate functions for double and var are
the about the same speed as without these changes. I know that everything
should be compiled away, but I'll believe it when I see it.

I'd start simple:

  • time one function a whole bunch of times before and after
  • time a couple of Stan programs that use one of the functions before and
    after. Set seeds to known values and make sure everything is identical.

If those show no increase in speed, I'm happy to believe that the compilers
actually did what they're supposed to. If there's more than a 2% time
increase, I'd need to sit and think about it.

On Tue, Jul 5, 2016 at 9:10 PM, rayleigh [email protected] wrote:

I believe that all the non-C++ files got removed.

As for speed tests, I didn't suggest any vectorizations because for many
of the univariate functions, there wasn't a vectorized version. So, what
would I use to get the pre-vectorization results?


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#276 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AAZ_F14puqos3_4lVMABt7KXiFayUJpRks5qSwCigaJpZM4H92K0
.


You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or mute the thread.

@syclik
Copy link
Member

syclik commented Jul 19, 2016

I think we're good. @rayleigh timed a couple functions and it looked like there were no surprises (which is great!).

@syclik syclik added this to the v2.10++ milestone Jul 19, 2016
@syclik
Copy link
Member

syclik commented Jul 19, 2016

Jenkins, retest this please.

@syclik syclik merged commit aa1730d into develop Jul 21, 2016
@syclik syclik deleted the feature/issue-202-vectorize-all branch July 21, 2016 02:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants