WIP Port Gamma to CUDA #102
base: master
Conversation
CPU_tensor_apply2<scalar, double>(ret, alpha,
  [generator](scalar& ret_val, const double& alpha){
    auto sample = sample_gamma(alpha, generator);
    ret_val = sample > 0 ? sample : FLT_MIN;
Replace FLT_MIN with std::numeric_limits<scalar>::min().
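The point of the review comment can be illustrated numerically. This is a sketch (not code from the PR) showing why clamping a double-valued sample to `FLT_MIN`, the smallest normal *float*, is the wrong floor: a double can represent vastly smaller positive values, so `std::numeric_limits<scalar>::min()` is the type-appropriate choice.

```python
import numpy as np

# FLT_MIN is the smallest normal float32; a double-valued sampler can
# legitimately produce far smaller positive numbers. Clamping doubles to
# FLT_MIN inflates a huge range of valid small samples.
flt_min = np.finfo(np.float32).tiny   # ~1.18e-38, what FLT_MIN would give
dbl_min = np.finfo(np.float64).tiny   # ~2.23e-308, the double's own floor

sample = 1e-100                       # a tiny but perfectly valid double sample
clamped_with_flt_min = max(sample, flt_min)   # wrongly inflated to ~1.18e-38
clamped_with_dbl_min = max(sample, dbl_min)   # preserved as 1e-100
```

With `scalar = double`, `std::numeric_limits<scalar>::min()` picks the second behavior automatically; with `scalar = float` it degenerates back to `FLT_MIN`, which is exactly what templating on the scalar type is for.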
test/test_distributions.py
if multivariate:
    # Project onto a random axis.
    axis = np.random.normal(size=torch_samples.shape[-1])
    axis /= np.linalg.norm(axis)
    torch_samples = np.dot(torch_samples, axis)
    ref_samples = np.dot(ref_samples, axis)
samples = [(x, +1) for x in torch_samples] + [(x, -1) for x in ref_samples]
shuffle(samples)  # necessary to prevent stable sort from making uneven bins for discrete
samples.sort(key=lambda x: x[0])
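The idea behind this test can be sketched as a small self-contained check (the function name and the bin-balance criterion here are illustrative, not the test suite's actual helpers): project both multivariate sample sets onto a random axis, merge them with ±1 labels, sort, and verify that each bin of the merged order is roughly balanced between the two sources.

```python
import numpy as np
from random import shuffle

def projected_two_sample_check(torch_samples, ref_samples, num_bins=10, rtol=0.25):
    """Sketch of the review's idea: if both sets come from the same
    distribution, each bin of the merged sorted order should contain
    roughly equal counts of +1 (torch) and -1 (reference) labels."""
    axis = np.random.normal(size=torch_samples.shape[-1])
    axis /= np.linalg.norm(axis)          # project onto a random unit axis
    a = np.dot(torch_samples, axis)
    b = np.dot(ref_samples, axis)
    samples = [(x, +1) for x in a] + [(x, -1) for x in b]
    shuffle(samples)                      # break ties randomly before the stable sort
    samples.sort(key=lambda s: s[0])
    labels = np.array([label for _, label in samples])
    bins = np.array_split(labels, num_bins)
    # Each bin's label sum should be near zero relative to the bin size.
    return all(abs(chunk.sum()) <= rtol * len(chunk) for chunk in bins)
```

As the review notes, this only makes sense for continuous distributions; with discrete values the stable sort would otherwise group tied samples by label, which is why the shuffle precedes the sort.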
Oh, this wasn't intended to work for discrete distributions. I see you're implementing a different test for those.
@rachtsingh Where are you seeing NaNs?
When I run the tests, the Dirichlet tests fail because of bad samples. I'll post a log in a few hours (I'm away from my computer). Thanks for the numeric comment, that's exactly what I was looking for!
Could you be more specific and paste the Dirichlet test failure output? (There are like 10 Dirichlet tests 😉)
Yes, will do asap, sorry!
Sorry for the long delay. Here's the output (after fixing
Based on the error message, it looks like it's sampling
Ah, I figured this out. It's a casting issue - will upload the fix in a second.
Ok, yep, it's fixed. I will make the real PR to port this to CUDA (really, just a few lines of changes now) after the CUDA RNG changes are merged.
Ok, CUDA changes are in this branch now - I'm waiting on review for pytorch#4556 and then I can turn this into a PR. Unfortunately it looks like CUDA samplers will be lower accuracy in general (because of the Tesla / double thing).
Have you checked any Q-Q plots or probability plots for the single-precision Gamma sampler? I'm curious how small alpha can be before samples start being clamped to zero.
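The clamping concern can be estimated numerically. This is a rough sketch (not part of the PR): draw double-precision Gamma samples and cast them to float32 as a stand-in for a single-precision sampler, then measure what fraction underflows to zero. The helper name is illustrative.

```python
import numpy as np

def clamped_fraction(alpha, n=200_000, seed=0):
    """Fraction of Gamma(alpha, 1) samples that underflow to zero when the
    double-precision draw is cast to float32 - a rough stand-in for running
    the sampler in single precision (the real CUDA sampler computes in
    float throughout, so it clamps at least this often)."""
    rng = np.random.default_rng(seed)
    x = rng.gamma(alpha, size=n)              # float64 samples
    return np.mean(x.astype(np.float32) == 0.0)
```

For very small shape parameters the effect is dramatic: at alpha ≈ 0.01 roughly a third of the mass sits below the smallest positive float32, while by alpha ≈ 0.5 underflow is essentially never observed. That gives a concrete answer to "how small can alpha be": the single-precision sampler degrades well before alpha reaches the values Dirichlet tests commonly exercise.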
aten/src/TH/THRandom.c
        return scale * d * v;
    }
}
/* double THRandom_standard_gamma(THGenerator *_generator, double alpha) { */
Why is this commented out? Have you moved the CPU implementation?
Yes, it's been moved to https://github.com/probtorch/pytorch/pull/102/files#diff-6f5adabe13d89ad314ae10947a7f524aR250 - @apaszke brought up code duplication as an issue, so I'm combining the implementations here (this is one attempt that at least works; I'm very open to cleaner implementation ideas).
Basically, by specifying precision_t you can control the accuracy of the implementation used. For GPUs that's float, and for CPU it's double right now.
I haven't checked the Q-Q plots for this sampler (I checked mine a few months ago), but I've found that they can be hard to interpret. I suspect that there's a good opportunity here for someone to improve accuracy, but I'm not sure how.
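The precision_t idea discussed above - one sampler body whose working precision is a parameter - can be sketched in Python by parameterizing a Marsaglia-Tsang standard-gamma sampler on a numpy dtype. This is an illustration, not the PR's code: `np.float32` plays the role of the GPU path and `np.float64` the CPU path, and the alpha < 1 boosting step is omitted for brevity.

```python
import numpy as np

def standard_gamma(alpha, size, dtype=np.float64, seed=0):
    """Marsaglia-Tsang sampler with working precision as a parameter,
    mirroring the precision_t template idea. Assumes alpha >= 1."""
    rng = np.random.default_rng(seed)
    d = dtype(alpha - 1.0 / 3.0)
    c = dtype(1.0) / np.sqrt(dtype(9.0) * d)
    out = np.empty(size, dtype=dtype)
    i = 0
    while i < size:
        x = dtype(rng.standard_normal())
        v = (dtype(1.0) + c * x) ** 3
        if v <= 0:
            continue                      # reject: v must stay positive
        u = dtype(rng.random())
        # Acceptance test from Marsaglia & Tsang's method.
        if np.log(u) < dtype(0.5) * x * x + d - d * v + d * np.log(v):
            out[i] = d * v
            i += 1
    return out
```

Every intermediate stays in the requested dtype, so the same code path yields a double-accuracy CPU sampler or a float-accuracy GPU-like sampler, which is the trade-off behind the "lower accuracy in general" remark above.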
(force-pushed from 4aff41b to 2128818)
@fritzo should I move this to pytorch? Or did you want to take another pass on it? Also, there's some submodule gunk on this commit, but I'll clean it before upstreaming.
Looks great, thanks for implementing this!
Ready to move upstream.
(force-pushed from 63c1b4c to dbe16f8)
Additionally:
- add support for calling functions that are not methods in the Python frontend
- add an end-to-end test for the Python frontend
- add a capture_stdout helper for checking that `print` actually works
Signed-off-by: Edward Z. Yang <[email protected]>
I know this works because I had to squelch a bunch of ASAN errors in multiprocessing.
Signed-off-by: Edward Z. Yang <[email protected]>
PR is for discussion only
cc @rachtsingh