Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Addition of CuPy as an Accelerated Computing Option #499

Merged
merged 63 commits into from
Mar 24, 2023

Conversation

kian1377
Copy link
Collaborator

This is a bit more of an extensive PR as it seeks to add CuPy as an accelerated math option for many computations including FFTs, MFTs, and exponentials for propagation with the OpticalSystem or FresnelOpticalSystem classes. To start with, CuPy is a package for GPU accelerated computing. While POPPY has GPU computing options available with PyOpenCL and with pyculib (which has been deprecated by Numba in favor of CuPy), this implementation seeks to perform all calculations on the GPU until the end of propagation. This reduces time for calculations as arrays no longer need to be transferred between GPU memory and standard memory when performing different calculations.

CuPy has been designed to be very similar to Numpy and has many of the same features requiring the same syntax to use. An example is numpy.fft.fft2, which with CuPy is cupy.fft.fft2. The catch is that CuPy functions can not be used on Numpy arrays since Numpy arrays are stored on standard memory. As such, the implementation strategy was to use the statement “import cupy as np” when POPPY’s Config has been set to use CuPy. This makes performing all calculations on GPU more seamless as wavefront arrays for Wavefront or FresnelWavefront objects along with optical element phasors are automatically calculated as CuPy arrays.

In addition to using CuPy for basic computations of FFTs and exponentials, CuPy also enables the use of many SciPy functions through its cupyx functions. The cupyx functions can be imported much like many scipy functions with a statement such as “import cupyx.scipy.ndimage as ndimage”. So many functions for computing array rotations or interpolations are also imported through cupyx or scipy based on POPPY’s Config setup. The same method is also used to compute Bessel functions.

Currently, many optics have been tested for use with CuPy. A table with these optics is provided below with some comments regarding the functionality of some optics.

Optical Element Class Compatible with CuPy Comments
CircularAperture Yes  
MultiCircularAperture Yes  
SquareAperture Yes  
RectangularAperture Yes  
HexagonAperture Yes  
MultiHexagonAperture Yes  
NgonAperture Yes  
SecondaryObscuration Yes  
AsymmetricSecondaryObscuration Yes  
CompoundAnalyticOptic Yes  
ThinLens Yes  
GaussianAperture Yes  
KnifeEdge Yes  
SquareFieldStop Yes These image plane elements have only been tested in Fraunhofer systems since it is difficult to use these elements in Fresnel systems anyway. This is because of the issue in converting to angular units in a Fresnel system.
RectagularFieldStop Yes
AnnularFieldStop Yes
HexagonFieldStop Yes
CircularOcculter Yes
BarOcculter Yes
BandLimitedCoronagraph Yes
IdealFQPM Yes
ScalarTransmission Yes  
InverseTransmission Yes  
ScalarOpticalPathDifference Yes  
ZernikeWFE Yes  
KolmogorovWFE Yes Does not pass test when using Tatarski power spectrum.
SineWaveWFE Yes  
StatisticalPSDWFE Yes Had to implement hack found in Issue #452, pre-existing bug not related to CuPyNote: slight differences in numpy vs cupy random generators so does not produce the exact same result with the same seed value.
PowerSpectrumWFE Yes Had to assume opd is calculated in nanometers and then rescale it to meters, there should be a better solution to this. Note: slight differences in numpy vs cupy random generators so does not produce exact same result
ContinuousDeformableMirror Yes Influence function must be provided. When using .set_surface(), you should still provide a numpy.ndarray instead of a cupy.ndarray.
HexSegmentedDeformableMirror Yes These segmented optics are functional but when converted to an ArrayOpticalElement with fixed_sampling_optic(), the OPD is only 0. Still not 100% sure why but it does not seem to be an issue with CuPy because get_opd() functions as expected.
CircularSegmentedDeformableMirror Yes
ArrayOpticalElement Yes  
FITSOpticalElement Yes  
FixedSamplingImagePlaneElement Yes  

All tests in the test suite for POPPY were also run. Currently, there is only an issue with the KolmogorovWFE test not passing due to a units issue in an exponential. This only comes up when using a Tatarski power spectrum. The tests have only been run with standard CPU computations to make sure POPPY is not running into critical errors because of the addition of CuPy even if a user isn’t using the CuPy feature.

Computation comparisons have been performed to illustrate the benefit of this accelerated computing feature. Below are comparisons of the times required for a PSF to be calculated for varying array sizes using the MKL FFT option versus the CuPy calculations. The optical systems tested had 5 different surfaces/optics. The system used for these comparisons was the University of Arizona’s HPC Puma nodes. The node utilized 32 AMD EPYC 7642 CPUs and the NVIDIA Tesla V100S GPU.

Propagation Type Array Size MKL Method Times [s] CuPy Method Times [s] Speed Up Factor
Fraunhofer 1024 0.218 0.0261 8.35
Fraunhofer 2048 0.755 0.0294 25.7
Fraunhofer 4096 3.36 0.0423 79.4
Fresnel 1024 0.714 0.0438 16.3
Fresnel 2048 4.16 0.0845 49.2
Fresnel 4096 17.5 0.225 77.8

One catch with this is none of POPPY’s display functionality is compatible with CuPy. This is because matplotlib cannot plot CuPy arrays since they are only on GPU memory. In order to obtain a Numpy array from a CuPy array, the “cupy.ndarray.get()” method can be used. So users can obtain intensity and phase arrays by adding .get().

It should be noted that the following POPPY features have not been tested for functionality with CuPy:

  • PhysicalFresnelWavefront
  • Active optics such as the TipTiltStage (found in active_optics.py)
  • The Instrument class
  • Special propagation features such as SemiAnalyticCoronagraph (found in special_prop.py)
  • Sub-sampled optics such as ShackHartmannWavefrontSensor
  • Floating-window centroid calculations

@douglase was also involved in the addition so I am including him in this PR.

Let me know your thoughts and what changes need to be made.

@codecov
Copy link

codecov bot commented Mar 29, 2022

Codecov Report

Patch coverage: 79.57% and project coverage change: -0.78 ⚠️

Comparison is base (7b8d44a) 74.74% compared to head (c2f925b) 73.97%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #499      +/-   ##
===========================================
- Coverage    74.74%   73.97%   -0.78%     
===========================================
  Files           18       18              
  Lines         6502     6612     +110     
===========================================
+ Hits          4860     4891      +31     
- Misses        1642     1721      +79     
Impacted Files Coverage Δ
poppy/dms.py 46.04% <36.53%> (-0.36%) ⬇️
poppy/accel_math.py 39.84% <60.00%> (+2.15%) ⬆️
poppy/poppy_core.py 79.72% <71.42%> (-0.95%) ⬇️
poppy/utils.py 53.56% <85.71%> (+0.27%) ⬆️
poppy/optics.py 81.81% <86.48%> (-0.11%) ⬇️
poppy/wfe.py 74.63% <87.50%> (-6.65%) ⬇️
poppy/physical_wavefront.py 92.59% <88.46%> (-0.92%) ⬇️
poppy/fresnel.py 84.95% <94.44%> (+0.02%) ⬆️
poppy/zernike.py 82.91% <94.64%> (+0.17%) ⬆️
poppy/__init__.py 98.00% <100.00%> (+0.08%) ⬆️
... and 3 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@kian1377
Copy link
Collaborator Author

kian1377 commented Mar 3, 2023

I am working through it to see what I changed, but I dont recall ever changing anything related to that test. I remember adding a test called test_inwave_fresnel in #402, but dont remember touching any other tests.

Also, there are a couple minor changes I also made since yesterday which I will be pushing shortly.

@BradleySappington
Copy link
Collaborator

@kian1377 - FYI that development has been updated to settle failed tests. Fetch/Pull required.

@kian1377
Copy link
Collaborator Author

@BradleySappington , Are you talking about the issue with the scikit-image test failing? I haven't seen any commits or changes to that test so I am a little confused as to what development you are talking about.

@BradleySappington
Copy link
Collaborator

@kian1377 I was speaking to the CI tests that have been historically failing for reasons unrelated to your branch. Looks like you just pushed develop so some of the CI tests should now pass.

@kian1377
Copy link
Collaborator Author

Ok, I saw there was a conflict so I just resolved that and updated my repo by pulling some of the other commits.

@kian1377
Copy link
Collaborator Author

@mperrin , I installed scikit-image onto my desktop again and I am recreating the issue with the scikit-image test. Because this is is an issue for both CPU and GPU versions of the code, is it necessary to resolve this within this PR? I was thinking this issue should be resolved in a separate PR so there is a more clear record of its problem and the solution.

Here is the current status of the tests on my desktop running with GPU.
image

@mperrin
Copy link
Collaborator

mperrin commented Mar 13, 2023

Bravo @kian1377 on getting the remainder of the tests passing! It's very nice to get the entire test suite passing again on GPU, and that gives a lot of confidence in the robustness of a PR this complicated.

I completely concur with splitting out the failing Fresnel text (which doesn't even get ran by the Github Actions CI) to a separate task in #552.

Let me give this one more review/reread in the next day or two (I was out last week), but I think we are very very close to pushing the merge button on this one :-)

@kian1377
Copy link
Collaborator Author

Sounds good to me. I put some comments in the code so it is easy to understand why I made those changes, but feel free to delete those comments to clean up the code as you review the changes.

One of your initial concerns that I have not addressed are those repeated lines of code that update the accel_math settings in the initialization of objects because I continued switching between GPU and CPU while I was debugging the tests, so that was a feature I kept using.

Let me know if there are aspects of the code that I should change.

@kian1377
Copy link
Collaborator Author

@mperrin One change that I just thought of that may make the code a little cleaner is instead of import cupy or numpy as _ncp, I could change it to just by xp. This way, anyone reading the code knows that xp is a stand in for np or cp. Just thought this may be a better practice to use so let me know if you want this to be changed.

@mperrin
Copy link
Collaborator

mperrin commented Mar 17, 2023

Hi @kian1377, yes, good suggestion. I read some and found that yes using xp is a common/recommended idiom for "this could be numpy or cupy", and I agree it's good practice to use common conventions like that when possible.

(Remind me which code editor you're using? I remember you had asked me about PyCharm which I use. That sort of global rename refactoring is super straightforward to do in PyCharm, with automated code refactoring tools that are much more sophisticated than just a string search-and-replace. One of the many reasons I like to use PyCharm, FYI)

@mperrin
Copy link
Collaborator

mperrin commented Mar 17, 2023

@kian1377 I'm doing a full reread/review now. FYI, I assume this will be OK with you, but for minor cleanup like removing commented-out lines, I'm just going to do that myself and push to this branch; it's as easy for me to do that as it would be to flag those lines here in GitHub.

To answer your earlier question about the repeated lines of code in the initialization of objects. I do think that's less than desirable to have blocks of repeated code. And isn't it a performance hit to have many many calls to update_math_settings which are most of the time going to be unnecessary? I don't want to get hung up on this, but I wonder if the use case of switching back and forth is rare enough or only needed in special cases for debugging, and so maybe it's not worth trying to add the extra complications to support switching arbitrarily in general?

@kian1377
Copy link
Collaborator Author

I just went ahead and replaced all _ncp instances with xp and fixed a couple bugs that came up after doing so. I agree that we should remove those repeated blocks of code that are for switching between packages, but I kept using them for debugging purposes so if we wait until everything else in the PR is ready to be merged, then I can delete those lines and do a final check to make sure it didn't break anything else.

And feel free to delete some lines/comments that I had put in and push them directly. I just left those so it would be easier for you during code review.

Copy link
Collaborator

@mperrin mperrin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After lots of excellent work by @kian1377, plus several passes of close edits and review by me, I'm going to declare this good to go! A major enhancement to poppy with substantial speed improvements using GPU hardware.

The entire test suite passes using GPU hardware on my laptop. And likewise all tests pass using CPU, locally on my laptop and also on the Github Actions CI.

Comments below are just some minor notes and comments on places we could further tune in subsequent PRs later.

Comment on lines +310 to +311
if isinstance(self.rotation, u.Quantity) and self.rotation.unit==u.degree:
angle = -np.deg2rad(self.rotation).value
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this change have anything to do with the GPU code? Or is it an unrelated bug fix? I guess the point is that you want the variable angle to be a bare float rather than a Quantity in the end, yes?

In any case this could be generalized & simplified to

   if isinstance(self.rotation, u.Quantity):
       angle = -np.deg2rad(self.rotation.to_value(u.degree))

That will work for any input rotation unit.

Comment on lines +185 to 188
elif _USE_CUPY:
return cp.fft.ifftshift(x)
else:
return np.fft.ifftshift(x)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't we simplify this whole part to use _ncp, or xp after that switch, to call the appropriate CPU/GPU code? In other words like:

else:
    return xp.fft.ifftshift(x)

This is true for several places here in accel_math.py.

For now I'm choosing to leave this as-is - we want to get this PR merged in rather than keep polishing indefinitely :-)

Comment on lines 292 to 296
if _USE_CUPY: #########################################################################
do_fft = cp.fft.fft2 if forward else cp.fft.ifft2
if normalization is None:
normalization = 1./wavefront.shape[0] if forward else wavefront.shape[0]
wavefront = do_fft(wavefront)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another place where we could simplify - I think we don't even need an _USE_CUPY branch of the if statement. The same case could handle plain numpy and cupy using the xp approach.

poppy/dms.py Outdated Show resolved Hide resolved
@mperrin
Copy link
Collaborator

mperrin commented Mar 24, 2023

Everything passes (including local tests on GPU hardware).

For the record the one 'failing' CI check is the code coverage, which has a slight decrease in coverage. This is not surprising since there's no way currently to test GPU code on Github Actions. (See https://github.com/orgs/github/projects/4247/views/1?filterQuery=+GPU&pane=issue&itemId=4967370; this is a "future" feature for Github Actions, with no particular timeline announced)

Good to merge!

@mperrin mperrin merged commit a8e641d into spacetelescope:develop Mar 24, 2023
@kian1377
Copy link
Collaborator Author

kian1377 commented Mar 24, 2023 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants