Hi Jon,
Making the azimuthal integration much faster (and it is possible with batching) would gain only a factor of 2 in actual speed, according to Amdahl's law. I noticed that the GPU performance can also be used for other applications, such as outlier removal.
Did you submit an ESRF project on this topic? Who else (i.e. non-ESRF) is missing this feature?
Jerome
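For reference, here is a small sketch of the Amdahl's law arithmetic behind that factor of 2. It assumes the integration accounts for about half of the total per-frame time (`p = 0.5`); that fraction is inferred from the stated limit, not from a measurement, and the function name is only illustrative.

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of the runtime is made s times faster."""
    return 1.0 / ((1.0 - p) + p / s)


# Assumed: integration is ~50% of the total time per frame.
p = 0.5
for s in (2, 10, 100):
    print(f"integration {s:>3}x faster -> overall {amdahl_speedup(p, s):.2f}x")
```

However large `s` becomes, the overall speedup stays below `1 / (1 - p)`, which is 2 for `p = 0.5`.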
I changed this issue to a discussion, which allows polls to be made to assess how relevant this idea is for users outside Jon's beamline.
Hi @kif,
I had a quick look at cuSPARSE via cupy. At first sight, it looks quite promising:
The first run is slow (compilation). The timing didn't seem to depend on CSR vs CSC. After warmup it was a bit slower than pyFAI for single frames, but it appears to be quicker for a stack of 32 frames.
Assuming I haven't made a mistake, it drops from 600 us per frame to about 32 us per frame on an L40S. That is roughly 30 kHz. To benefit from it, the transfer and decompression would need to be batched as well.
What do you think? Did you try making batched integrations on a GPU before?
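To make the batching idea concrete, here is a minimal sketch of integrating a stack of frames with one sparse-times-dense product instead of one sparse-times-vector product per frame. It uses `scipy.sparse` on the CPU so that it runs anywhere; `cupyx.scipy.sparse.csr_matrix` offers a similar interface on the GPU. The matrix is a toy pixel-to-bin assignment, not pyFAI's actual look-up table, and the names `integrate_single` and `integrate_batch` are made up for illustration.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
n_pix, n_bins, n_frames = 10_000, 100, 32

# Toy look-up table (assumption): each pixel contributes fully to one bin.
bin_of_pixel = rng.integers(0, n_bins, size=n_pix)
weights = np.ones(n_pix, dtype=np.float32)
lut = sparse.csr_matrix(
    (weights, (bin_of_pixel, np.arange(n_pix))), shape=(n_bins, n_pix)
)

# A stack of fake frames, flattened to (n_frames, n_pix).
frames = rng.poisson(10.0, size=(n_frames, n_pix)).astype(np.float32)


def integrate_single(lut, frame):
    """Mean intensity per bin for one frame (sparse matrix times vector)."""
    signal = lut @ frame
    norm = lut @ np.ones_like(frame)
    return signal / np.maximum(norm, 1)


def integrate_batch(lut, frames):
    """Mean intensity per bin for a whole stack (sparse matrix times dense matrix)."""
    signal = lut @ frames.T                      # shape (n_bins, n_frames)
    norm = np.asarray(lut.sum(axis=1)).ravel()   # pixels per bin, computed once
    return (signal / np.maximum(norm, 1)[:, None]).T


batch = integrate_batch(lut, frames)
single = np.stack([integrate_single(lut, f) for f in frames])
print(batch.shape, np.allclose(batch, single))
```

The batched version reads the sparse matrix once for all 32 frames, which is presumably where the per-frame gain comes from; the timings above were measured on the GPU and are not reproduced by this CPU sketch.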