Features list
rtobar edited this page Mar 19, 2018
- Add optimized convolution for finesampling case
- Brute-force OpenCL convolution can use some associative math to speed it up further – I have a kind-of-working prototype in a local branch
- The radial profile evaluation can probably be improved, especially during subsampling
- One easy thing to do is to calculate the val and testval using double2/float2 types, so that they are probably executed in parallel
- I should also think whether organizing this nicely in work groups would bring any benefit
- Probably a nicer handling of the memory buffers would benefit us...
- Add Dan's mythical FFT optimization.
- Dan already added some R/C++ code that does this, it simply needs to be ported, in theory
- Implement on-disk OpenCL kernel cache.
- Study which version/extensions are required for this so it is implemented correctly.
- Currently working on a local branch, a bit more testing needed
- Better implement skinny ellipses. There is a whole mail thread discussion with Dan about this.
- Add finesampling support
- Use Dan's R2C/C2R transformations instead of the C2C transformations to implement convolution. This saves memory and time!
- Implement FFTW wisdom storage
- The FFTW wisdom is thread-enablement-dependent, and therefore we basically need to maintain two wisdom files. Alongside this I created a new "profit home" directory concept, which is $HOME/.profit by default but can be something else via an environment variable.
- Include Dan's improvements to the brokenexp profile – both double and float versions, including my latest patches on ProFit
- Finish clfft integration.
- I found an issue with clfft, but on their end they responded that the library is currently in maintenance mode only. Since I didn't try to reproduce the other problem with other platforms back at the time, this means that I would need more time to investigate whether the bug is seen in Beignet only or in other platforms, and then maybe find out whether the bug actually lives in clfft or in Beignet. This is probably not going to happen.
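
The "profit home" directory and the pair of wisdom files mentioned in the FFTW wisdom item can be sketched roughly as follows. This is only an illustration in Python, not the library's actual code: the environment variable name `PROFIT_HOME` and the two wisdom file names are hypothetical placeholders, since this page does not state them.

```python
import os
from pathlib import Path

# Hypothetical name: the page only says "an environment variable".
HOME_ENV_VAR = "PROFIT_HOME"


def profit_home(environ=None):
    """Return the override directory if set, otherwise $HOME/.profit."""
    environ = os.environ if environ is None else environ
    override = environ.get(HOME_ENV_VAR)
    if override:
        return Path(override)
    home = environ.get("HOME")
    base = Path(home) if home else Path.home()
    return base / ".profit"


def wisdom_path(threaded, environ=None):
    """Pick one of two wisdom files, keyed on whether threading is enabled.

    The file names are placeholders chosen for this sketch.
    """
    name = "fftw-wisdom-threaded" if threaded else "fftw-wisdom-serial"
    return profit_home(environ) / name
```

Keeping the threaded and unthreaded wisdom in separate files means that a build with threads enabled never loads wisdom generated without them, and the other way around.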