Adding fast multipole method for Legendre-Chebyshev transforms #83

Merged 7 commits on May 15, 2024

mikaem
Collaborator

@mikaem mikaem commented May 15, 2024

This PR adds code originally found in the https://github.com/spectralDNS/Legendre-to-Chebyshev repository. The code should work for single and double precision, but it has not been tested for anything else. In short, there are now:

  1. A planner, create_fmm, or create_fmm_2d for applying the transform to a 2D matrix
  2. An executor, execute

Usage:

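size_t N = 1024;
size_t maxs = 64;   // illustrative value only, not taken from this PR
size_t M = 18;      // illustrative value only
size_t lagrange = 0; // 0: modal approach, 1: nodal approach
size_t verbose = 0;  // 0: minimal printing
FLT *input_array = (FLT *)calloc(N, sizeof(FLT)); // fill with the input coefficients before executing
FLT *output_array = (FLT *)calloc(N, sizeof(FLT));
X(fmm_plan) *fmmplan = X(create_fmm)(N, maxs, M, BOTH, lagrange, verbose);
X(execute)(input_array, output_array, fmmplan, L2C, 1);
free(input_array);
free(output_array);
X(free_fmm)(fmmplan);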

where 'N' is the size of the 1D transform, 'fmm_plan' is a struct holding all data needed for the transforms, 'maxs' is half the size of the smallest hierarchical submatrices, and 'M' is the order of the Chebyshev approximation on the submatrices. 'BOTH' plans for both the forward (L2C) and backward (C2L) transforms; passing either C2L or L2C instead initializes the plan for that direction only. The 'lagrange' argument is a boolean: 0 selects a modal approach, whereas 1 selects a nodal approach (the original one of Alpert et al.). The 'verbose' argument is an integer: 0 minimises printing, whereas 2 produces some printing, especially during planning.

  1. The function names should probably be altered to be in line with the rest of the repository.
  2. Single precision seems to work, but it should be improved. It is the reason FFTW is needed: FFTW is used to do a DCT on the input matrices of shape 8x8. This could easily be hardcoded for efficiency, as has been done for double precision, where the matrices are of size 18x18.
  3. If required, the FFTW dependency can be removed.

@mikaem
Collaborator Author

mikaem commented May 15, 2024

I expect the macOS runners to fail, since they do so on my fork, but they do not fail because of my code, so I think I need some help here, @MikaelSlevinsky.

@MikaelSlevinsky
Owner

Thank you for the contribution! It seems the macOS issue is that the runner does not respect the exclusion of arm64 on macOS-latest, but it does so on macOS-13, which I just tested. So I think I'm happy to merge!

@MikaelSlevinsky MikaelSlevinsky merged commit 914fc6f into MikaelSlevinsky:master May 15, 2024
7 of 9 checks passed