Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Corrfunc 3.0 release discussion #328

Open
19 tasks
manodeep opened this issue Aug 8, 2024 · 7 comments
Open
19 tasks

Corrfunc 3.0 release discussion #328

manodeep opened this issue Aug 8, 2024 · 7 comments

Comments

@manodeep
Copy link
Owner

manodeep commented Aug 8, 2024

@lgarrison To guide what needs to/could happen for Corrfunc 3.0, here is my list:

Essential (?)

  • Remove python2 support completely

    • Remove python2 from setup.py
    • Remove python2 related code from C extensions
    • Remove from __future__ type constructs from python code
  • Add modern packaging with pyproject.toml / meson / whatever-else-we-should-be-using

    • how?
  • Solve the multiple OpenMP runtime library issue

    • While I don't know if there is any way to remove the duplicate OMP runtime library linking with C code, we might be able to detect that there is an issue by running some simple OMP reduction routines as a C extension and checking if the result matches the correct answer.
    • We might also want to switch to creating shared libraries rather than static libraries

Possibly (?)

  • Add numbins optimisation that only uses the number of bins necessary (as determined by the min. and max. distance possible between two cell-pairs)

    • Add a function that takes a pair of cells and the histogram bins and returns the min. and max. bin-indices needed
    • These bin indices should be stored as part of the cell-pair struct and used to call the SIMD kernels (the bin indices need to be initialised to 0 and nbins-1)
    • Ideally, there is a specialised kernel for a single histogram bin that uses a simd reduction clause, or simply a += . This could be in conjunction with a bit-setting routine that sets bits and shifts to count up to 64 pairs before calling popcnt. However, I don't see how to do that without a LOT OF code duplication.
  • Change the OpenMP parallelization to go over cell-pairs (improves cache utilisation, reduces memory requirement -> we can increase the max-bin-ref factors)

    • Create a new generate_cell_pairs function that returns the potential neighbouring cell-pairs for any given primary cell
    • Change the OpenMP parallelization to go over these cell-pairs (improves cache re-use since the primary cell is always one of the cells)
    • Test that the code gives identical results but (hopefully) faster

May be (?)

  • Add ARM64 kernels

    • Add kernels to all pair-counters
    • Run the INTEGRATION_TESTS on laptop
    • Add -march=armv8a (or -mcpu=apple-m1) to CFLAGS
    • Make sure that the OSX tests run for both ARM64 and Intel cpus
  • Rename package to corrfunc and release conda wheels

    • Add target_clones to all functions
    • cross-compile with recent gcc (possibly on linux + x86_64)
    • Add deprecation option so that import Corrfunc still works (how?)

This is also open for community discussion. If anyone has opinions on what should go in, please do add a comment.

@lgarrison
Copy link
Collaborator

Add modern packaging with pyproject.toml / meson / whatever-else-we-should-be-using

I see a packaging and build-system overhaul as the top priority for improving the user experience in Corrfunc 3. I have a student working on a prototype right now that uses meson, meson-python, and pybind11. The basic setup seems to work, and we're looking at runtime SIMD dispatch now. I would love to be able to build and distribute binaries for Corrfunc 3! When the code is ready to share, I'll link it here.

@manodeep
Copy link
Owner Author

manodeep commented Aug 8, 2024

That sounds great!

Any reason to go with pybind11 rather than cffi? Is that to future-proof in case we switch to C++ code?

@lgarrison
Copy link
Collaborator

Any reason to go with pybind11 rather than cffi? Is that to future-proof in case we switch to C++ code?

Yeah, future-proofing was part of it. Pybind11 is also a better developer experience; it tends to emit informative error messages instead of segfaults! It's also NumPy-aware, which makes it easy to enforce conditions on the memory order and shape of arrays.

@manodeep
Copy link
Owner Author

@lgarrison Would nanobind be a drop-in replacement for pybind11?

I really like how @dfm used templating within simple functions with C++ and nanobind - see here.

@lgarrison
Copy link
Collaborator

I think it would be great to use nanobind! Meson isn't an officially supported build system for nanobind so I decided to stick with pybind11 for the summer project, but I don't think it would be very hard to figure nanobind out. At the API level, it would probably be nearly a drop-in replacement since our usage is simple.

@manodeep
Copy link
Owner Author

Excellent!

What do you think about adding Typing into the function parameters? Looking at astropy code, type hints do seem to remove a lot of redundant code from the function body (into the function declaration ... )

@lgarrison
Copy link
Collaborator

I'm all for type hints! However, type hints don't have any effect at runtime, so it wouldn't let us remove code from the function body. They're just used by type checkers, language servers, and other static analysis tools.

I do think we could improve/change the Python API, including type hints. E.g. some positional arguments could probably be keyword arguments, and the return type should probably be a columnar container like a pandas df or an Astropy table (or maybe just a dict/dataclass of arrays). It's something we can iterate on once we get the harder (build system) stuff working.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants