Integration with RAIL #9

jlvdb · 2024-05-16T07:33:57Z

Since the project code is stable now I'm starting to prepare a release and integrating it correctly with the RAIL ecosystem. There are a few items that are not fully clear to me:

Did I choose the correct path structure in src/? As what kind of algorithm should the code be classified and in which place exactly should the final YawSummarizer stage go?
Where do the examples and notebooks belong? Do they stay in this repository or should they move some of the top level RAIL repos?
It would be great if someone experienced with RAIL could have a look at the repository in general to ensure that the quality requirements are met.
Are there any other steps required other than releasing the package on PyPI and registering it as a RAIL algorithm?

The text was updated successfully, but these errors were encountered:

aimalz · 2024-05-16T15:59:18Z

Thanks for setting this up!

The actual informer/summarizer pair should be in a single module in src/rail/estimation/algos. To match style, it would be best for the one module per algorithm containing all informer/estimator/classifier/summarizer stages that use it to be called yaw.py (or yaw_cc.py, yaw_clust_z.py, etc.) without having 'summarizer' in the module name, as we hope to eventually support both summarization and estimation with more methods that can actually be applied in both ways, and users can see what stages are available in a module pretty easily using python's standard help functionality. However, I don't think we've actually achieved standardization on whether anything besides the __init__.py can go in src/rail/[algo-name] (seems like usually no, but there are examples that do have other modules there), and we're a bit all over the place on what other supporting modules can go in src/rail/estimation/algos (and similarly for creation/engines, creation/degraders, and evaluation/metrics). Input from others would definitely be helpful here! For now, I'd recommend at least prepending an underscore if you don't expect users to directly call anything in the supporting modules and ensuring that there aren't already modules with identical names in the RAILiverse (to make it easier for folks to debug something like an import error that could arise in any of the same-named modules that might be hinted at in error messages without showing the full path).
Demos should indeed go in the primary RAIL repo's examples/estimation directory, as that's what gets synced for rendering the notebooks on ReadTheDocs. However, while the package is in development, it's fine to leave them in a top-level `examples' directory.
This should probably be reviewed by more than one person and perhaps not yet, so I'll leave that for later -- definitely worth mentioning it in a tag-up when you're ready!
I think @hangqianjun and @ztq1996 can advise best (and maybe identify if there are any holes in the development principles doc to help address all these questions).

jlvdb · 2024-05-16T16:18:37Z

I think one item to add is
5. add the algorithm to the citations page on the RAIL docs.

jlvdb · 2024-05-17T11:25:58Z

The code does not fit into the currently existing summarizers (CatSummarizer, PZSummarizer, SZPZSummarizer) because, even if you wrap/group some of the stages, there are always different numbers or types of inputs as can be seen here:

So subclassing anything is not an option at the moment.

If I understand you correctly @aimalz, you are suggesting that I put all five stages into src/rail/estimation/algos/cc_yaw.py and move the package with the remaining code to something like src/rail/yaw_rail/ (following the example of pz-rail-som)?

jlvdb · 2024-05-17T12:10:38Z

I restructured the code accordingly on the associated branch. Feel free to check if that better matches the RAIL principles.

1. Code structure
2. Collecting all examples and moving them to RAIL
3. Code review
4. Registering package in rail/rail_packages.yml
5. Listing package in rail/docs/source/citing.rst
6. Publishing v1.0

aimalz · 2024-05-17T22:01:40Z

Gotcha, thanks for the diagram, that makes a lot of sense. (Also, I would love to make the figures for the v1 paper using whatever you used for this -- let's compare notes offline!)

Organizationally, I don't want to suggest significantly rearranging the code you've written when it's already grouped reasonably given the complexity of the algorithm. I can, however, think of a way to use import statements so users can find the YAW functionality in parallel places relative to other algos. My suggestion is to make two new modules:

src/rail/tools/yaw_handles.py would contain import statements for all the YAW-specific DataHandle subclasses.
src/rail/estimation/algos/yaw_xcorr.py (where "xcorr" could be replaced with "clust_z", "cc_z" or whatever shorthand folks prefer for the spatial correlation/clustering-based approach to non-photometric redshift estimation) would contain import statements for all the YawRailStage subclasses.

Relatedly, I would also propose the following changes to follow the current naming conventions:

YawCacheCreate --> YawCacheMaker (because Creator already means something unrelated in the RAILiverse)
YawCacheDrop --> YawCacheDropper
YawCrossCorrelate --> YawCrossCorrelator
YawAutoCorrelate --> YawAutoCorrelator
YawSummarize --> YawSummarizer

As for where to put the modules currently in src/rail/yaw_rail, I think it would be best to move them out of there because, while I don't know the details of how the namespace trick works, nearly all the other rail_* repositories have nothing but an __init__.py in their equivalent of src/rail/yaw directory (yes, another suggested renaming, sorry), so . (Exceptions include rail_sompz, rail_deepdisc, and rail_bpz, which each have a utils.py that should be renamed to disambiguate if not also moved elsewhere in the directory structure.) In any case, the closest analogue to YAW is rail_tpz, which pretty reasonably has directories under src/rail/estimation/algos, so I'd suggest following that model of moving src/rail/yaw_rail into src/rail/estimation/algos until we establish a standard (and this is something we should definitely add to the development guide).

How does that sound?

jlvdb added the help wanted Extra attention is needed label May 16, 2024

jlvdb linked a pull request May 17, 2024 that will close this issue

updated stage network sketch #14

Merged

4 tasks

jlvdb closed this as completed in #14 May 17, 2024

jlvdb reopened this May 17, 2024

jlvdb mentioned this issue Jul 10, 2024

Redesign #18

Merged

4 tasks

jlvdb mentioned this issue Jul 18, 2024

Wrapping up for stable v1.0 release #23

Merged

4 tasks

jlvdb closed this as completed Jan 10, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Integration with RAIL #9

Integration with RAIL #9

jlvdb commented May 16, 2024

aimalz commented May 16, 2024

jlvdb commented May 16, 2024 •

edited

Loading

jlvdb commented May 17, 2024 •

edited

Loading

jlvdb commented May 17, 2024 •

edited

Loading

aimalz commented May 17, 2024

Integration with RAIL #9

Integration with RAIL #9

Comments

jlvdb commented May 16, 2024

aimalz commented May 16, 2024

jlvdb commented May 16, 2024 • edited Loading

jlvdb commented May 17, 2024 • edited Loading

jlvdb commented May 17, 2024 • edited Loading

aimalz commented May 17, 2024

jlvdb commented May 16, 2024 •

edited

Loading

jlvdb commented May 17, 2024 •

edited

Loading

jlvdb commented May 17, 2024 •

edited

Loading