Keep most config information (input to skyCatalogs API) partitioned into separate files by source #76

Closed
JoanneBogart opened this issue Oct 17, 2023 · 3 comments

@JoanneBogart
Collaborator

JoanneBogart commented Oct 17, 2023

Catalogs for different source types are created independently. The config information the API needs for a given source type should ideally be created by the same program that creates the data, or at least at the same time, but currently all config information lives in a single file. I would like to keep the data, and possibly also the config information, for each source type in a subdirectory of a top-level directory, which would hold the top-level config. Such a config would look something like this:
catalog_dir: top_dir
catalog_name: top_config
(more top-level keys)
object_types:
   star: !include star/dc2_star.yaml
   snana: !include snana/dc2_sn.yaml
   galaxy: !include galaxy/cosmodc2_galaxy.yaml
   bulge: !include galaxy/cosmodc2_bulge.yaml
   disk: !include galaxy/cosmodc2_disk.yaml
   knots: !include galaxy/cosmodc2_knots.yaml

YAML does not natively support !include, but there are extensions that do. I've tested pyyaml_include and it seems to be adequate.
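To illustrate what an !include tag has to do, here is a minimal sketch that uses only PyYAML with a hand-written constructor. It is not pyyaml_include and not the skyCatalogs implementation; the names IncludeLoader and load_config are hypothetical, and the sketch assumes included paths are relative to the directory of the file that includes them.

```python
import os

import yaml


class IncludeLoader(yaml.SafeLoader):
    """SafeLoader that remembers the directory of the file being read."""

    def __init__(self, stream):
        # File objects expose their path as .name; fall back to the cwd.
        self._base_dir = os.path.dirname(getattr(stream, "name", "")) or "."
        super().__init__(stream)


def _include(loader, node):
    # The scalar after "!include" is taken as a path relative to the
    # including file's directory (an assumption of this sketch).
    rel_path = loader.construct_scalar(node)
    with open(os.path.join(loader._base_dir, rel_path)) as f:
        return yaml.load(f, Loader=IncludeLoader)


# Register the tag on this loader class only, not on SafeLoader itself.
IncludeLoader.add_constructor("!include", _include)


def load_config(path):
    """Load a top-level config, expanding any !include tags."""
    with open(path) as f:
        return yaml.load(f, Loader=IncludeLoader)
```

With a top-level file like the one above, load_config would return a single nested dict in which each entry under object_types holds the parsed contents of the corresponding included file.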

dc2_star.yaml, one of the included files, could have the following contents:
subtype: cosmodc2_star
star_truth: /global/cfs/cdirs/lsst/groups/SSim/DC2/dc2_stellar_healpixel.db
MW_extinction: F19
area_partition:
   nside: 32
   ordering: ring
   type: healpix
data_file_type: parquet
file_template: star/pointsource_(?P<healpix>\d+).parquet
flux_file_template: star/pointsource_flux_(?P<healpix>\d+).parquet
internal_extinction: None
sed_file_root_env_var: SIMS_SED_LIBRARY_DIR
sed_model: file_nm
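The file_template value above is a regular expression with a named group, healpix. As a rough sketch of how such a template could be used (the helper names here are hypothetical and not part of the skyCatalogs API), the pattern can be matched against a relative path to recover the pixel number, or filled in to build the path for a given pixel:

```python
import re

# Copied from the file_template key above. Note that the "." before
# "parquet" is unescaped, so as a regex it matches any character.
FILE_TEMPLATE = r"star/pointsource_(?P<healpix>\d+).parquet"


def healpix_from_path(rel_path, template=FILE_TEMPLATE):
    """Return the healpix pixel encoded in rel_path, or None if no match."""
    m = re.fullmatch(template, rel_path)
    return int(m.group("healpix")) if m else None


def path_for_healpix(healpix, template=FILE_TEMPLATE):
    """Build a relative path by substituting the pixel for the named group."""
    return template.replace(r"(?P<healpix>\d+)", str(healpix))
```

For example, healpix_from_path("star/pointsource_9921.parquet") gives 9921, while a flux file name does not match this template and gives None.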

@JoanneBogart
Collaborator Author

JoanneBogart commented Oct 17, 2023

The chief advantage of this scheme would be independence of source types, which in general are not created at the same time or by the same means. Subdirectories containing the data could be symlinks if convenient.

The subtype keyword would allow the API user to, e.g., refer to source type galaxy without being concerned whether the galaxies in question are cosmodc2 galaxies or diffsky galaxies.

@JoanneBogart
Collaborator Author

JoanneBogart commented Jul 4, 2024

Upon further thought and a start at implementation, I propose some changes to the scheme outlined above:

  • It's better to identify source types by the names used internally rather than by something generic like 'galaxy'. Someone may well want to include, for example, both cosmodc2 stars and Gaia stars in the same simulation.
  • Although the !include implementation allows the include files for the different object types to live in separate directories, it is probably simpler to start by putting them all, along with the data for each object type, in the same directory as the file that includes them. Otherwise there has to be some way to tell the skyCatalogs API where the data are. (That could be added later if we end up with too many files in a single directory; it's just an extra complication.)
  • There is no need to treat galaxy_bulge, galaxy_disk, etc. as full-fledged object types. They can be subcomponents of galaxy, and similarly for diffsky_galaxy and its components.

With these changes there would be no subtype keyword, and the top-level file might look more like this:
catalog_dir:  top_dir              # this is often just .
catalog_name: skyCatalog
(more top-level keys)
object_types:
   star:  !include star.yaml
   snana: !include snana.yaml
   diffsky_galaxy: !include diffsky_galaxy.yaml
   

@JoanneBogart
Collaborator Author

With the merge of PR #112, this issue can be closed as complete.
