Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to load ARTData quickly #5057

Open
yohad opened this issue Nov 13, 2024 · 3 comments
Open

Unable to load ARTData quickly #5057

yohad opened this issue Nov 13, 2024 · 3 comments
Labels
bug code frontends Things related to specific frontends frontend: art

Comments

@yohad
Copy link

yohad commented Nov 13, 2024

Bug report

I'm trying to use skip_particles when running yt.load as I only need the gas data and the loading gets very long. The loading fails, but when I run it without skip_particles I can load just fine

Code for reproduction

In [21]: df = yt.load("/sci/labs/dekel/lab_share/vela_v2/thin/VELA_v2_07/10MpcBox_csf512_00540.d", skip_particles=True)
yt : [INFO     ] 2024-11-13 18:24:16,081 discovered particle_header:/sci/labs/dekel/lab_share/vela_v2/thin/VELA_v2_07/PMcrd_00540.DAT
yt : [INFO     ] 2024-11-13 18:24:16,082 discovered particle_data:/sci/labs/dekel/lab_share/vela_v2/thin/VELA_v2_07/PMcrs0_00540.DAT
yt : [INFO     ] 2024-11-13 18:24:16,082 discovered particle_stars:/sci/labs/dekel/lab_share/vela_v2/thin/VELA_v2_07/stars_00540.dat
yt : [INFO     ] 2024-11-13 18:24:16,087 Using root level of 14
yt : [INFO     ] 2024-11-13 18:24:16,135 Max level is 12
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[21], line 1
----> 1 df = yt.load("/sci/labs/dekel/lab_share/vela_v2/thin/VELA_v2_07/10MpcBox_csf512_00540.d", skip_particles=True)

File ~/miniforge3/lib/python3.12/site-packages/yt/_maintenance/deprecation.py:69, in future_positional_only.<locals>.outer.<locals>.inner(*args, **kwargs)
     60     value = kwargs[name]
     61     issue_deprecation_warning(
     62         f"Using the {name!r} argument as keyword (on position {no}) "
     63         "is deprecated. "
   (...)
     67         **depr_kwargs,
     68     )
---> 69 return func(*args, **kwargs)

File ~/miniforge3/lib/python3.12/site-packages/yt/loaders.py:144, in load(fn, hint, *args, **kwargs)
    136     if missing := cls._missing_load_requirements():
    137         warnings.warn(
    138             f"This dataset appears to be of type {cls.__name__}, "
    139             "but the following requirements are currently missing: "
   (...)
    142             stacklevel=3,
    143         )
--> 144     return cls(fn, *args, **kwargs)
    146 if len(candidates) > 1:
    147     raise YTAmbiguousDataType(_input_fn, candidates)

File ~/miniforge3/lib/python3.12/site-packages/yt/frontends/art/data_structures.py:172, in ARTDataset.__init__(self, filename, dataset_type, fields, storage_filename, skip_particles, skip_stars, limit_level, spread_age, force_max_level, file_particle_header, file_particle_data, file_particle_stars, units_override, unit_system, default_species_fields)
    170 self.force_max_level = force_max_level
    171 self.spread_age = spread_age
--> 172 Dataset.__init__(
    173     self,
    174     filename,
    175     dataset_type,
    176     units_override=units_override,
    177     unit_system=unit_system,
    178     default_species_fields=default_species_fields,
    179 )
    180 self.storage_filename = storage_filename

File ~/miniforge3/lib/python3.12/site-packages/yt/data_objects/static_output.py:294, in Dataset.__init__(self, filename, dataset_type, units_override, unit_system, default_species_fields, axis_order)
    291 self._create_unit_registry(used_unit_system)
    293 self._parse_parameter_file()
--> 294 self.set_units()
    295 self.setup_cosmology()
    296 self._assign_unit_system(unit_system)

File ~/miniforge3/lib/python3.12/site-packages/yt/data_objects/static_output.py:1405, in Dataset.set_units(self)
   1396             self.unit_registry.add(
   1397                 new_unit,
   1398                 my_u.base_value / (1 + self.current_redshift),
   (...)
   1401                 prefixable=True,
   1402             )
   1403         self.unit_registry.modify("a", 1 / (1 + self.current_redshift))
-> 1405 self.set_code_units()

File ~/miniforge3/lib/python3.12/site-packages/yt/data_objects/static_output.py:1460, in Dataset.set_code_units(self)
   1457 self._override_code_units()
   1459 # set attributes like ds.length_unit
-> 1460 self._set_code_unit_attributes()
   1462 self.unit_registry.modify("code_length", self.length_unit)
   1463 self.unit_registry.modify("code_mass", self.mass_unit)

File ~/miniforge3/lib/python3.12/site-packages/yt/frontends/art/data_structures.py:226, in ARTDataset._set_code_unit_attributes(self)
    224 # all other units
    225 Om0 = self.parameters["Om0"]
--> 226 ng = self.parameters["ng"]
    227 boxh = self.parameters["boxh"]
    228 aexpn = self.parameters["aexpn"]

KeyError: 'ng'
import yt
yt.load(PATH, skip_particles=True)

Version Information

  • Operating System: Debian
  • Python Version: python 3.12
  • yt version: yt 4.3.1
  • Other Libraries (if applicable):
@neutrinoceros
Copy link
Member

thanks for reporting
Here's a reusable version of the reproducer

import yt
yt.load_sample('D9p_500', skip_particles=True)

@neutrinoceros
Copy link
Member

So the bug is pretty simple to describe
self.parameters["ng"] is only defined on the following line if particles aren't skipped

self.parameters["ng"] = self.parameters["Ngridc"]

However, how to fix it is not (to me):
As we can see, the value of this parameter is heavily relied on to define the units

ng = self.parameters["ng"]
boxh = self.parameters["boxh"]
aexpn = self.parameters["aexpn"]
hubble = self.parameters["hubble"]
r0 = boxh / ng
v0 = 50.0 * r0 * np.sqrt(Om0)
rho0 = 2.776e11 * hubble**2.0 * Om0
aM0 = rho0 * (boxh / hubble) ** 3.0 / ng**3.0
velocity = v0 / aexpn * 1.0e5 # proper cm/s
mass = aM0 * 1.98892e33

(the good news is that ng is the only such parameter that gets skipped with particles)

How should we address this ? I can see two options:

  1. Would it make sense to have a default value for ng when particles aren't read (and if so, what should this default value be) ?
  2. should we still read it from file even with skip_particles=True instead ?

Option 1 seems dangerous to me (given, I don't know what ng stands for). Option 2 is probably better but may not be practical: I haven't studied the logic in detail; maybe we can't read this value and really skip reading particles, in which case it would make more sense to go with option 3: deprecate skip_particles as broken and unfixable (not my favorite option).

@neutrinoceros neutrinoceros added bug code frontends Things related to specific frontends frontend: art labels Nov 14, 2024
@chrishavlin
Copy link
Contributor

chrishavlin commented Nov 14, 2024

For what it's worth, at least with 'D9p_500', up in the section where it reads the amr header info these two lines:

(self.ncell) = fpu.read_vector(f, "i", ">")[0]
# Try to figure out the root grid dimensions
est = int(np.rint(self.ncell ** (1.0 / 3.0)))

match the values that are read from the particle header:

self.parameters["ng"] = self.parameters["Ngridc"]
self.parameters["ncell0"] = self.parameters["ng"] ** 3

that intermediate est in the amr header section gets printed in a debug log, so comparing the calculated values in the amr header section to those in the particle header section:

import yt
yt.set_log_level('debug')
ds = yt.load_sample('D9p_500')
print(f"n cell from amr header = {ds.ncell}")
print(f"particle header parameters ng={ds.parameters['ng']}, ncell0={ds.parameters['ncell0']}")

skipping unrelated logs...

yt : [DEBUG    ] 2024-11-14 10:17:37,775 Estimating 128 cells on a root grid side, 262144 root octs
n cell from amr header = 2097152
particle header parameters ng=128, ncell0=2097152

so based on that and the name of ng in the particle header (Ngridc), i'd hazard a guess that ng is an amr grid-related parameter that happens to only exist in the particle header? and it could be safely set by est from the amr header section if it's not reading particles? But it'd be nice to have confirmation of that from someone more familiar with art outputs....

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug code frontends Things related to specific frontends frontend: art
Projects
None yet
Development

No branches or pull requests

3 participants