Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

octree total gas masses improperly inversely scaling with nref #4818

Open
DhruvZ opened this issue Feb 14, 2024 · 14 comments
Open

octree total gas masses improperly inversely scaling with nref #4818

DhruvZ opened this issue Feb 14, 2024 · 14 comments

Comments

@DhruvZ
Copy link

DhruvZ commented Feb 14, 2024

Bug report

yt (gas) octree smoothing does not conserve mass, and the discrepancy from the true particle mass total seems to scale inversely proportional to nref. This test uses the gizmo_mhd_mwdisk example file.

Code for reproduction

# Paste your code here
import yt
import numpy as np
use_gather = True
nref_array = [1024,512,256,128,64,32,16,8]
ds = yt.load('gizmo_mhd_mwdisk/gizmo_mhd_mwdisk.hdf5')
sm = 'scat'
if(use_gather):
    ds.sph_smoothing_style = "gather"
    sm = 'gath'
ad = ds.all_data()
part_gas_mass = np.sum(ad[('PartType0','Masses')].to('Msun').value)
n_ratios = []
n_masses = []
for nval in nref_array:
    octree = ds.octree(n_ref=nval)
    val = np.sum(octree["PartType0","Masses"].to('Msun').value)
    n_masses.append(val)
    n_ratios.append(val/part_gas_mass)

n_masses = np.array(n_masses)
n_ratios = np.array(n_ratios)

print('particle')
print(part_gas_mass)
print('octree')
print(nref_array)
print(n_masses)
print(n_ratios)
np.save(f'mass_ratios_{sm}.npy',n_ratios)

Actual outcome

yt : [INFO     ] 2024-02-14 17:17:46,734 ComovingIntegrationOn does not exist, falling back to OmegaLambda
yt : [INFO     ] 2024-02-14 17:17:46,734 ComovingIntegrationOn != 1 or (not found and OmegaLambda is 0.0), so we are turning off Cosmology.
yt : [INFO     ] 2024-02-14 17:17:46,734 Assuming length units are in kpc (physical)
yt : [INFO     ] 2024-02-14 17:17:46,773 Parameters: current_time              = 0.5
yt : [INFO     ] 2024-02-14 17:17:46,773 Parameters: domain_dimensions         = [1 1 1]
yt : [INFO     ] 2024-02-14 17:17:46,773 Parameters: domain_left_edge          = None
yt : [INFO     ] 2024-02-14 17:17:46,773 Parameters: domain_right_edge         = None
yt : [INFO     ] 2024-02-14 17:17:46,773 Parameters: cosmological_simulation   = False
yt : [INFO     ] 2024-02-14 17:17:46,786 Allocating for 1.298e+06 particles
yt : [INFO     ] 2024-02-14 17:17:46,786 Bounding box cannot be inferred from metadata, reading particle positions to infer bounding box
yt : [INFO     ] 2024-02-14 17:17:46,873 Load this dataset with bounding_box=[[-335.41967773 -305.92471313 -304.52108765], [283.54318237 300.88858032 379.63751221]] to avoid I/O overhead from inferring bounding_box.
Loading particle index: 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 3423.92it/s]
yt : [INFO     ] 2024-02-14 17:17:47,429 Allocating octree with spatial range [-3.5219e+02, -3.2122e+02, -3.1975e+02] code_length to [2.9772e+02, 3.1593e+02, 3.9862e+02] code_length
yt : [INFO     ] 2024-02-14 17:17:47,453 Allocating Octree for 1298060 particles
yt : [INFO     ] 2024-02-14 17:17:47,491 Allocated 5993 nodes in octree
yt : [INFO     ] 2024-02-14 17:17:47,491 Octree bound 1298060 particles
yt : [INFO     ] 2024-02-14 17:17:47,521 Loading KDTree from gizmo_mhd_mwdisk.hdf5.kdtree
yt : [INFO     ] 2024-02-14 17:17:47,715 Allocating octree with spatial range [-3.5219e+02, -3.2122e+02, -3.1975e+02] code_length to [2.9772e+02, 3.1593e+02, 3.9862e+02] code_length
yt : [INFO     ] 2024-02-14 17:17:47,737 Allocating Octree for 1298060 particles
yt : [INFO     ] 2024-02-14 17:17:47,777 Allocated 12001 nodes in octree
yt : [INFO     ] 2024-02-14 17:17:47,777 Octree bound 1298060 particles
yt : [INFO     ] 2024-02-14 17:17:48,004 Allocating octree with spatial range [-3.5219e+02, -3.2122e+02, -3.1975e+02] code_length to [2.9772e+02, 3.1593e+02, 3.9862e+02] code_length
yt : [INFO     ] 2024-02-14 17:17:48,025 Allocating Octree for 1298060 particles
yt : [INFO     ] 2024-02-14 17:17:48,071 Allocated 23217 nodes in octree
yt : [INFO     ] 2024-02-14 17:17:48,071 Octree bound 1298060 particles
yt : [INFO     ] 2024-02-14 17:17:48,454 Allocating octree with spatial range [-3.5219e+02, -3.2122e+02, -3.1975e+02] code_length to [2.9772e+02, 3.1593e+02, 3.9862e+02] code_length
yt : [INFO     ] 2024-02-14 17:17:48,475 Allocating Octree for 1298060 particles
yt : [INFO     ] 2024-02-14 17:17:48,530 Allocated 45201 nodes in octree
yt : [INFO     ] 2024-02-14 17:17:48,530 Octree bound 1298060 particles
yt : [INFO     ] 2024-02-14 17:17:49,212 Allocating octree with spatial range [-3.5219e+02, -3.2122e+02, -3.1975e+02] code_length to [2.9772e+02, 3.1593e+02, 3.9862e+02] code_length
yt : [INFO     ] 2024-02-14 17:17:49,229 Allocating Octree for 1298060 particles
yt : [INFO     ] 2024-02-14 17:17:49,298 Allocated 86441 nodes in octree
yt : [INFO     ] 2024-02-14 17:17:49,298 Octree bound 1298060 particles
yt : [INFO     ] 2024-02-14 17:17:50,522 Allocating octree with spatial range [-3.5219e+02, -3.2122e+02, -3.1975e+02] code_length to [2.9772e+02, 3.1593e+02, 3.9862e+02] code_length
yt : [INFO     ] 2024-02-14 17:17:50,545 Allocating Octree for 1298060 particles
yt : [INFO     ] 2024-02-14 17:17:50,640 Allocated 169449 nodes in octree
yt : [INFO     ] 2024-02-14 17:17:50,640 Octree bound 1298060 particles
yt : [INFO     ] 2024-02-14 17:17:52,917 Allocating octree with spatial range [-3.5219e+02, -3.2122e+02, -3.1975e+02] code_length to [2.9772e+02, 3.1593e+02, 3.9862e+02] code_length
yt : [INFO     ] 2024-02-14 17:17:52,934 Allocating Octree for 1298060 particles
yt : [INFO     ] 2024-02-14 17:17:53,071 Allocated 329273 nodes in octree
yt : [INFO     ] 2024-02-14 17:17:53,071 Octree bound 1298060 particles
yt : [INFO     ] 2024-02-14 17:17:57,313 Allocating octree with spatial range [-3.5219e+02, -3.2122e+02, -3.1975e+02] code_length to [2.9772e+02, 3.1593e+02, 3.9862e+02] code_length
yt : [INFO     ] 2024-02-14 17:17:57,332 Allocating Octree for 1298060 particles
yt : [INFO     ] 2024-02-14 17:17:57,540 Allocated 645025 nodes in octree
yt : [INFO     ] 2024-02-14 17:17:57,540 Octree bound 1298060 particles
particle
8927892526.082817
octree
[1024, 512, 256, 128, 64, 32, 16, 8]
[1.21842639e+08 2.42743976e+08 4.70150715e+08 9.14686093e+08
 1.74662989e+09 3.43095593e+09 6.67585159e+09 1.30833250e+10]
[0.01364741 0.02718939 0.05266088 0.10245263 0.19563742 0.38429628
 0.74775224 1.46544383]

The total octree grid mass is scaling inversely with the refinement parameter, as the last output line indicates. This should ideally be staying roughly constant at a ratio of 1 as nref changes. Gather smoothing is not necessary to get this behavior in the output. I also created a fresh yt 4.3 environment (python 3.10 again) and installed with pip and was able to reproduce this behavior.

Version Information

  • Operating System: linux (on University of Florida HiPerGator supercomputer)
  • Python Version: 3.10
  • yt version: 4.2.2 (hash 9e2bf0e)
  • Other Libraries (if applicable): numpy 1.26.0

I installed yt from source to get this output. However, I also tested in a fresh python 3.10 environment with the pip install today and was able to reproduce this scaling total mass behavior. I first encountered this issue with my own dataset (from a SIMBA run, which uses GADGET) before I attempted on this example snapshot. I also got this output by manually setting region boundaries to +-400 rather than letting the auto region work.

Copy link

welcome bot commented Feb 14, 2024

Hi, and welcome to yt! Thanks for opening your first issue. We have an issue template that helps us to gather relevant information to help diagnosing and fixing the issue.

@matthewturk
Copy link
Member

Hi! I think this is because the masses shouldn't be smoothed onto the octree, but rather the density, and then that multiplied by the volume. Can you check those results?

That being said, even if that produces the correct answer, this behavior should indeed be flagged in the output so that this doesn't cause issues in the future.

@DhruvZ
Copy link
Author

DhruvZ commented Feb 14, 2024

Hi! I'm not 100% sure that I did this correctly, so just pasting the volume code I added to the loop in the above script in case I messed up:

    val0 = octree["PartType0","Density"].to('Msun/kpc**3')
    #val = np.sum(octree["PartType0","Masses"].to('Msun').value)
    refined = octree[('index','refined')].astype('bool')
    volume = octree['index','dx'][~refined]*octree['index','dy'][~refined]*octree['index','dz'][~refined]
    val = np.sum(volume.to('kpc**3')*val0)

Assuming that this is right, this is the output I got for the ratios between the total octree mass and the total particle mass:

[1.98063079 2.382016   2.60086731 2.72800857 2.80849894 2.84185174
 2.87547579 2.91535503]

This is obviously not increasing as strongly with nref, but it is, and never seems to be about 1.

@dnarayanan
Copy link
Contributor

@DhruvZ this is a bit of a lark but: i'm actually having trouble loading the octree without defining a left edge and a right edge. i wonder if this owes to differing versions of yt or something.

i wonder if loading like this:

octree = ds.octree(ds.domain_left_edge,ds.domain_right_edge,n_ref=64)

makes any difference?

@DhruvZ
Copy link
Author

DhruvZ commented Feb 15, 2024

I managed to get the same output loading the octree as you described, unfortunately.

@nastasha-w
Copy link
Contributor

Based on pixelization_routines.pyx, interpolate_sph_grid_gather (line 1361), I wouldn't expect exact mass conservation from this method: it is calculating the SPH density at cell centers by SPH interpolation to the cell center position, not adding up the mass in each cell. That said, this does not explain why there seems to be a systematic dependence on the octree minimum cell sizes.

Is there any chance that the octree is using the pixelize_sph_kernel_arbitrary_grid function (line 1607) instead? There, I would expect densities to be overestimated in cells that are large compared to particle sizes (see also #4788; I'm working on that one).

@DhruvZ
Copy link
Author

DhruvZ commented Feb 15, 2024

Hi, thanks for your thoughts! I'm not sure what the most efficient way to check if a was function was called in a package during a script - is there a cleaner way to do it than using pdb and just stepping through?

@nastasha-w
Copy link
Contributor

I mean, my idea was the even more basic 'add a print statement to the function'

@DhruvZ
Copy link
Author

DhruvZ commented Feb 15, 2024

I've been playing around with adding a print statement to the function (and the other related functions), and I'm feel like I'm potentially doing something wrong, but I'm not sure what. I added print statements to the source and reinstalled, but I don't see them out anywhere when running. I do see them where they should be in the compiled code. I also made a new environment for testing this and no change again. This seems to imply that neither of the methods are being called in the script I posted above (either in the original mass-based version or the updated density one).

@nastasha-w
Copy link
Contributor

... ok, that's weird. It should be handled by the SPH backend, but maybe the octree stuff has its own routines? I'm not too familiar with those. Do you see the print statements if you add them to the very first thing that gets called? That might tell you if the issue is with the installation or the assumed functions.

I'm trying to trace what gets called here, but that's proving to be a bit tricky. I think

ds.octree(n_ref=nval)

calls the YTOctree class __init__ method (source), which I think should use that pixelization_routines.py function for its SPH smoothing.
On the other hand, the "scatter" option seems to use the CyOctree method for smoothing (source).

@DhruvZ
Copy link
Author

DhruvZ commented Feb 16, 2024

I just added a print statement to the _determine_fields method in data_objects/data_containers.py and I got many outputs. This certainly suggests that the installation is working correctly. Right now I'm running as in the script I initially posted with gather on, so that is where I also would have expected it to go.

@nastasha-w
Copy link
Contributor

Hey, I know it's been a while, but those fixes for the pixelization_routines.py SPH backend have been merged into the main brach now. If you're still working on this, you could try if this works with the latest yt.

@DhruvZ
Copy link
Author

DhruvZ commented Nov 13, 2024

Hi, thanks for the update. I updated my yt to the latest version and ran the test code I posted earlier in this thread for the scatter and gather SPH options. I got the following ratios between the smoothed gas mass and the true gas mass with decreasing nref for scatter and gather respectively:

[0.90303537 1.08098354 1.18220607 1.25738418 1.30595459 1.33412869
1.35144188 1.36544885]

[1.98063079 2.382016 2.60086731 2.72800857 2.80849894 2.84185174
2.87547579 2.91535503]

Scatter seems better in that it seems to at least be close to the true gas mass with less change in the smoothed mass results as a function of nref, but there is still change as a function of nref. Gather is never reproducing the true gas mass and is also changing more significantly with nref.

@nastasha-w
Copy link
Contributor

Ok, so I'm not too familiar with the backend for the gather method, so I can't help much there, but for 'scatter', I wouldn't expect perfect mass conservation. This is because the density field is being sampled at the cell centers, not integrated over each cell, so there's going to be some error from the limited number of points sampled. That said, I don't have a great sense for whether the differences you see are reasonable. Could you check what the scaling is for different centers/edges/offsets? The ratios should at least be 1 on average.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants