need a data model that contains essential information of a calculation #1171
Comments
@JenkeScheen The biggest chunk is probably the nc files; we would benefit from having the header information of these (output of |
Here's the header for an
|
@JenkeScheen I don't see anything terribly wrong with it, @jchodera maybe you can spot something else here? EDIT: Check next comment. |
@JenkeScheen Oh actually, never mind that previous comment, I mixed the files so I am actually using 50 and you are using 250. That should be okay. Sorry for the noise. |
@JenkeScheen I was thinking about this again, and I think it makes sense that you have such big nc files, at least compared to what we commonly get when running benchmarks. I don't know exactly how the information is stored in the netCDF format, but I'm going to guess that the fundamental types are IEEE-754 standard C types (that is, float is a 32-bit data type, for practical purposes). Here is a quick comparison:
If we compute the ratio of the numbers in the NOTES:
|
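The kind of size comparison described above can be sketched roughly as follows. This is only a back-of-the-envelope estimate: it assumes positions are stored as 32-bit floats in an array of shape (iterations, replicas, atoms, 3), which is a guess about the layout rather than something confirmed for these files, and all the numbers are hypothetical placeholders, not values taken from the calculations in this thread.

```python
def positions_bytes(n_iterations, n_replicas, n_atoms, bytes_per_value=4):
    """Approximate bytes needed to store positions, assuming float32 values.

    Assumed (not confirmed) layout: one (x, y, z) triple per stored atom,
    per replica, per iteration.
    """
    return n_iterations * n_replicas * n_atoms * 3 * bytes_per_value


# Hypothetical systems: same run length and replica count,
# different numbers of atoms written to the file.
small = positions_bytes(n_iterations=1000, n_replicas=12, n_atoms=5000)
large = positions_bytes(n_iterations=1000, n_replicas=12, n_atoms=20000)

# Under this layout, file size scales linearly with the number of stored atoms.
ratio = large / small

print(f"small: {small / 1e6:.0f} MB, large: {large / 1e6:.0f} MB, ratio: {ratio:.1f}")
```

Under these assumptions, storing fewer atoms (or fewer iterations) shrinks the file proportionally, which is why restricting what gets written is an effective lever.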
thanks @ijpulidos, IIRC the |
From discussions in our dev syncs, what we want to do here for now is change the default to NOT store any special atom indices in the analysis .nc files; this should lower the size of the output considerably. This is done in #1185 |
When running perses on a protein-ligand system we've noticed that the hybrid factory of edges is huge:
Instead of just serializing out this object, we need to come up with a data model that contains the essential information associated with a calculation.
Settings for the above calculation:
This is with 0.10.1 : pyha21a80b_1
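As a starting point for discussion, a lightweight data model along these lines could look something like the sketch below. To be clear, this is not an existing perses API: the class name `EdgeSummary`, every field name, and the example values are hypothetical, chosen only to illustrate storing a small, serializable summary instead of the whole hybrid factory object.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import Dict


@dataclass
class EdgeSummary:
    """Hypothetical minimal record for one edge of a calculation."""

    ligand_a: str
    ligand_b: str
    phase: str                      # e.g. "complex" or "solvent"
    n_iterations: int
    free_energy_kT: float
    free_energy_error_kT: float
    settings: Dict[str, str] = field(default_factory=dict)
    software_versions: Dict[str, str] = field(default_factory=dict)

    def to_json(self) -> str:
        # Plain JSON keeps the record small and readable without the original classes.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "EdgeSummary":
        return cls(**json.loads(text))


# Placeholder values for illustration only (not real results).
summary = EdgeSummary(
    ligand_a="lig_0",
    ligand_b="lig_1",
    phase="complex",
    n_iterations=1000,
    free_energy_kT=0.0,
    free_energy_error_kT=0.0,
    settings={"example_setting": "example_value"},
    software_versions={"perses": "0.10.1"},
)

serialized = summary.to_json()
restored = EdgeSummary.from_json(serialized)
```

Because the record only holds plain strings, numbers, and dictionaries, it round-trips through JSON and stays orders of magnitude smaller than a pickled object graph; which fields actually count as "essential" would still need to be agreed on.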