
Improved History recording efficiency - no more exponential run-time increase possibly #275

Open
sniffo opened this issue Nov 6, 2023 · 9 comments

Comments

@sniffo
Contributor

sniffo commented Nov 6, 2023

Hi,

Some people might be interested so I am posting here.

I noticed that increasing the length of a simulation using the EulerSolver caused the total run time to increase exponentially (I don't have test data since I didn't save the run times, but I noticed the trend while tracking them; I will gather some when I have time). After I did some profiling and improved the efficiency of some geometry and gradient methods, the behavior persisted, and I eventually found that the History.record method was to blame.
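For context, here is a minimal standalone sketch (not the tyssue code) of why this happens: calling pd.concat inside the loop copies the entire accumulated frame at every step, so the total work grows quadratically with the number of steps, whereas collecting the pieces and concatenating once stays linear.

```python
import pandas as pd

chunk = pd.DataFrame({"x": range(10)})

# Anti-pattern: pd.concat inside the loop re-copies the whole
# accumulated frame at every iteration (quadratic total work).
acc = chunk.copy()
for _ in range(99):
    acc = pd.concat([acc, chunk], ignore_index=True)

# Fix: collect the pieces and concatenate once at the end.
once = pd.concat([chunk] * 100, ignore_index=True)

assert acc.equals(once)  # same result, far less copying
```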

The method uses pd.concat every time to append the current datasets onto the increasingly large frames in History.datasets. This causes the method to slow down over the course of the simulation, since an ever-larger dataframe is loaded into memory at each step. I've changed History.datasets to a dictionary of dictionaries: it now contains one dictionary per element (vert, edge, face, and cell if necessary), and each element dictionary holds the datasets as key-value pairs, with the time points as keys and the datasets at those time points as values.
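A rough sketch of that layout (the names here are illustrative, not the actual tyssue API): datasets is keyed first by element, then by time point, so recording a step is a constant-time dictionary insert rather than a growing concat.

```python
import pandas as pd

# Illustrative layout: element -> {time point -> DataFrame snapshot}.
datasets = {"vert": {}, "edge": {}, "face": {}}

def record(datasets, time, element_frames):
    """Store a snapshot of each element's DataFrame under its time key
    (O(1) per step; no concat onto a growing frame)."""
    for element, df in element_frames.items():
        datasets[element][time] = df.copy()

# Toy usage: two time points for the "vert" element.
record(datasets, 0.0, {"vert": pd.DataFrame({"x": [0.0, 1.0]})})
record(datasets, 0.1, {"vert": pd.DataFrame({"x": [0.5, 1.5]})})
```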

I then introduced a History instance method, History.update_datasets(), which runs within the EulerSolver.solve() method at the final time point and concatenates all of the datasets within the dictionaries to produce History.datasets as it was implemented previously. This significantly reduces the time to record, and the run time no longer increases exponentially. I have not done extensive testing yet, but all other History methods should still work as intended.
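A hedged sketch of such a final concatenation step (the function name, the "time" column, and the layout are assumptions about the intended shape, not the exact implementation): one pd.concat per element at the end of the run replaces the per-step concats.

```python
import pandas as pd

def update_datasets(dict_of_frames):
    """Concatenate the per-time-point frames once, at the end of the
    simulation, tagging each row with its time point."""
    return {
        element: pd.concat(
            [df.assign(time=t) for t, df in frames_by_time.items()],
            ignore_index=True,
        )
        for element, frames_by_time in dict_of_frames.items()
    }

# Toy usage: two recorded time points for the "vert" element.
history = {"vert": {0.0: pd.DataFrame({"x": [0.0]}),
                    0.1: pd.DataFrame({"x": [0.5]})}}
vert = update_datasets(history)["vert"]
```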

@glyg please let me know if you would like me to add these changes to my PR and any specific tests you'd like to see. I will be updating this issue with run-time tests with different numbers of time-steps.

Thanks,
Tasnif

@sniffo sniffo changed the title Improved History recording efficiency - no more exponential increase possibly Improved History recording efficiency - no more exponential run-time increase possibly Nov 7, 2023
@sniffo
Contributor Author

sniffo commented Nov 7, 2023

Here's an initial comparison for up to 5120 time steps:
[Image: run-time comparison plot for up to 5120 time steps]

@glyg
Member

glyg commented Nov 7, 2023

Hi Tasnif,

Thanks a lot for this contribution. Please add it to the PR, it looks great! Just to make sure: is this transparent for the user (i.e., it does not change the I/O API)?
Thanks again

G

@sniffo
Contributor Author

sniffo commented Nov 7, 2023

It only changes the History class properties, the History.record method, and the EulerSolver.solve method. I've used the to_archive and from_archive methods to save and load datasets, and they work fine. Nothing else was changed.

@glyg
Member

glyg commented Nov 7, 2023

Great :) Please include that in the PR then!

@sniffo
Contributor Author

sniffo commented Nov 7, 2023

Perfect. Will add over the weekend :)

@sniffo
Contributor Author

sniffo commented Nov 7, 2023

I've encountered an issue with this implementation. For some reason, when using HistoryHdf5.from_archive to reload a dataset, the History.datasets returned is empty, yet History.retrieve still works. Not quite sure how that is possible; I will report back once the issue is sorted. It likely has to do with changes to the __init__() method in the base History class.

@sniffo
Contributor Author

sniffo commented Nov 7, 2023

The issue has been fixed; it was the __init__ method. Instead of changing History.datasets to the dictionaries mentioned above, I introduced another attribute, History.dict. New datasets are added to this dictionary in History.record, while History.datasets is left unchanged in the original __init__ method. This allows HistoryHdf5.from_archive to recreate an accurate History object with its datasets. It works well now.
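A minimal sketch of the separation described above (the class is heavily simplified; only the attribute names History.datasets and History.dict come from this thread): datasets keeps its original element-to-DataFrame form so archive loading can rebuild it, while per-step snapshots accumulate in the separate dict attribute.

```python
import pandas as pd

class History:
    """Simplified sketch, not the actual tyssue class."""

    def __init__(self, datasets=None):
        # Unchanged: datasets stays a mapping of element -> DataFrame,
        # so from_archive-style loading can populate it directly.
        self.datasets = datasets if datasets is not None else {}
        # New: per-time-point snapshots accumulate here instead.
        self.dict = {}

    def record(self, time, element_frames):
        """O(1) per step: insert snapshots into self.dict."""
        for element, df in element_frames.items():
            self.dict.setdefault(element, {})[time] = df.copy()

    def update_datasets(self):
        """One concat per element at the end of the run."""
        self.datasets = {
            element: pd.concat(
                [df.assign(time=t) for t, df in frames.items()],
                ignore_index=True,
            )
            for element, frames in self.dict.items()
        }

h = History()
h.record(0.0, {"vert": pd.DataFrame({"x": [0.0]})})
h.record(0.1, {"vert": pd.DataFrame({"x": [0.5]})})
h.update_datasets()
```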

@glyg glyg closed this as completed Jan 10, 2024
@sniffo
Contributor Author

sniffo commented Dec 2, 2024

I have a fork of the up-to-date build implementing these changes, if there is still interest. I can create a pull request if you would like to review the changes and suggest improvements.

Best,
Tasnif

@glyg
Member

glyg commented Dec 4, 2024

Hi @sniffo, yes, please do open a PR.

@glyg glyg reopened this Dec 4, 2024