
[Bug]: Non-serializable PyBOP models and datasets can not be copied or parallelized #642

Open

YannickNoelStephanKuhn opened this issue Jan 28, 2025 · 5 comments · May be fixed by #645

Labels: bug (Something isn't working)

Comments

YannickNoelStephanKuhn commented Jan 28, 2025

Python Version

3.11.0

Describe the bug

PyBOP models and datasets cannot be pickled, which makes them unusable with my current approach to integrating EP-BOLFI into PyBOP. I need to be able to deepcopy them; furthermore, since they are not pickleable, it is impossible to parallelize their evaluation with multiprocessing.
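The failure can be reproduced without PyBOP at all: deepcopy falls back to pickle's reduce protocol for types it has no special handling for, and module objects refuse to be reduced. A minimal stdlib-only sketch, where the Holder class is a hypothetical stand-in for an object carrying a module-typed attribute (as the traceback below suggests the PyBOP models do):

```python
from copy import deepcopy
import math


class Holder:
    """Hypothetical stand-in for an object that stores a module attribute."""

    def __init__(self):
        self.mod = math  # module objects cannot be pickled, so deepcopy fails


try:
    deepcopy(Holder())
except TypeError as exc:
    err = exc  # TypeError: cannot pickle 'module' object
```

A single attribute pointing at a module (or any other unpicklable object) poisons deepcopy and pickle of the whole object graph in this way.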

It's definitely something inside PyBOP, as the original PyBaMM models serialize just fine.

Is that a known limitation, and if so, is it a necessary side-effect of some desired functionality?

Steps to reproduce the behaviour

For the models:

>>> from copy import deepcopy
>>> import pybamm
>>> import pybop
>>> deepcopy(pybamm.lithium_ion.DFN())
<pybamm.models.full_battery_models.lithium_ion.dfn.DFN object at [....]>
>>> deepcopy(pybop.lithium_ion.DFN())

For the datasets:

from copy import deepcopy
import numpy as np
import pybop

model = pybop.lithium_ion.DFN()

t_eval = np.arange(0, 901, 3)
values = model.predict(t_eval=t_eval)

dataset = pybop.Dataset(
    {
        "Time [s]": t_eval,
        "Current function [A]": values["Current [A]"].data,
        "Voltage [V]": values["Voltage [V]"].data,
    }
)
deepcopy(dataset)

Relevant log output

For the models:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "[...]\Python311\Lib\copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "[...]\Python311\Lib\copy.py", line 271, in _reconstruct
    state = deepcopy(state, memo)
            ^^^^^^^^^^^^^^^^^^^^^
  File "[...]\Python311\Lib\copy.py", line 146, in deepcopy
    y = copier(x, memo)
        ^^^^^^^^^^^^^^^
  File "[...]\Python311\Lib\copy.py", line 231, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
                             ^^^^^^^^^^^^^^^^^^^^^
  File "[...]\Python311\Lib\copy.py", line 161, in deepcopy
    rv = reductor(4)
         ^^^^^^^^^^^
TypeError: cannot pickle 'module' object

For the datasets:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "[...]\Python311\Lib\copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "[...]\Python311\Lib\copy.py", line 271, in _reconstruct
    state = deepcopy(state, memo)
            ^^^^^^^^^^^^^^^^^^^^^
  File "[...]\Python311\Lib\copy.py", line 146, in deepcopy
    y = copier(x, memo)
        ^^^^^^^^^^^^^^^
  File "[...]\Python311\Lib\copy.py", line 231, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
                             ^^^^^^^^^^^^^^^^^^^^^
  File "[...]\Python311\Lib\copy.py", line 161, in deepcopy
    rv = reductor(4)
         ^^^^^^^^^^^
TypeError: cannot pickle 'dict_keys' object
@YannickNoelStephanKuhn YannickNoelStephanKuhn added the bug Something isn't working label Jan 28, 2025
@YannickNoelStephanKuhn YannickNoelStephanKuhn changed the title [Bug]: Non-serializable PyBOP models can not be copied or parallelized [Bug]: Non-serializable PyBOP models and datasets can not be copied or parallelized Jan 28, 2025
@BradyPlanden (Member)

Hi Yannick,

This is somewhat known and does need to be fixed. At the moment, we are able to circumvent the problem for the PINTS-based optimisers via their ParallelEvaluator class, but I believe this issue is showing itself in the SciPy optimisers (see #590).

For the moment, you can try using the new_copy method on BaseModel to acquire a deep copy of the PyBOP model class. Let me know if this does the trick for the time being!


YannickNoelStephanKuhn commented Jan 28, 2025

Hi Brady, that would not solve the issue at hand, since I want to deepcopy an optimiser instance. Working around that may just be as much work as fixing the issue, so I'll have a cursory glance at it. Maybe a __deepcopy__ customization will do the trick.
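The `__deepcopy__` customization mentioned here works by copying ordinary state but sharing the unpicklable attributes instead of copying them. A stdlib-only sketch with a hypothetical ModelLike class (not PyBOP's actual implementation; the eventual fix lives in the linked pull request):

```python
from copy import deepcopy
import math


class ModelLike:
    """Hypothetical stand-in for a class holding an unpicklable attribute."""

    def __init__(self):
        self.mod = math          # unpicklable: a module object
        self.params = {"a": 1}   # ordinary, picklable state

    def __deepcopy__(self, memo):
        cls = self.__class__
        new = cls.__new__(cls)
        memo[id(self)] = new  # register early to handle reference cycles
        for key, value in self.__dict__.items():
            if key == "mod":
                new.__dict__[key] = value  # share the module, don't copy it
            else:
                new.__dict__[key] = deepcopy(value, memo)
        return new


original = ModelLike()
twin = deepcopy(original)
twin.params["a"] = 2  # the copy's ordinary state is independent
```

Sharing rather than copying is safe for stateless objects like modules; anything mutable that is shared this way would still be aliased between the copies.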

@YannickNoelStephanKuhn YannickNoelStephanKuhn linked a pull request Jan 28, 2025 that will close this issue
@YannickNoelStephanKuhn (Author)

A deep-dive into Python class handling later, it is done: #645

@YannickNoelStephanKuhn (Author)

There is one issue remaining: whenever a problem gets evaluated, it stops being pickleable afterwards. Deepcopying it before evaluating "solves" that issue, so any parallel implementation might need to have a deepcopy step built in.

It's the following attributes in problem.model._built_model that are not pickleable:
"y0S",
"rhs_eval",
"algebraic_eval",
"rhs_algebraic_eval",
"jac_rhs_eval",
"jac_rhs_action_eval",
"jacp_rhs_eval",
"jacp_algebraic_eval",
"jac_algebraic_action_eval",
"jacp_algebraic_eval",
"jac_rhs_algebraic_eval",
"jac_rhs_algebraic_action_eval",
"jacp_rhs_algebraic_eval",
"casadi_rhs",
"casadi_algebraic",

I didn't catch it initially since it does not occur when just running a PyBaMM model directly. Temporarily setting these attributes to None does not help either, because "rhs_eval", "algebraic_eval", "casadi_rhs" and "casadi_algebraic" are needed for all evaluations after the first one.
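The "unpicklable after first evaluation" pattern can be handled with __getstate__/__setstate__: drop the compiled caches when pickling and rebuild them lazily on the next evaluation. A stdlib-only sketch, where the Problem class and its lambda "compiled function" are hypothetical stand-ins for the CasADi objects listed above (lambdas, like CasADi functions, cannot be pickled):

```python
import pickle


class Problem:
    """Hypothetical stand-in: caches an unpicklable function after first use."""

    def __init__(self):
        self._rhs_eval = None  # filled lazily, like _built_model's caches

    def _build(self):
        # stand-in for the expensive compile step that produces
        # an unpicklable object
        self._rhs_eval = lambda t: 2 * t

    def evaluate(self, t):
        if self._rhs_eval is None:
            self._build()  # rebuild the cache on demand
        return self._rhs_eval(t)

    def __getstate__(self):
        state = self.__dict__.copy()
        state["_rhs_eval"] = None  # drop the unpicklable cache when pickling
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)


p = Problem()
p.evaluate(3)  # triggers the unpicklable cache
clone = pickle.loads(pickle.dumps(p))  # works; cache is rebuilt on next use
```

The trade-off is that each unpickled copy pays the build cost again on its first evaluation, which may matter when spawning many worker processes.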

@YannickNoelStephanKuhn (Author)

#585 now contains an example of how to work around that parallelization issue.
