You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Global variables are notorious for introducing race conditions, which can cause bugs that are hard to reproduce and investigate. They also have shared ownership with multiple objects, thus client code cannot make assumptions about their state.
To illustrate pitfalls from global variables, consider the following example:
This is a simplified version of the pyMBE class that was available up to and including d10d172. The class members units and SEED are bound to the class, not to class instances. You can access them from outside with print(pymbe_library.SEED).
pmb1.SEED=42
pmb2.SEED=43
pmb1.units=<pint.registry.UnitRegistry object at 0x7235c663fa0>
pmb2.units=<pint.registry.UnitRegistry object at 0x7235c663fa0>
True
The SEED members are different, but the unit members are shared between the two objects (same memory address). These members differ in behavior because integers are immutable, while Pint objects are mutable (quick refresher). The rules for an object's mutability and scope can be a bit difficult to grasp, and while Python developers should know about them, we cannot expect them to know how the pyMBE class constructor uses these rules internally to manage the lifetime and ownership of its class members. I would advocate for the principle of least surprise by making these class members fully owned by the class instance, i.e. the constructed object, rather than by the class itself.
This can be achieved without any API change with the following rewrite:
Output of the MWE shows that each object has its own Pint unit registry:
pmb1.SEED=42
pmb2.SEED=43
pmb1.units=<pint.registry.UnitRegistry object at 0x106e81cd03a0>
pmb2.units=<pint.registry.UnitRegistry object at 0x106e81ca7790>
False
The same issue applies to the data frame. What happens if we create two class instances to load different parameter sets? The answer is not straightforward, because pandas sometimes returns copies, sometimes returns views, and sometimes allows data to be modified in-place.
Coming back to the unit system, we have another source of counter-intuitive behavior in pymbe_library.set_reduced_units():
We are creating a new unit registry. Which means we cannot pass a custom temperature as argument, because we end up multiplying two quantities whose unit belong to different registries:
Traceback (most recent call last):
ValueError: Cannot operate with Quantity and Quantity of different registries.
This exception is not necessarily raised by passing custom values to the other function arguments, but it might be raised later in the simulation script... It is simply not possible to safely pass custom values, because line 2528 guarantees that our custom values won't belong to the same registry as the one used by pyMBE.
Deleting line 2528 doesn't help, because when we call units.define(f"reduced_length = {unit_length}"), we end up re-defining the unit. I just found out the hard way how dangerous that is:
importpintunits=pint.UnitRegistry()
unit_length=0.355*units.nmunits.define(f"reduced_length = {unit_length}")
unit_length=units.Quantity(1,'reduced_length')
print(unit_length.to('nm'), "=", unit_length)
unit_length=0.201*units.nmunits.define(f"reduced_length = {unit_length}")
# units._build_cache() # <--- never forget this line when re-defining a custom unit!!!unit_length=units.Quantity(1,'reduced_length')
print(unit_length.to('nm'), "=", unit_length)
It would probably be safer to let the user define their own Pint unit registry, and pass it as a units argument of the pyMBE class constructor. In this way, the correct set up of the unit registry becomes the responsibility of the user rather than ours, and we reduce the risk of making a mistake, like the one I made above when re-defining units, or the one currently in pyMBE where the temperature cannot be set, or the one that used to be in the pyMBE constructor where the energy unit was silently ignored (#55 (comment)). If the user doesn't provide a units, we define a default one.
The text was updated successfully, but these errors were encountered:
Online discussion with @pm-blanco: move the unit registry creation to the class constructor, remove line 2528, and introduce units._build_cache() in set_reduced_units() to rebuild the cache when reduced units are redefined.
Global variables are notorious for introducing race conditions, which can cause bugs that are hard to reproduce and investigate. They also have shared ownership with multiple objects, thus client code cannot make assumptions about their state.
To illustrate pitfalls from global variables, consider the following example:
This is a simplified version of the pyMBE class that was available up to and including d10d172. The class members
units
andSEED
are bound to the class, not to class instances. You can access them from outside withprint(pymbe_library.SEED)
.Here is a MWE:
Output:
The
SEED
members are different, but theunit
members are shared between the two objects (same memory address). These members differ in behavior because integers are immutable, while Pint objects are mutable (quick refresher). The rules for an object's mutability and scope can be a bit difficult to grasp, and while Python developers should know about them, we cannot expect them to know how the pyMBE class constructor uses these rules internally to manage the lifetime and ownership of its class members. I would advocate for the principle of least surprise by making these class members fully owned by the class instance, i.e. the constructed object, rather than by the class itself.This can be achieved without any API change with the following rewrite:
Output of the MWE shows that each object has its own Pint unit registry:
The same issue applies to the data frame. What happens if we create two class instances to load different parameter sets? The answer is not straightforward, because pandas sometimes returns copies, sometimes returns views, and sometimes allows data to be modified in-place.
Coming back to the unit system, we have another source of counter-intuitive behavior in
pymbe_library.set_reduced_units()
:pyMBE/pyMBE.py
Line 2511 in 37491e8
Here is the relevant part of the method body:
pyMBE/pyMBE.py
Lines 2528 to 2544 in 37491e8
We are creating a new unit registry. Which means we cannot pass a custom
temperature
as argument, because we end up multiplying two quantities whose unit belong to different registries:Output:
This exception is not necessarily raised by passing custom values to the other function arguments, but it might be raised later in the simulation script... It is simply not possible to safely pass custom values, because line 2528 guarantees that our custom values won't belong to the same registry as the one used by pyMBE.
Deleting line 2528 doesn't help, because when we call
units.define(f"reduced_length = {unit_length}")
, we end up re-defining the unit. I just found out the hard way how dangerous that is:Output:
The new
reduced_length
value is not used because the cache is out-of-date. See hgrecco/pint#1773 (comment) and the source code inhgrecco/[email protected]:pint/facets/plain/registry.py#L517-L521
for more details. By default it should raise a warning, but in my environment (Python 3.10.12, Pint 0.23), the warnings don't get displayed in the terminal.It would probably be safer to let the user define their own Pint unit registry, and pass it as a
units
argument of the pyMBE class constructor. In this way, the correct set up of the unit registry becomes the responsibility of the user rather than ours, and we reduce the risk of making a mistake, like the one I made above when re-defining units, or the one currently in pyMBE where the temperature cannot be set, or the one that used to be in the pyMBE constructor where the energy unit was silently ignored (#55 (comment)). If the user doesn't provide aunits
, we define a default one.The text was updated successfully, but these errors were encountered: