Fix module initialization for root module under Zero3 #33632
+2
−0
What does this PR do?
Fixes #28808
Behaviour:
`loaded_keys = {k.replace(f"{module_name}.", "") for k in state_dict_keys if k.startswith(f"{module_name}.")}`
in Trainer.py misses the case where `module_name` is the empty string (the root module), because it then looks for state_dict_keys that start with `"."`. As a result, the root module (CLIPModel, AlignModel, etc.) always has `model.apply(model._initialize_weights)` called on it. Under Zero3 this leads to initialization of an empty parameter tensor in modeling_align.py, even though that tensor doesn't need to be initialized because it will be replaced later. This crashes in modeling_align.py and not in other models like CLIP because ALIGN uses `xavier_uniform_`, which requires a tensor with at least 2 dimensions, whereas CLIP uses `nn.init.normal_`, which just returns another empty tensor.
Fix:
Use the whole set of state_dict keys as the root module's `loaded_keys`.
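To make the behaviour and the fix concrete, here is a minimal standalone sketch in plain Python. It is not the actual diff: the helper names `loaded_keys_for` and `loaded_keys_for_fixed` and the example key names are hypothetical, and only the set comprehension is taken from the description above.

```python
def loaded_keys_for(module_name, state_dict_keys):
    """Filter quoted in the description: keep keys under `module_name.` and strip the prefix."""
    prefix = f"{module_name}."
    return {k.replace(prefix, "") for k in state_dict_keys if k.startswith(prefix)}


def loaded_keys_for_fixed(module_name, state_dict_keys):
    """Hypothetical version of the fix: the root module (empty name) owns every key."""
    if module_name == "":
        return set(state_dict_keys)
    return loaded_keys_for(module_name, state_dict_keys)


# Made-up key names, for illustration only.
state_dict_keys = [
    "logit_scale",
    "text_model.embeddings.weight",
    "vision_model.proj.weight",
]

# Root module: the prefix becomes ".", which no key starts with, so the set is empty
# and the root module looks as if none of its parameters were loaded.
print(loaded_keys_for("", state_dict_keys))

# A named submodule behaves as intended.
print(loaded_keys_for("text_model", state_dict_keys))

# With the special case, the root module sees the whole state dict.
print(loaded_keys_for_fixed("", state_dict_keys))
```

With an empty set of loaded keys, the root module appears to have nothing loaded, which matches the description of why it always gets initialized.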
Please let me know if there is anything I should revise, or if this should be fixed in another way. 😄
@ArthurZucker @muellerzr