
Conversion of Megatron-LM checkpoint to HF Transformers checkpoint fails (ALiBi used during training) #21

gagangayari opened this issue Sep 26, 2023 · 0 comments


I have a Megatron-LM checkpoint trained using ALiBi. Since ALiBi doesn't add positional embeddings, my checkpoint doesn't contain them either.

When converting my checkpoint to an HF Transformers checkpoint using src/transformers/models/megatron_gpt_bigcode/checkpoint_reshaping_and_interoperability.py, I get the error below:

AttributeError: 'dict' object has no attribute 'to'

I believe this is because the function get_element_from_dict_by_path is not consistent in its return type.
[Screenshot: the get_element_from_dict_by_path implementation]

It returns the positional embeddings (a tensor) when they are present in the checkpoint, but an empty dictionary when they are not, as in my case.
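For context, the helper builds empty dicts along the requested path when keys are missing, so an absent positional-embedding entry comes back as {} rather than a tensor. A minimal sketch of that behavior (the actual implementation in the script may differ in details):

```python
def get_element_from_dict_by_path(d, path):
    # Walk the dotted path, creating empty dicts for missing keys.
    # An absent path therefore returns {} instead of raising,
    # and {} has no .to() method -- hence the AttributeError above.
    for k in path.split("."):
        if k not in d:
            d[k] = {}
        d = d[k]
    return d
```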

The failure then occurs at line 412, where the script tries to convert the data type of this function's output by calling .to() on it.

[Screenshot: line 412 of the conversion script]
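A possible fix is to guard the .to() call so it only runs when a tensor was actually found. A minimal sketch, assuming illustrative names (pos_embeddings, tp_state_dicts, output_state_dict, dtype, and the key path are placeholders, not necessarily the exact ones used at line 412):

```python
import torch

# Hypothetical guard around the failing call at line 412.
pos_embeddings = get_element_from_dict_by_path(
    tp_state_dicts[0], "model.language_model.embedding.position_embeddings.weight"
)

# For ALiBi-trained checkpoints the path is absent and the helper
# returns an empty dict, so only convert and store an actual tensor.
if isinstance(pos_embeddings, torch.Tensor):
    output_state_dict["transformer.wpe.weight"] = pos_embeddings.to(dtype)
```

With a guard like this, ALiBi checkpoints (which legitimately have no positional embeddings) would simply skip the weight instead of crashing.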

Can we add support for checkpoints trained using ALiBi?
