System Info
Latest transformers version (installed from source), Python 3.10
Who can help?
@ArthurZ
Information

Tasks

An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)

Reproduction
```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "answerdotai/ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id).to("cuda")

# Create a simple input
inputs = {
    "input_ids": torch.randint(0, 1000, (1, 10)).cuda(),
    "attention_mask": torch.ones(1, 10).cuda(),
}

# Set to train mode and check all parameters
model.train()
for name, param in model.named_parameters():
    print(f"{name}: requires_grad = {param.requires_grad}")

# Do forward pass
outputs = model(**inputs)
print("\nOutput logits requires_grad:", outputs.logits.requires_grad)
print("Output logits grad_fn:", outputs.logits.grad_fn)
```
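
One possible mechanism worth ruling out (an assumption on my part, not a confirmed diagnosis of ModernBERT): `requires_grad = True` on the parameters is not sufficient by itself. If any code in the forward path runs under `torch.no_grad()` or `torch.inference_mode()`, the output is detached regardless of the parameter flags. A toy stand-in (plain `nn.Linear`, not ModernBERT code) illustrates this:

```python
import torch
import torch.nn as nn

# Toy stand-in layer: every parameter requires grad.
layer = nn.Linear(10, 10)
assert all(p.requires_grad for p in layer.parameters())

x = torch.randn(1, 10)

out = layer(x)
print(out.requires_grad)   # True: grad mode is on and params require grad

with torch.no_grad():      # a no_grad/inference_mode context anywhere in
    out = layer(x)         # the forward path detaches the output
print(out.requires_grad)   # False, even though param.requires_grad is True
```

So the symptom above is consistent with a grad-disabling context being entered somewhere inside the model's forward.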
Expected behavior
When I run this, the output is:

```
Output logits requires_grad: False
Output logits grad_fn: None
```

despite every parameter explicitly having `requires_grad = True`; the loop above confirms that all parameters are correctly set.

As a sanity check, I ran the same code with `model_id = "bert-base-uncased"` and got:

```
Output logits requires_grad: True
Output logits grad_fn: <ViewBackward0 object at 0x7f0ca6abf370>
```

So it's definitely a ModernBERT-specific problem!
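
To localize where the graph is lost, one generic diagnostic (my own sketch, not part of the repro or of the transformers API) is to register a forward hook on every submodule and report those whose floating-point outputs carry no `grad_fn`; the first name listed is where autograd detaches:

```python
import torch
import torch.nn as nn

def find_detach_points(model, *args, **kwargs):
    """Return names of submodules whose output tensors have no grad_fn."""
    detached = []

    def tensors_of(out):
        if torch.is_tensor(out):
            return [out]
        if isinstance(out, (tuple, list)):
            return [t for t in out if torch.is_tensor(t)]
        return []

    def make_hook(name):
        def hook(module, inputs, output):
            outs = [t for t in tensors_of(output) if t.is_floating_point()]
            if outs and all(t.grad_fn is None for t in outs):
                detached.append(name)
        return hook

    handles = [m.register_forward_hook(make_hook(n))
               for n, m in model.named_modules() if n]
    try:
        model(*args, **kwargs)
    finally:
        for h in handles:
            h.remove()
    return detached

# Demo on a toy model with a deliberate detach() in the middle:
class Detach(nn.Module):
    def forward(self, x):
        return x.detach()

toy = nn.Sequential(nn.Linear(4, 4), Detach(), nn.Linear(4, 4))
print(find_detach_points(toy, torch.randn(2, 4)))  # ['1']
```

On the repro above it would be called as `find_detach_points(model, **inputs)` (the helper name is hypothetical, chosen here for illustration).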