@cassanof Do you have a script that replicates this error? I'm not able to reproduce it with the same recipe. If not, could you give a more detailed stack trace with the argument types to tex.fused_amax_and_scale_update_after_reduction?
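One way to collect the argument types without sharing the model is to wrap the failing callable so that it prints a description of every argument when it raises. The sketch below is generic Python; `describe` and `log_arg_types` are hypothetical helper names, not part of Transformer Engine, and the attribute probing (`dtype`, `shape`, `device`) simply assumes tensor-like arguments.

```python
import functools


def describe(value):
    """Describe one argument: its type, plus tensor-like metadata when present."""
    if isinstance(value, (list, tuple)):
        return f"{type(value).__name__}[{', '.join(describe(v) for v in value)}]"
    # Tensors expose dtype/shape/device; plain Python objects do not.
    meta = [f"{a}={getattr(value, a)}" for a in ("dtype", "shape", "device") if hasattr(value, a)]
    name = type(value).__name__
    return f"{name}({', '.join(meta)})" if meta else name


def log_arg_types(fn, sink=print):
    """Wrap fn so that argument descriptions are emitted only if the call raises."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception:
            for i, arg in enumerate(args):
                sink(f"arg[{i}]: {describe(arg)}")
            for key, val in kwargs.items():
                sink(f"kwarg[{key}]: {describe(val)}")
            raise  # keep the original traceback
    return wrapper
```

To use it, one could rebind the function on whichever module object `tex` refers to in the installed version, for example by assigning `log_arg_types(...)` of the original back to the same attribute before running the forward pass. This only works if the callers look the function up through that module at call time, which is an assumption to verify.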
Hi! Unfortunately I cannot share it, and I wasn't able to repro with some of the open models. The arguments are a long list of different tensors.
In the end, I was able to get amax scaling to work by completely disabling the fused kernel in your code and using the non-fused one instead. This is obviously undesirable, though.
Currently getting the following error on a simple forward with a transformer model when using DelayedScaling:
The recipe is quite simple:
te_recipe.DelayedScaling(te_recipe.Format.HYBRID, amax_history_len=64, amax_compute_algo="max")
If I omit the recipe from the autocast context, the forward works as expected. Any ideas?
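For readers unfamiliar with what this recipe asks for, here is a rough pure-Python sketch of delayed scaling with the "max" algorithm: each tensor keeps a window of recent absolute maxima (amax), and the scale for the next step is derived from the largest value in that window. It is an illustration under stated assumptions, not Transformer Engine's implementation: the `DelayedScaler` class is hypothetical, and the real library does this on the GPU (in the fused kernel discussed in this thread). With `Format.HYBRID`, E4M3 is used for the forward pass and E5M2 for gradients in the backward pass.

```python
from collections import deque

# Largest representable magnitudes of the two FP8 formats.
FP8_MAX = {"E4M3": 448.0, "E5M2": 57344.0}


class DelayedScaler:
    """Hypothetical per-tensor scaler illustrating amax_compute_algo="max"."""

    def __init__(self, fp8_max, amax_history_len=64, margin=0):
        self.fp8_max = fp8_max
        self.margin = margin
        # Oldest entries fall out once the window is full.
        self.history = deque(maxlen=amax_history_len)
        self.scale = 1.0

    def update(self, values):
        """Record this step's amax and return the scale to use on the next step."""
        self.history.append(max(abs(v) for v in values))
        amax = max(self.history)  # "max" algorithm: largest amax in the window
        if amax > 0.0:  # keep the previous scale if everything seen was zero
            self.scale = self.fp8_max / amax / (2.0 ** self.margin)
        return self.scale


scaler = DelayedScaler(FP8_MAX["E4M3"], amax_history_len=2)
print(scaler.update([1.0, -2.0]))  # window max is 2.0
print(scaler.update([0.5]))        # 2.0 is still inside the window
print(scaler.update([0.25]))       # 2.0 has been evicted, window max is 0.5
```

Because the scale lags the data by one step and depends on the whole window, a single bad amax entry (for example a NaN or a mismatched dtype in the history buffer) can affect up to `amax_history_len` subsequent steps.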