Dear authors, thanks for your great work. I tried the method on Qwen2.5-7B-Instruct, but the model's performance drops drastically. I tuned the L1 regularization weight over the values [0.05, 0.1, 0.15, 0.2, 0.5, 1], but none of them yields a satisfactory pattern. I wonder whether you have tried other model architectures beyond those mentioned in the paper. Do you have suggestions on how to generalize this method to other models?
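For reference, this is the kind of sweep I ran. This is a hypothetical toy sketch, not the paper's code: `l1_penalty`, `total_loss`, the toy `params`, and the fixed `task_loss` are all stand-ins I made up to show how the penalty term scales with the weight `lam`.

```python
# Hypothetical sketch of an L1-weight sweep (not the authors' implementation).

def l1_penalty(params):
    # L1 norm: sum of absolute values of the parameters.
    return sum(abs(p) for p in params)

def total_loss(task_loss, params, lam):
    # Regularized objective: task loss plus lam-weighted L1 penalty.
    return task_loss + lam * l1_penalty(params)

params = [0.5, -1.2, 0.0, 3.1]  # toy stand-in for model weights
task_loss = 2.0                  # toy stand-in for the unregularized loss

for lam in [0.05, 0.1, 0.15, 0.2, 0.5, 1.0]:
    print(f"lam={lam}: total={total_loss(task_loss, params, lam):.3f}")
```

Even across this range, the penalty either barely moves the objective or dominates it, which matches the drastic performance drop I observed.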