I observed similar behavior with both Evo and StripedHyena: the model loads successfully, but the server crashes as soon as I try to run a simple inference.
Compute resource: Databricks cluster on Azure with an NVIDIA A100
Packages: flash-fft-conv and flash-attention are installed correctly, so the model loads without problems
Code for StripedHyena:
import sys

import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_name = "togethercomputer/StripedHyena-Hessian-7B"

# Load the tokenizer; the model ships custom code, hence trust_remote_code.
tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    model_max_length=sys.maxsize,
    trust_remote_code=True,
)
tokenizer.pad_token = tokenizer.eos_token

config = AutoConfig.from_pretrained(model_name, trust_remote_code=True)
config.use_cache = True

# Load the model and move it to the GPU.
device = torch.device("cuda")
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    config=config,
    trust_remote_code=True,
).to(device)

input_text = "Question: How many hours in one day? Answer: "
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(device)

# The Python kernel crashes on this call.
model.generate(input_ids)
Both the StripedHyena snippet and the corresponding Evo snippet crashed the Python kernel at the last line.
I was not sure whether the issue was caused by the configuration of my Databricks resources, so I also tried other models of the same size, e.g. "HuggingFaceH4/zephyr-7b-beta", and inference worked without any problem. I do not know whether there is some other incompatibility between StripedHyena and Databricks, though.
Has anyone else encountered this problem?
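Since the kernel dies without a Python traceback, one thing that might help narrow this down is the standard-library `faulthandler` module, which dumps the Python stack when the process receives a fatal signal such as a segfault. The following is only a sketch: the log path `FAULT_LOG_PATH` is a hypothetical location, and writing to a file rather than `sys.stderr` is an assumption made because notebook environments often replace `sys.stderr` with an object that has no file descriptor.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location (an assumption); any writable path would do.
FAULT_LOG_PATH = os.path.join(tempfile.gettempdir(), "generate_fault.log")

# Keep the file object referenced for as long as faulthandler is enabled.
fault_log = open(FAULT_LOG_PATH, "w")

# Dump the Python stack of all threads to the log on a fatal signal.
faulthandler.enable(file=fault_log, all_threads=True)

# Run the crashing call after this point, e.g. model.generate(input_ids),
# then inspect FAULT_LOG_PATH once the kernel has restarted.
```

If the crash originates in a CUDA or C++ extension, the log would only show the last Python frame that was executing, but that is usually enough to tell which module triggered it.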
Sorry for the late reply. I get a KeyError: 'stripedhyena' even before inference can start. Could you share which versions of the libraries you are using?
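A quick way to collect the versions in one go is sketched below, using the standard-library `importlib.metadata`. The helper name `installed_versions` is made up for illustration, and the distribution names in `PACKAGES` are assumptions (in particular `flash-attn` and `flashfftconv` may be registered under different names depending on how they were installed); packages that cannot be found are reported as "not installed" rather than raising.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Return a {distribution name: version string} mapping.

    Distributions that are not found are mapped to "not installed".
    """
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = "not installed"
    return result


# Assumed distribution names; adjust to match your environment.
PACKAGES = ["torch", "transformers", "flash-attn", "flashfftconv"]

for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver}")
```

Pasting that output into the thread, together with the Python and CUDA versions, should make it easier to compare environments.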