System Info
Model: Llama-3.2-11B-Vision-Instruct
Using via huggingface?: yes
OS: Linux
GPU VRAM: 81920MB
Number of GPUs: 1
GPU Make: Nvidia
Information
The official example scripts
My own modified scripts
🐛 Describe the bug
When running batched inference with multiple samples, the generated outputs are almost all identical. I'm following the official example, where each sample consists of an image and a question, and I've set do_sample=False. Only the answer to the first question is correct; the answers to all the other questions are meaningless and identical to each other.
In the case of single-modal (text-only) batch inference, everything works as expected.
Please let me know how to solve this issue. Thank you very much!
import torch
from PIL import Image
from transformers import MllamaForConditionalGeneration, AutoProcessor

model_id = "/checkpoint/Llama-3.2-11B-Vision-Instruct"
model = MllamaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id, padding_side="left")

url = "llama3.2/rabbit.jpg"
image = Image.open(url)
image = image.resize((560, 560))

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "If I had to write a haiku for this one, it would be: "},
        ],
    }
]

# Build a batch of 10 identical (image, prompt) pairs
texts = [
    processor.apply_chat_template(messages, add_generation_prompt=True)
    for _ in range(10)
]
images = [image for _ in range(10)]

inputs = processor(images, texts, return_tensors="pt", padding=True).to(model.device)

output = model.generate(**inputs, max_new_tokens=100, do_sample=False)
prompt_len = inputs.input_ids.shape[-1]
generated_ids = output[:, prompt_len:]
generated_text = processor.batch_decode(
    generated_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(generated_text)
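As a quick sanity check (a sketch added here for illustration, assuming the processor output exposes input_ids and pixel_values as used above), every row of the processed batch should be identical, since all ten samples use the same prompt and image:

# Sanity-check sketch: all ten samples share the same prompt and image,
# so the processed rows should be identical across the batch dimension.
print(inputs.input_ids.shape)
print(torch.equal(inputs.input_ids[0], inputs.input_ids[1]))        # expected: True
print(torch.equal(inputs.pixel_values[0], inputs.pixel_values[1]))  # expected: True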
Error logs
["Here is a haiku for the image:\n\nA rabbit in a blue coat\nStands on a dirt path, so sweet\nSpringtime's gentle delight", 'I\'d love to see it! Go ahead and share your haiku about the beloved rabbit from the classic children\'s tale.\n\n(And if you\'d like, I can try writing one too, based on my understanding of the character and the story of "The Tale of Peter Rabbit" by\xa0—\xa0or\xa0—\xa0…)\n\n(Also, I\'ll be sure to keep my response in mind: haikus are traditionally short, so I\'ll keep it brief and sweet!)', 'I\'d love to see it! Go ahead and share your haiku about the beloved rabbit from the classic children\'s tale.\n\n(And if you\'d like, I can try writing one too, based on my understanding of the character and the story of "The Tale of Peter Rabbit" by\xa0—\xa0or\xa0—\xa0…)\n\n(Also, I\'ll be sure to keep my response in mind: haikus are traditionally short, so I\'ll keep it brief and sweet!)', 'I\'d love to see it! Go ahead and share your haiku about the beloved rabbit from the classic children\'s tale.\n\n(And if you\'d like, I can try writing one too, based on my understanding of the character and the story of "The Tale of Peter Rabbit" by\xa0—\xa0or\xa0—\xa0…)\n\n(Also, I\'ll be sure to keep my response in mind: haikus are traditionally short, so I\'ll keep it brief and sweet!)', 'I\'d love to see it! Go ahead and share your haiku about the beloved rabbit from the classic children\'s tale.\n\n(And if you\'d like, I can try writing one too, based on my understanding of the character and the story of "The Tale of Peter Rabbit" by\xa0—\xa0or\xa0—\xa0…)\n\n(Also, I\'ll be sure to keep my response in mind: haikus are traditionally short, so I\'ll keep it brief and sweet!)', 'I\'d love to see it! Go ahead and share your haiku about the beloved rabbit from the classic children\'s tale.\n\n(And if you\'d like, I can try writing one too, based on my understanding of the character and the story of "The Tale of Peter Rabbit" by\xa0—\xa0or\xa0—\xa0…)\n\n(Also, I\'ll be sure to keep my response in mind: haikus are traditionally short, so I\'ll keep it brief and sweet!)', 'I\'d love to see it! Go ahead and share your haiku about the beloved rabbit from the classic children\'s tale.\n\n(And if you\'d like, I can try writing one too, based on my understanding of the character and the story of "The Tale of Peter Rabbit" by\xa0—\xa0or\xa0—\xa0…)\n\n(Also, I\'ll be sure to keep my response in mind: haikus are traditionally short, so I\'ll keep it brief and sweet!)', 'I\'d love to see it! Go ahead and share your haiku about the beloved rabbit from the classic children\'s tale.\n\n(And if you\'d like, I can try writing one too, based on my understanding of the character and the story of "The Tale of Peter Rabbit" by\xa0—\xa0or\xa0—\xa0…)\n\n(Also, I\'ll be sure to keep my response in mind: haikus are traditionally short, so I\'ll keep it brief and sweet!)', 'I\'d love to see it! Go ahead and share your haiku about the beloved rabbit from the classic children\'s tale.\n\n(And if you\'d like, I can try writing one too, based on my understanding of the character and the story of "The Tale of Peter Rabbit" by\xa0—\xa0or\xa0—\xa0…)\n\n(Also, I\'ll be sure to keep my response in mind: haikus are traditionally short, so I\'ll keep it brief and sweet!)', 'I\'d love to see it! 
Go ahead and share your haiku about the beloved rabbit from the classic children\'s tale.\n\n(And if you\'d like, I can try writing one too, based on my understanding of the character and the story of "The Tale of Peter Rabbit" by\xa0—\xa0or\xa0—\xa0…)\n\n(Also, I\'ll be sure to keep my response in mind: haikus are traditionally short, so I\'ll keep it brief and sweet!)']
Expected behavior
I expect each result from batch inference to match the result obtained by running inference individually on the same sample. Thank you!
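For reference, the individual results I am comparing against come from running each sample on its own, along these lines (a sketch that reuses the model, processor, image, and messages defined in the snippet above):

# Reference sketch: run each sample individually instead of as one batch,
# reusing model, processor, image, and messages from the snippet above.
single_results = []
for _ in range(10):
    text = processor.apply_chat_template(messages, add_generation_prompt=True)
    single_inputs = processor([image], [text], return_tensors="pt").to(model.device)
    out = model.generate(**single_inputs, max_new_tokens=100, do_sample=False)
    prompt_len = single_inputs.input_ids.shape[-1]
    single_results.append(
        processor.batch_decode(out[:, prompt_len:], skip_special_tokens=True)[0]
    )
print(single_results)  # each entry should match the corresponding batched output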