In the latest version of transformers (4.49.0), a matrix multiplication dtype error is encountered #36571
Comments
Hi @idebroy, this code is too long for us to really figure out what's going on! Can you try to make some minimal code that reproduces the issue?
This is a short version of the code:

```python
css = """..."""
DESCRIPTION = "## lStation txt2Img🥠"
examples = [...]
DEFAULT_MODEL_ID = "SG161222/RealVisXL_V5.0_Lightning"
MAX_IMAGE_SIZE = int(os.getenv("MAX_IMAGE_SIZE", "4096"))
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

def load_and_prepare_model(model_id): ...

# Load the default model once
model = load_and_prepare_model(DEFAULT_MODEL_ID)

def save_image(img): ...

def randomize_seed_fn(seed: int, randomize_seed: bool) -> int: ...

@spaces.GPU(duration=60, enable_queue=True)
def generate(...): ...

with gr.Blocks(css=css) as demo:
    ...

demo.queue(max_size=50).launch(show_api=True)
```
The failure always occurs while generating the image with the latest transformers release, inside the `generate` function shown in the full code below.
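For what it's worth, a much smaller sketch that exercises only the failing path (the second SDXL text encoder's projection layer, per the traceback below) might look like the following. This is a hypothetical reproduction, not code from the report; the prompt is a placeholder, and it assumes a CUDA machine and the standard SDXL repo layout with `tokenizer_2`/`text_encoder_2` subfolders:

```python
# Hypothetical minimal reproduction sketch (not from the report): load the
# SDXL secondary text encoder in float16 and run one forward pass, which is
# the path that raises the dtype error in the traceback below.
import torch
from transformers import CLIPTokenizer, CLIPTextModelWithProjection

repo = "SG161222/RealVisXL_V5.0_Lightning"  # same checkpoint the Space uses
tokenizer = CLIPTokenizer.from_pretrained(repo, subfolder="tokenizer_2")
text_encoder = CLIPTextModelWithProjection.from_pretrained(
    repo, subfolder="text_encoder_2", torch_dtype=torch.float16
).to("cuda")

ids = tokenizer(["a test prompt"], padding="max_length", return_tensors="pt").input_ids
out = text_encoder(ids.to("cuda"), output_hidden_states=True)
print(out.text_embeds.dtype)  # expected torch.float16; 4.49.0 reportedly hits float != c10::Half here
```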
System Info
transformers version: 4.49.0
Python version: 3.10
Environment: Hugging Face Spaces
Works in: 4.48.3
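Until the regression is resolved, a stopgap (an assumption on my part, not maintainer guidance) is to pin the last known-good release in the Space's requirements.txt:

```
transformers==4.48.3
```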
The following Hugging Face Space code works in 4.48.3 but fails in 4.49.0.
Code:
```python
import os
import random
import uuid
import gradio as gr
import numpy as np
from PIL import Image
import spaces
import torch
from diffusers import StableDiffusionXLPipeline, EulerAncestralDiscreteScheduler
from typing import Tuple

css = '''
.gradio-container{max-width: 575px !important}
h1{text-align:center}
footer {
    visibility: hidden
}
'''

DESCRIPTIONXX = """## lStation txt2Img🥠"""

examples = [
    "A tiny reptile hatching from an egg on the mars, 4k, planet theme, --style raw5 --v 6.0",
    "An anime-style illustration of a delicious, rice biryani with curry and chilli pickle --style raw5",
    "Iced tea in a cup --ar 85:128 --v 6.0 --style raw5, 4K, Photo-Realistic",
    "A zebra holding a sign that says Welcome to Zoo --ar 85:128 --v 6.0 --style raw",
    "A splash page of Spiderman swinging through a futuristic cityscape filled with flying cars, the scene depicted in a vibrant 3D rendered Marvel comic art style.--style raw5, 4K, Photo-Realistic"
]

MODEL_OPTIONS = {
    "LIGHTNING V5.0": "SG161222/RealVisXL_V5.0_Lightning",
    "LIGHTNING V4.0": "SG161222/RealVisXL_V4.0_Lightning",
}

MAX_IMAGE_SIZE = int(os.getenv("MAX_IMAGE_SIZE", "4096"))
USE_TORCH_COMPILE = os.getenv("USE_TORCH_COMPILE", "0") == "1"
ENABLE_CPU_OFFLOAD = os.getenv("ENABLE_CPU_OFFLOAD", "0") == "1"
BATCH_SIZE = int(os.getenv("BATCH_SIZE", "1"))

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

style_list = [
    {
        "name": "3840 x 2160",
        "prompt": "hyper-realistic 8K image of {prompt}. ultra-detailed, lifelike, high-resolution, sharp, vibrant colors, photorealistic",
        "negative_prompt": "cartoonish, low resolution, blurry, simplistic, abstract, deformed, ugly",
    },
    {
        "name": "2560 x 1440",
        "prompt": "hyper-realistic 4K image of {prompt}. ultra-detailed, lifelike, high-resolution, sharp, vibrant colors, photorealistic",
        "negative_prompt": "cartoonish, low resolution, blurry, simplistic, abstract, deformed, ugly",
    },
    {
        "name": "HD+",
        "prompt": "hyper-realistic 2K image of {prompt}. ultra-detailed, lifelike, high-resolution, sharp, vibrant colors, photorealistic",
        "negative_prompt": "cartoonish, low resolution, blurry, simplistic, abstract, deformed, ugly",
    },
    {
        "name": "Style Zero",
        "prompt": "{prompt}",
        "negative_prompt": "",
    },
]

styles = {k["name"]: (k["prompt"], k["negative_prompt"]) for k in style_list}
DEFAULT_STYLE_NAME = "3840 x 2160"
STYLE_NAMES = list(styles.keys())

def apply_style(style_name: str, positive: str, negative: str = "") -> Tuple[str, str]:
    if style_name in styles:
        p, n = styles.get(style_name, styles[DEFAULT_STYLE_NAME])
    else:
        p, n = styles[DEFAULT_STYLE_NAME]

    if not negative:
        negative = ""
    return p.replace("{prompt}", positive), n + negative

def load_and_prepare_model(model_id):
    pipe = StableDiffusionXLPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
        use_safetensors=True,
        add_watermarker=False,
    ).to(device)
    pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)

    if USE_TORCH_COMPILE:
        pipe.compile()

    if ENABLE_CPU_OFFLOAD:
        pipe.enable_model_cpu_offload()

    return pipe

# Preload and compile both models
models = {key: load_and_prepare_model(value) for key, value in MODEL_OPTIONS.items()}

MAX_SEED = np.iinfo(np.int32).max

def save_image(img):
    unique_name = str(uuid.uuid4()) + ".png"
    img.save(unique_name)
    return unique_name

def randomize_seed_fn(seed: int, randomize_seed: bool) -> int:
    if randomize_seed:
        seed = random.randint(0, MAX_SEED)
    return seed

@spaces.GPU(duration=60, enable_queue=True)
def generate(
    model_choice: str,
    prompt: str,
    negative_prompt: str = "extra limbs, extra fingers, extra toes, unnatural proportions, distorted anatomy, disjointed limbs, mutated body parts, broken bones, oversized limbs, unrealistic muscles, merged faces, extra eyes, floating features, disfigured hands, incorrect joint placement, missing parts, blurry details, asymmetrical body structure, glitched textures",
    use_negative_prompt: bool = False,
    style_selection: str = DEFAULT_STYLE_NAME,
    seed: int = 1,
    width: int = 1024,
    height: int = 1024,
    guidance_scale: float = 3,
    num_inference_steps: int = 25,
    randomize_seed: bool = False,
    use_resolution_binning: bool = True,
    num_images: int = 1,
    progress=gr.Progress(track_tqdm=True),
):
    global models
    pipe = models[model_choice]

    seed = int(randomize_seed_fn(seed, randomize_seed))
    generator = torch.Generator(device=device).manual_seed(seed)

    prompt, negative_prompt = apply_style(style_selection, prompt, negative_prompt)

    options = {
        "prompt": [prompt] * num_images,
        "negative_prompt": [negative_prompt] * num_images if use_negative_prompt else None,
        "width": width,
        "height": height,
        "guidance_scale": guidance_scale,
        "num_inference_steps": num_inference_steps,
        "generator": generator,
        "output_type": "pil",
    }

    if use_resolution_binning:
        options["use_resolution_binning"] = True

    images = []
    for i in range(0, num_images, BATCH_SIZE):
        batch_options = options.copy()
        batch_options["prompt"] = options["prompt"][i:i + BATCH_SIZE]
        if "negative_prompt" in batch_options:
            batch_options["negative_prompt"] = options["negative_prompt"][i:i + BATCH_SIZE]
        images.extend(pipe(**batch_options).images)

    image_paths = [save_image(img) for img in images]

    return image_paths, seed

with gr.Blocks(css=css, theme="bethecloud/storj_theme") as demo:
    gr.Markdown(DESCRIPTIONXX)
    with gr.Row():
        prompt = gr.Text(
            label="Prompt",
            show_label=False,
            max_lines=1,
            placeholder="Enter your prompt",
            container=False,
        )
        run_button = gr.Button("Run", scale=0)
    result = gr.Gallery(label="Result", columns=1, show_label=False)

    with gr.Row():
        model_choice = gr.Dropdown(
            label="Model Selection⬇️",
            choices=list(MODEL_OPTIONS.keys()),
            value="LIGHTNING V5.0"
        )

    with gr.Accordion("Advanced options", open=False, visible=True):
        style_selection = gr.Radio(
            show_label=True,
            container=True,
            interactive=True,
            choices=STYLE_NAMES,
            value=DEFAULT_STYLE_NAME,
            label="Quality Style",
        )
        num_images = gr.Slider(
            label="Number of Images",
            minimum=1,
            maximum=5,
            step=1,
            value=1,
        )
        with gr.Row():
            with gr.Column(scale=1):
                use_negative_prompt = gr.Checkbox(label="Use negative prompt", value=True)
                negative_prompt = gr.Text(
                    label="Negative prompt",
                    max_lines=5,
                    lines=4,
                    placeholder="Enter a negative prompt",
                    value="(deformed, distorted, disfigured:1.3), poorly drawn, bad anatomy, wrong anatomy, extra limb, missing limb, floating limbs, (mutated hands and fingers:1.4), disconnected limbs, mutation, mutated, ugly, disgusting, blurry, amputation",
                    visible=True,
                )
        seed = gr.Slider(
            label="Seed",
            minimum=0,
            maximum=MAX_SEED,
            step=1,
            value=0,
        )
        randomize_seed = gr.Checkbox(label="Randomize seed", value=True)
        with gr.Row():
            width = gr.Slider(
                label="Width",
                minimum=512,
                maximum=MAX_IMAGE_SIZE,
                step=8,
                value=1024,
            )
            height = gr.Slider(
                label="Height",
                minimum=512,
                maximum=MAX_IMAGE_SIZE,
                step=8,
                value=1024,
            )
        with gr.Row():
            guidance_scale = gr.Slider(
                label="Guidance Scale",
                minimum=0.1,
                maximum=6,
                step=0.1,
                value=3.0,
            )
            num_inference_steps = gr.Slider(
                label="Number of inference steps",
                minimum=1,
                maximum=60,
                step=1,
                value=28,
            )
    gr.Examples(
        examples=examples,
        inputs=prompt,
        cache_examples=False
    )

    use_negative_prompt.change(
        fn=lambda x: gr.update(visible=x),
        inputs=use_negative_prompt,
        outputs=negative_prompt,
        api_name=False,
    )

    gr.on(
        triggers=[
            prompt.submit,
            negative_prompt.submit,
            run_button.click,
        ],
        fn=generate,
        inputs=[
            model_choice,
            prompt,
            negative_prompt,
            use_negative_prompt,
            style_selection,
            seed,
            width,
            height,
            guidance_scale,
            num_inference_steps,
            randomize_seed,
            num_images,
        ],
        outputs=[result, seed]
    )

if __name__ == "__main__":
    demo.queue(max_size=50).launch(show_api=True)
```
Exception:
File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
res = future.result()
File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
return self.__get_result()
File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/user/app/app.py", line 158, in generate
images.extend(pipe(**batch_options).images)
File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py", line 1086, in call
) = self.encode_prompt(
File "/usr/local/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py", line 406, in encode_prompt
prompt_embeds = text_encoder(text_input_ids.to(device), output_hidden_states=True)
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 1490, in forward
text_embeds = self.text_projection(pooled_output)
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 125, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::Half
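The mismatch means `F.linear` received a float32 input against a float16 (c10::Half) projection weight: the pooled output reaching `text_projection` is no longer in the pipeline's half precision. A possible workaround, assuming the regression leaves part of the CLIP text encoder in the wrong dtype, is to recast both SDXL text encoders after loading; `pipe` here is the pipeline returned by `load_and_prepare_model` above. This is a sketch, untested against 4.49.0:

```python
# Workaround sketch (assumption, untested against 4.49.0): force both SDXL
# text encoders to the pipeline's float16 dtype so mat1/mat2 agree in F.linear.
if torch.cuda.is_available():
    pipe.text_encoder.to(dtype=torch.float16)
    pipe.text_encoder_2.to(dtype=torch.float16)
```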
Who can help?
No response
Reproduction
Execute the above Space code on a GPU-enabled system.
Generating any image fails with the exception posted in the description above:
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::Half
Expected behavior
Image generation should complete without raising an exception, as it does in 4.48.3.