UNet2DConditionModel can't load sd-1.5 config with method from_pretrained()? #7445
Unanswered
JinShuwenABC
asked this question in
Q&A
Replies: 1 comment
-
Hi @JinShuwenABC, Btw, your |
-
I am trying to use diffusers.T2IAdapter and diffusers.UNet2DConditionModel to train a T2IAdapter with sd-1.5.
For T2IAdapter, I initialize it with the config.json below. I think it serves both sd-1.5 and sd-2.1.
{
"_class_name": "T2IAdapter",
"_diffusers_version": "0.18.0.dev0",
"adapter_type": "full_adapter",
"channels": [
320,
640,
1280,
1280
],
"downscale_factor": 16,
"in_channels": 3,
"num_res_blocks": 2
}
For UNet2DConditionModel, I use from_pretrained() as follows.
However, it still seems to behave like an sdxl model, because it prints the following warning.
{'time_embedding_dim', 'mid_block_only_cross_attention', 'time_cond_proj_dim', 'timestep_post_act', 'time_embedding_act_fn', 'transformer_layers_per_block', 'addition_embed_type', 'class_embed_type', 'dual_cross_attention', 'upcast_attention', 'projection_class_embeddings_input_dim', 'attention_type', 'conv_in_kernel', 'encoder_hid_dim', 'addition_embed_type_num_heads', 'conv_out_kernel', 'mid_block_type', 'cross_attention_norm', 'only_cross_attention', 'encoder_hid_dim_type', 'num_class_embeds', 'use_linear_projection', 'time_embedding_type', 'dropout', 'resnet_time_scale_shift', 'resnet_out_scale_factor', 'num_attention_heads', 'addition_time_embed_dim', 'class_embeddings_concat', 'resnet_skip_time_act'} was not found in config. Values will be initialized to default values.
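As far as I understand, this warning only says that the loaded config.json lacks some newer keys, which are then filled with defaults; it does not by itself indicate which model family was loaded. A rough way to check the family is to look at the config values instead. The sketch below is only an illustration: `guess_unet_family` and the family labels are hypothetical names, and the mapping relies on the commonly published `cross_attention_dim` values (768 for sd-1.x, 1024 for sd-2.x, 2048 for sdxl).

```python
# Hypothetical helper (not part of diffusers): guess the Stable Diffusion
# family from a UNet config dictionary by its cross_attention_dim.
KNOWN_CROSS_ATTENTION_DIMS = {
    768: "sd-1.x",
    1024: "sd-2.x",
    2048: "sdxl",
}


def guess_unet_family(config):
    """Return a family label, or "unknown" if the value is not recognized."""
    dim = config.get("cross_attention_dim")
    return KNOWN_CROSS_ATTENTION_DIMS.get(dim, "unknown")


# Example with a hand-written dict that mimics an sd-1.5 UNet config.
# With a real model, the dict could presumably be obtained via
# dict(unet.config) after from_pretrained() (an assumption, not verified here).
sd15_like = {
    "cross_attention_dim": 768,
    "block_out_channels": [320, 640, 1280, 1280],
}
print(guess_unet_family(sd15_like))
```

If this reports an sd-1.x model, the UNet itself was probably loaded correctly and the problem lies elsewhere.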
It then throws an error like this:
│ /data/anaconda/envs/lib/python3.7/site-packages/diffusers/models/unet_2d_blocks.py:1097 in forward │
│ │
│ 1094 │ │ │ │
│ 1095 │ │ │ # apply additional residuals to the output of the last pair of resnet and at │
│ 1096 │ │ │ if i == len(blocks) - 1 and additional_residuals is not None: │
│ ❱ 1097 │ │ │ │ hidden_states = hidden_states + additional_residuals │
│ 1098 │ │ │ │
│ 1099 │ │ │ output_states = output_states + (hidden_states,) │
│ 1100 │
╰──────────────────────────────────────────────────────────────
The size of tensor a (64) must match the size of tensor b (32) at non-singleton dimension 3
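The 64 versus 32 mismatch can be reproduced with simple arithmetic. This is a sketch under stated assumptions, not the library's code: a 512x512 training image, a VAE that shrinks the image by 8, and an adapter whose first feature map is the image size divided by `downscale_factor`, with each later level halving the size. The function names are hypothetical.

```python
# Back-of-the-envelope shape check (hypothetical helpers, not diffusers APIs).

def adapter_feature_sizes(image_size, downscale_factor, num_levels=4):
    """Spatial size of each adapter feature map, assuming the first level is
    image_size / downscale_factor and every further level halves it."""
    first = image_size // downscale_factor
    return [first // (2 ** i) for i in range(num_levels)]


def unet_down_block_sizes(image_size, vae_scale=8, num_levels=4):
    """Spatial size of the hidden states in each UNet down block, assuming
    the latent is image_size / vae_scale and each block halves it."""
    latent = image_size // vae_scale
    return [latent // (2 ** i) for i in range(num_levels)]


image_size = 512  # assumed training resolution
print("unet      :", unet_down_block_sizes(image_size))
print("adapter/16:", adapter_feature_sizes(image_size, 16))
print("adapter/8 :", adapter_feature_sizes(image_size, 8))
```

Under these assumptions, `"downscale_factor": 16` gives a first adapter feature of size 32 while the UNet expects 64, which matches the error message. A value of 8 would line the sizes up, so the adapter config may be worth checking before the UNet.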
If I change "adapter_type": "full_adapter" to "adapter_type": "full_adapter_xl" and use a local stable-diffusion-xl model, it works well.
Besides, as far as I know, the pipeline supports sd-1.5, which confuses me.
I am using diffusers version 0.21.0; maybe I have to upgrade it to get this to work?
Thank you to anyone who answers this.