It is not possible to use cfg parallelism with batch sizes greater than 1 because the tensors are no longer contiguous.
This can be solved by adding input_ = input_.contiguous() inside the all_gather function in the group_coordinator file (sketched below).
I think this fix is necessary: the remaining parallelization modes allow batch sizes greater than 1, and cfg, which is the fastest mode, should be able to handle them as well.
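A minimal sketch of where the fix would go, assuming an all_gather helper shaped roughly like the one in the group_coordinator file (the surrounding function body here is an illustration, not the actual xDiT code; only the added .contiguous() call is the proposed change):

```python
from typing import Optional

import torch
import torch.distributed as dist

def all_gather(input_: torch.Tensor, dim: int = 0,
               group: Optional[dist.ProcessGroup] = None) -> torch.Tensor:
    """Gather `input_` from all ranks and concatenate along `dim`."""
    world_size = dist.get_world_size(group)
    if world_size == 1:
        return input_
    # Proposed fix: with cfg parallelism and batch size > 1 this function
    # receives a non-contiguous view, and NCCL collectives require
    # contiguous buffers.
    input_ = input_.contiguous()
    output = torch.empty((world_size,) + tuple(input_.shape),
                         dtype=input_.dtype, device=input_.device)
    dist.all_gather_into_tensor(output, input_, group=group)
    return torch.cat(output.unbind(0), dim=dim)
```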
Exactly. For example, when using the "Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers" model on a 2-GPU system with cfg=True as the parallelization mode, this error is raised whenever the input batch size is greater than 1.
What I found is that after adding input_.contiguous() the error was no longer raised, so this could be included in the original code to avoid the problem. A minimal reproduction of the underlying constraint is sketched below.
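I have not traced the exact slicing xDiT performs, but here is a hypothetical single-process reproduction of the contiguity constraint that matches the batch-size behaviour (a size-1 batch dimension keeps the view contiguous, a larger one does not):

```python
import torch

# Mimic the kind of view cfg parallelism takes: each rank keeps half of a
# non-batch dimension (e.g. the conditional / unconditional split).
x = torch.randn(2, 8, 16)                  # batch size 2
half = x[:, :4, :]                         # view over a non-batch dim
print(half.is_contiguous())                # False -> NCCL all_gather fails

x1 = torch.randn(1, 8, 16)                 # batch size 1
print(x1[:, :4, :].is_contiguous())        # True -> why batch size 1 works

print(half.contiguous().is_contiguous())   # True -> the proposed fix
```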
Thank you for your response!