
Burning Artifacts in LTX V2V Pipeline with T2V-Generated Videos at Mid-Range Strength Values #103

marghovo opened this issue Jan 17, 2025 · 2 comments
@marghovo

Hi,

Thanks for your great work.

I am trying to leverage LTX-Video in my research, which uses the video-to-video (V2V) pipeline.
When I apply LTX V2V to a video generated by LTX itself, I get strange burning artifacts for strength values in the middle of the range (e.g., 0.4). The artifacts are reduced at small strength values (0.1) and very high strength values (0.9). Please see the example below:
The first video was generated by the LTX text-to-video (T2V) pipeline and serves as the input for the subsequent videos produced by the V2V pipeline. The strength value is indicated in each filename. The burning artifacts are most apparent in the trees and rocks, especially at strengths 0.4 and 0.7.

input.mp4
strength_0.1.mp4
strength_0.4.mp4
strength_0.7.mp4
strength_0.9.mp4

The burning artifacts disappear when I use a different video that was not generated by LTX, or even a screen-recorded version of the LTX-generated video. I initially thought that some corruption occurs while saving the result produced by the LTX T2V pipeline; however, the same issue occurs when feeding its output directly into the V2V pipeline.

Next, I hypothesized that some inherent noise may be present in the output of LTX T2V due to its VAE decoder. I thought of the following two possibilities:
(1) Some noise is present in LTX's output because the VAE decoder performs the last denoising step.
(2) Some noise is present in LTX's output because the VAE decoder injects noise as part of its architecture.

However, I rejected both possibilities: when I used the latents of the T2V output directly as input to the V2V pipeline (without decoding), the artifacts were still present.

At the moment, I see the following two possibilities for these burning artifacts:
(1) Something in the base model adds some type of corruption to the output.
(2) The diffusion process itself results in some type of corruption in the output.

As a side note, I also tried the setups described above with CogVideoX, and no such burning artifacts appear in its results.

Do you have any thoughts on the problem described above and potential solutions for overcoming it?

Thanks in advance.

@yoavhacohen
Collaborator

yoavhacohen commented Jan 18, 2025

When generating the vid-to-vid output, are you using the same seed as the one used for the original video?
Try using a different seed for the initial video generation and the vid-to-vid process.
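The suggestion can be illustrated with a minimal NumPy sketch. This is purely illustrative: `sample_noise` is a stand-in for the pipeline's latent-noise sampler (seeded via a torch generator in practice), not an actual LTX-Video API.

```python
import numpy as np

# Stand-in for the pipeline's latent-noise sampler (illustrative only).
def sample_noise(seed, shape=(8, 16, 16)):
    return np.random.default_rng(seed).standard_normal(shape)

# Same seed for T2V and V2V: the V2V re-noising step draws the exact
# noise pattern that was already used to generate the input video.
t2v_noise = sample_noise(seed=42)
v2v_noise_same = sample_noise(seed=42)
print(np.array_equal(t2v_noise, v2v_noise_same))  # identical draws

# Different seeds: statistically independent noise, as suggested above.
v2v_noise_diff = sample_noise(seed=43)
corr = np.corrcoef(t2v_noise.ravel(), v2v_noise_diff.ravel())[0, 1]
print(abs(corr) < 0.2)  # near-zero correlation between independent draws
```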

@yoavhacohen yoavhacohen self-assigned this Jan 18, 2025
@marghovo
Author

marghovo commented Jan 19, 2025

Hi. Thanks for your reply.

Indeed, I was using the same seed for both T2V and V2V. Changing the seeds resolved the issue. Do you have a possible explanation for why using the same seed produces this kind of artifact? Is it something specific to LTX, or a property of the diffusion process in general? I did not notice this issue with other video or image models.
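One plausible (unconfirmed) mechanism, sketched numerically: suppose the T2V output retains a small residue of its generation noise. Re-noising with the *same* noise then adds coherently to that residue, so the effective noise level exceeds what the denoiser expects at that strength, whereas a fresh seed adds independent noise at roughly the expected level. All quantities below are illustrative assumptions, not measurements of LTX.

```python
import numpy as np

n = 100_000
clean = np.random.default_rng(0).standard_normal(n)

# Assumption (illustrative): the T2V output carries a small residue of
# its generation noise eps.
eps = np.random.default_rng(42).standard_normal(n)
video = clean + 0.1 * eps

# SDEdit-style re-noising at strength t: x_t = sqrt(a)*x + sqrt(1-a)*eps'
a = 0.6  # a mid-range strength keeps a large share of both terms

# Same seed: eps' is the exact same noise as eps, so it adds coherently.
x_same = np.sqrt(a) * video + np.sqrt(1 - a) * eps
# Fresh seed: eps' is independent of eps.
eps_new = np.random.default_rng(7).standard_normal(n)
x_fresh = np.sqrt(a) * video + np.sqrt(1 - a) * eps_new

# Noise the denoiser actually has to remove vs. the level it expects:
expected = np.sqrt(1 - a)
std_same = np.std(x_same - np.sqrt(a) * clean)
std_fresh = np.std(x_fresh - np.sqrt(a) * clean)
print(std_same > std_fresh)              # correlated noise is inflated
print(abs(std_fresh - expected) < 0.02)  # fresh noise ~ expected level
```

Under this toy model the inflation is largest at mid-range strengths, where both the residue term and the injected-noise term are sizable, which would be consistent with the worst artifacts appearing around 0.4-0.7.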

I also checked out your latest commit and tested T2V: with a spatio-temporal guidance (STG) scale of 1.0, the same artifacts appear in T2V itself, while a very small STG scale (e.g., 0.1) produces no such artifacts.
