Question #1

pengzhangzhi · 2023-11-21T07:07:17Z

Hi Pengze,

I hope you are doing well! Congrats on your recent paper "Formulating Discrete Probability Flow Through Optimal Transport". I enjoyed reading it and found the theory impressive, but the math is too dense for me to follow every detail, so I have a few questions and hope you can answer them.

I compared your code with tauLDR and found that the only change in the sampling code is shown below.

reverse_rates = forward_rates * F.relu(inner_sum - 1) # (N, D, S)

Correct me if I am wrong: I do see other code changes, but is this the only change in terms of the formulation?
Could you explain why you implemented it this way? Why inner_sum - 1, and why ReLU? Is there an intuitive explanation?
I tried to find the answer in your paper, but it is hard for me to see the reason from the equations.

I have read about other flow-matching models that use a preprocessing step called pairing, which matches each data sample with a prior sample. During training they then use these pre-sampled noises instead of drawing fresh random noise from the prior distribution. Do you think this technique could be used in discrete sequence flow matching?

Best,

Zhangzhi,
University of Missouri,
Columbia, MO, USA


PangzeCheung · 2023-11-22T07:36:49Z

Hi Zhangzhi,

@pengzhangzhi Thank you for your interest in our work. I apologize for being brief, as I'm currently rushing to finish a paper. In essence, the discrete probability flow we designed uses ReLU operations to force high-probability states to transition only towards low-probability states in the forward process. This effectively mitigates 'mutual flow' and reduces uncertainty during reverse sampling.
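To make the 'mutual flow' point concrete, here is a toy sketch. It is not the paper's exact rate definition: the three-state distribution `p` and the rate `relu(p[x] - p[y])` are assumptions chosen only to illustrate the idea that a ReLU on a probability difference lets mass move in one direction between any pair of states.

```python
import numpy as np

# Toy distribution over 3 states (illustrative values, not from the paper).
p = np.array([0.6, 0.3, 0.1])

# Hypothetical forward rate: rate[x, y] is proportional to relu(p[x] - p[y]),
# so a state only sends mass to states that are currently less probable.
diff = p[:, None] - p[None, :]
rate = np.maximum(diff, 0.0)  # ReLU

# 'Mutual flow' would mean rate[x, y] > 0 and rate[y, x] > 0 for the same pair.
mutual = (rate > 0) & (rate.T > 0)

print(rate)
print("any mutual flow:", bool(mutual.any()))  # False for this construction
```

Because `diff[x, y] = -diff[y, x]`, at most one direction of each pair survives the ReLU, which is the one-way behaviour described above.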

The CIFAR-10 experiments extend our discrete probability flow to a broader range of transition rates, given as $Q$ in Eq. (87). The resulting reverse transition rate is shown in Eq. (89), which is consistent with the code you quoted. The core objective of this equation is still to mitigate 'mutual flow' and thereby minimize uncertainty. For detailed explanations, please refer to Appendix D.12. Hope this helps~
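As a rough illustration of what the quoted line does, the sketch below applies it to made-up numbers. It assumes that `inner_sum` estimates the probability ratio between the target state and the current state, and that the tauLDR-style baseline is `forward_rates * inner_sum`; both are assumptions for this example, and the arrays are toy values with shape (N, D, S) = (1, 1, 4).

```python
import numpy as np

def relu(x):
    """NumPy stand-in for torch.nn.functional.relu."""
    return np.maximum(x, 0.0)

# Toy inputs with shape (N, D, S) = (1, 1, 4); values are illustrative only.
forward_rates = np.ones((1, 1, 4))
# Assumed meaning: estimated ratio p(target state) / p(current state).
inner_sum = np.array([[[0.5, 1.0, 2.0, 4.0]]])

# Assumed baseline form: every target state gets a positive reverse rate.
baseline_rates = forward_rates * inner_sum

# Line quoted in this thread: only targets with ratio > 1 keep a rate.
reverse_rates = forward_rates * relu(inner_sum - 1)

print(baseline_rates)  # [[[0.5 1.  2.  4. ]]]
print(reverse_rates)   # [[[0. 0. 1. 3.]]]
```

Under that reading, the reverse process only jumps to states estimated to be more probable than the current one, mirroring the forward process that only moves mass from high-probability to low-probability states.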

Best,
Pengze
