Prob. Values are not consistent if the padding amount changes #32

Open
agupta74 opened this issue Mar 18, 2018 · 8 comments

Comments

@agupta74

The final softmax prob. values are not the same if the padding amount changes. It looks like for some functions, such as reduce_mean and reduce_max, the padding is not masked out. Also, the word/char embeddings for zero-padded tokens are not zero.

@zhiguowang
Owner

Not sure which part you are talking about. But I always multiply by a mask to filter out the influence of padding values.

@agupta74
Author

agupta74 commented Apr 4, 2018

In match_utils.py, tf.reduce_max and tf.reduce_mean (e.g., lines 150 and 151; there are other places in match_utils.py using these functions) are computed without masking. So if you vary the zero padding for the same sentence/passage or query, the final prob. values are different. The amount of zero padding should not impact the final softmax prob. values for a given query/passage.
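
Here is a minimal repro (an illustrative sketch, not code from the repo, using the TF 1.x API): the same two real values give a different mean once extra zero padding is appended.

```python
import tensorflow as tf

# The same two real values (3 and 5) with different amounts of zero padding.
a = tf.constant([3.0, 5.0, 0.0])             # padded to length 3
b = tf.constant([3.0, 5.0, 0.0, 0.0, 0.0])   # padded to length 5

with tf.Session() as sess:
    print(sess.run(tf.reduce_mean(a)))  # 8/3 ≈ 2.667
    print(sess.run(tf.reduce_mean(b)))  # 8/5 = 1.6
```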

@agupta74
Author

agupta74 commented Apr 4, 2018

With regard to the word/char embeddings for "zero" word/char tokens in the zero-padded query/passage, should these embeddings not be a zero vector? Any non-zero values for zero-padded tokens seem to impact the intermediate cosine similarity score (and therefore the final softmax score) when I change the zero padding amount.
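
One way to guarantee this (a rough TF 1.x sketch with hypothetical names, not code from the repo) is to multiply the embedding lookup by the token mask so that padded rows become exact zero vectors:

```python
import tensorflow as tf

word_embeddings = tf.random_normal([2, 4, 3])   # [batch, seq_len, dim], hypothetical
word_mask = tf.constant([[1., 1., 0., 0.],
                         [1., 1., 1., 0.]])     # 1.0 = real token, 0.0 = padding

# Broadcast the mask over the embedding dimension; padded rows become zeros.
masked_embeddings = word_embeddings * tf.expand_dims(word_mask, -1)
```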

@zhiguowang
Owner

I multiply a mask at this line https://github.com/zhiguowang/BiMPM/blob/master/src/match_utils.py#L145 before lines 150 and 151.

Before any matching layer, I also multiply a mask to filter out padding values: https://github.com/zhiguowang/BiMPM/blob/master/src/match_utils.py#L139

@agupta74
Author

agupta74 commented Apr 4, 2018

I agree that you mask out the values. However, for the mean calculation the number of padded zeros still impacts the mean value, even though the sum does not get impacted. So if my sentence length changes due to a different amount of padding (based on the max sentence length in a batch), the mean value would change even though the padded values are masked out to zero.
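
A possible fix (a hypothetical sketch, not code from the repo): divide the masked sum by the number of real tokens instead of the padded length, so extra padding cannot change the mean.

```python
import tensorflow as tf

x = tf.constant([[3., 5., 0., 0.]])      # values with padding already zeroed
mask = tf.constant([[1., 1., 0., 0.]])   # 1.0 = real token, 0.0 = padding

# Divide by the count of real tokens (guarded against all-padding rows).
masked_mean = tf.reduce_sum(x * mask, axis=1) / tf.maximum(tf.reduce_sum(mask, axis=1), 1.0)
# -> 4.0 no matter how many padding zeros are appended
```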

Similarly, for the max calculation: since the padded values are masked out to zero, the max will be zero if the actual values in the matrix are all negative. We need to remove the padded values from the max calculation, since a value of zero impacts the result (the padded values need to be set to negative infinity so that they do not impact the max calculation).
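
Something along these lines would do it (again a hypothetical sketch, not code from the repo):

```python
import tensorflow as tf

x = tf.constant([[-3., -5., 0., 0.]])    # all-negative real values, padding zeroed
mask = tf.constant([[1., 1., 0., 0.]])   # 1.0 = real token, 0.0 = padding

# Push padded positions to a very negative value before taking the max.
masked_max = tf.reduce_max(x + (1.0 - mask) * -1e30, axis=1)
# -> -3.0 (a plain tf.reduce_max over the zero-masked row would wrongly return 0.0)
```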

I can see that in the latest codebase you have masked out the question/passage representation for the padded word tokens. I guess in the older codebase that did not happen. Thanks for fixing this issue!

@zhiguowang
Owner

Oh, I see what you mean. You are right, the padding zeros will affect the mean calculation. I will think about it. Thanks!

@agupta74
Author

agupta74 commented Apr 9, 2018

Thanks! Also, please look into the max calculation issue mentioned in my comments above.

@agupta74
Author

@zhiguowang Any update on this? I can submit a PR for the fix if that helps, so you can verify it, run any other required testing, and then merge. Thanks!
