Are differences in forward results and gradients normal with and without CUDA graph acceleration? #397

Jianghanxiao · 2024-12-14T02:10:54Z

I notice some very small differences in the forward results between running with CUDA graph acceleration and running without it. Is this normal? I actually feel that the gradient from the CUDA graph run is better. Is it possible that the forward model is more precise this way, perhaps because it reduces some data transfer and thereby some "errors"?

nvlukasz · 2024-12-16T20:18:16Z

Maintainer

Hi @Jianghanxiao, there shouldn't be numerical differences when running with and without CUDA graphs. I think what might be happening is that some input array (or gradient) is different or maybe isn't getting reset between launches. It's hard to diagnose without more info. Are you able to share a small repro? Thanks!
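
To illustrate the "isn't getting reset between launches" case, here is a minimal plain-Python sketch. It assumes gradients are accumulated into a persistent buffer that is reused on every replay; `ToyStep`, `replay`, and `zero_grad` are hypothetical stand-ins and not Warp APIs.

```python
class ToyStep:
    """Hypothetical stand-in for a recorded forward/backward step (not a Warp API)."""

    def __init__(self):
        self.x = 3.0
        self.grad_x = 0.0  # persistent buffer, reused across replays

    def replay(self):
        loss = self.x * self.x       # forward: loss = x^2
        self.grad_x += 2.0 * self.x  # backward: accumulates d(loss)/dx into the buffer
        return loss

    def zero_grad(self):
        self.grad_x = 0.0


step = ToyStep()
step.replay()
step.replay()
print(step.grad_x)  # 12.0: the stale gradient from the first replay is still included

step.zero_grad()
step.replay()
print(step.grad_x)  # 6.0: the gradient of x^2 at x = 3
```

If one code path zeroes its buffers before each launch and the other does not, the two would disagree even though the kernels themselves are identical.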

Jianghanxiao · 2024-12-16T21:22:09Z

Author

Thanks for the answer! Hmm, it's a bit hard to write a small demo for this, but I suspect the small error comes from the atomic_add calls in my physics simulation. Even after 600 substeps, the largest error is just 2e-7, so it is very small and may just be related to a different ordering of the atomic_add operations. This may also be nondeterministic even if we run the same setting twice.
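
The ordering effect can be reproduced without a GPU, because floating-point addition is not associative. The sketch below uses only the Python standard library and emulates float32 accumulation by rounding after every addition; `to_f32` and `sum_f32` are helper names made up for this illustration, and it is an assumption that the simulation accumulates in float32.

```python
import random
import struct


def to_f32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_f32(values):
    """Accumulate in float32, rounding after every addition."""
    total = 0.0
    for v in values:
        total = to_f32(total + to_f32(v))
    return total


# Same three terms in two orders: the small term is absorbed in one order only.
print(sum_f32([1e8, 1.0, -1e8]))  # 0.0
print(sum_f32([1e8, -1e8, 1.0]))  # 1.0

# Many terms of similar size: shuffling the order typically shifts the
# result by a few rounding steps rather than by a whole unit.
rng = random.Random(0)
terms = [to_f32(rng.uniform(-1.0, 1.0)) for _ in range(10_000)]
shuffled = terms[:]
rng.shuffle(shuffled)
print(abs(sum_f32(terms) - sum_f32(shuffled)))
```

For scale, float32 machine epsilon is about 1.19e-7, so a 2e-7 discrepancy is on the order of a couple of rounding steps for values near 1.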

1 reply

nvlukasz Dec 19, 2024
Maintainer

Indeed, the small numerical differences could be caused by different ordering of operations. If running with the same setting (with/without graph) multiple times also produces such differences, then the issue is probably unrelated to CUDA graphs.
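
One way to run this check is sketched below in plain Python. `run_plain` and `run_graph` are hypothetical callables that each execute the simulation once (without and with graph capture, respectively) and return the results as a flat list of floats; they are assumptions for illustration, not part of any library.

```python
def max_abs_diff(a, b):
    """Largest elementwise absolute difference between two result lists."""
    return max(abs(x - y) for x, y in zip(a, b))


def run_to_run_spread(run_fn, n_runs=5):
    """Largest difference between the first run and each repeated run."""
    reference = run_fn()
    return max(
        (max_abs_diff(reference, run_fn()) for _ in range(n_runs - 1)),
        default=0.0,
    )


def compare_modes(run_plain, run_graph, n_runs=5):
    """Report the spread within each mode and the difference between modes."""
    return {
        "plain_vs_plain": run_to_run_spread(run_plain, n_runs),
        "graph_vs_graph": run_to_run_spread(run_graph, n_runs),
        "plain_vs_graph": max_abs_diff(run_plain(), run_graph()),
    }
```

If `plain_vs_plain` or `graph_vs_graph` is already of the same magnitude as `plain_vs_graph`, the difference is most likely ordinary run-to-run nondeterminism rather than something specific to CUDA graphs.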

Answer selected by Jianghanxiao
