At the moment, caching of contiguity changes is done in Python, which adds some latency. It's not as bad with Ivan's PR: Lightning-AI/lightning-thunder@56b922a
However, this seems like something nvFuser should take ownership of in its caching system. Today we require a new fusion definition on any contiguity change, since knowing the contiguity can be a valuable optimization within a kernel. One idea is to handle this in nvFuser's concretization pass, if the added latency would be tolerable.
This would remove the need for any caching logic in Python for nvFuser execution.
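
For illustration, here is a minimal sketch of the kind of Python-side, contiguity-keyed cache described above. It is not thunder's or nvFuser's actual code: `compute_contiguity`, `ContiguityKeyedCache`, and the `build` callback are hypothetical names, and the contiguity rule is a simplified row-major check that ignores broadcast and size-1 dimensions.

```python
from typing import Callable, Dict, Sequence, Tuple

Contiguity = Tuple[bool, ...]


def compute_contiguity(shape: Sequence[int], strides: Sequence[int]) -> Contiguity:
    """Simplified per-dimension contiguity flags for a row-major layout.

    Dimension i is flagged contiguous when its stride equals the extent
    covered by the next-inner dimension (or 1 for the innermost one).
    Assumption: broadcast / size-1 dimensions get no special treatment.
    """
    flags = [False] * len(shape)
    expected = 1
    for i in reversed(range(len(shape))):
        flags[i] = strides[i] == expected
        expected = strides[i] * shape[i]
    return tuple(flags)


class ContiguityKeyedCache:
    """Hypothetical cache: one fusion definition per contiguity pattern."""

    def __init__(self, build: Callable[[Tuple[Contiguity, ...]], object]):
        self._build = build  # stands in for creating a new fusion definition
        self._entries: Dict[Tuple[Contiguity, ...], object] = {}
        self.builds = 0

    def get(self, inputs: Sequence[Tuple[Sequence[int], Sequence[int]]]) -> object:
        # The key is recomputed in Python on every call; this is the
        # per-call overhead the issue is about.
        key = tuple(compute_contiguity(shape, strides) for shape, strides in inputs)
        if key not in self._entries:
            self._entries[key] = self._build(key)
            self.builds += 1
        return self._entries[key]


cache = ContiguityKeyedCache(build=lambda key: ("fusion", key))
cache.get([((2, 3), (3, 1))])  # contiguous input: builds a definition
cache.get([((4, 5), (5, 1))])  # same contiguity pattern: cache hit
cache.get([((3, 2), (1, 3))])  # transposed view: new definition
```

In this sketch the key computation runs on every call, even on a cache hit. Moving the check into nvFuser would take that work out of the Python hot path.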
Original discussion in:
Lightning-AI/lightning-thunder#1840