The proposed feature consists of adding a default operator to the DMA abstraction that copies a contiguous chunk of data into a non-overlapping region.
Providing such a method in the abstraction makes sense because all available backends (linux, cuda, OpenCL, level_zero) provide a `memcpy` functionality. There are currently only a few, rarely used accelerated copies for 2D/3D matrices, and no real programmable DMA engine other than the CPU itself.
On top of that, there are currently two use cases for such a functionality:
- All layout transform operations are currently implemented using `memcpy` operations. Providing such a method would let layouts offer a backend-agnostic copy operator for transforming into another layout.
- The WIP deepcopy abstraction uses many flat copies and has to go through the layout abstraction every time, even though such a data description is not needed.
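To make the proposal concrete, here is a minimal sketch of what a default flat-copy operator could look like. All names (`struct dma`, `flat_copy_fn`, `dma_flat_copy`, `host_flat_copy`) are hypothetical and do not reflect the existing API. The sketch also assumes that a backend without its own hook can fall back to the host `memcpy`, which only holds for host-addressable memory; a device backend would need to supply its own hook.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical backend hook: copy `size` bytes from `src` to `dst`. */
typedef int (*flat_copy_fn)(void *dst, const void *src, size_t size);

/* Hypothetical DMA abstraction: a backend may leave `flat_copy` NULL. */
struct dma {
    flat_copy_fn flat_copy;
};

/* Host backend: a plain memcpy. */
static int host_flat_copy(void *dst, const void *src, size_t size)
{
    memcpy(dst, src, size);
    return 0;
}

/* Nonzero when [a, a + size) and [b, b + size) share at least one byte. */
static int regions_overlap(const void *a, const void *b, size_t size)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    return pa < pb ? (pb - pa < size) : (pa - pb < size);
}

/*
 * Default operator: copy a contiguous chunk into a non-overlapping region.
 * Uses the backend hook when present, otherwise the host memcpy
 * (an assumption of this sketch). Returns 0 on success, -1 on bad arguments.
 */
static int dma_flat_copy(const struct dma *dma, void *dst, const void *src,
                         size_t size)
{
    assert(dma != NULL);
    if (size == 0)
        return 0;
    if (dst == NULL || src == NULL)
        return -1;
    if (regions_overlap(dst, src, size))
        return -1;
    flat_copy_fn copy = dma->flat_copy ? dma->flat_copy : host_flat_copy;
    return copy(dst, src, size);
}
```

With something along these lines, a layout transform or the deepcopy code could call `dma_flat_copy(dma, dst, src, size)` directly for flat buffers, without building a layout description first.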