MLP in Warp with wp.tile_matmul #817
-
(GPU: NVIDIA RTX 6000 Ada, 48 GB memory) I am trying out the examples that use Warp's tile primitives. The goal is to learn how to build a multilayer perceptron in Warp that can be used as an actor network and trained directly in Warp, without having to stream data to PyTorch.
Moreover, when I write custom code with matmul, the result seems incorrect, perhaps because I did something wrong.
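A simple way to sanity-check `wp.tile_matmul` is to start with matrices small enough that each operand fits in a single tile. The sketch below is only illustrative (the sizes are made up) and assumes a recent Warp release where `wp.tile_load`/`wp.tile_store` take `shape`/`offset` keywords and tile kernels are launched with `wp.launch_tiled`; older releases used different signatures.

```python
import numpy as np
import warp as wp

wp.init()

# Made-up sizes, chosen small enough that each operand fits in one tile.
# Module-level ints are captured as compile-time constants by the kernel.
M, K, N = 16, 32, 8


@wp.kernel
def matmul_single_tile(A: wp.array2d(dtype=float),
                       B: wp.array2d(dtype=float),
                       C: wp.array2d(dtype=float)):
    a = wp.tile_load(A, shape=(M, K), offset=(0, 0))  # cooperative load into shared memory
    b = wp.tile_load(B, shape=(K, N), offset=(0, 0))
    c = wp.tile_zeros(shape=(M, N), dtype=float)
    wp.tile_matmul(a, b, c)                           # c += a @ b
    wp.tile_store(C, c, offset=(0, 0))


A_np = np.random.randn(M, K).astype(np.float32)
B_np = np.random.randn(K, N).astype(np.float32)
A, B = wp.array(A_np), wp.array(B_np)
C = wp.zeros((M, N), dtype=float)

# A single cooperative block computes the whole product.
wp.launch_tiled(matmul_single_tile, dim=1, inputs=[A, B, C], block_dim=64)
print(np.max(np.abs(C.numpy() - A_np @ B_np)))
```

If a check like this matches NumPy, the problem is more likely in how the larger arrays are partitioned into tiles (offsets and loop bounds) than in `wp.tile_matmul` itself.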
-
When you get a message about graph capture failing, I normally turn off graph capture and run again. If the error is still unclear, you can turn on the appropriate debugging option to synchronize and check for CUDA errors after every operation.
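In case a concrete sketch helps, this is roughly the workflow described above. The flag and helper names are taken from `warp.config` and the graph-capture API in recent Warp releases, so they may differ on older versions; as far as I know the per-launch error check cannot be used while a graph is being captured, which is another reason to disable capture while debugging.

```python
import warp as wp

# Synchronize and check for CUDA errors after every launch and memory operation.
# Debug-only: this adds significant synchronization overhead.
wp.config.verify_cuda = True
wp.init()


@wp.kernel
def scale(x: wp.array(dtype=float), s: float):
    i = wp.tid()
    x[i] = x[i] * s


x = wp.zeros(1024, dtype=float)

use_graph = False  # hypothetical switch: debug eagerly first, re-enable capture once it runs cleanly

if use_graph:
    # Graph-capture path (turn verify_cuda back off before using this).
    with wp.ScopedCapture() as capture:
        wp.launch(scale, dim=x.shape[0], inputs=[x, 2.0])
    wp.capture_launch(capture.graph)
else:
    # Eager launches make it easier to localize the failing kernel.
    wp.launch(scale, dim=x.shape[0], inputs=[x, 2.0])

wp.synchronize()
```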
-
In addition, when I run the code below on my other computer, it often takes about a minute (presumably compiling), and then I get the following output, which reports an error about failing to configure the kernel's dynamic shared memory.
The code is
-
warp/warp/examples/benchmarks/benchmark_gemm.py, lines 45 to 49 at ab88f0f
-
Thank you for your response. I managed to make it work. The trick was to load both the input and the (weight, bias) in small tiles, as you suggested. However, I noticed that there are some minor differences between the results computed by Warp and NumPy. Why is this the case? Below is the code.
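For readers following along, a minimal sketch of that kind of layer (not the exact code from this reply) might look like the following. It assumes a recent Warp tile API (`shape`/`offset` keywords, `wp.launch_tiled`) and made-up sizes, and for simplicity it does the bias add and ReLU in a separate element-wise kernel rather than in tiles.

```python
import numpy as np
import warp as wp

wp.init()

# Made-up layer sizes; the tiles are kept small to stay within the
# per-block shared-memory budget.
BATCH, DIM_IN, DIM_OUT = 128, 64, 32
TILE_B, TILE_K, TILE_O = 16, 16, 16
K_TILES = DIM_IN // TILE_K


@wp.kernel
def linear_tiled(x: wp.array2d(dtype=float),   # (BATCH, DIM_IN)
                 w: wp.array2d(dtype=float),   # (DIM_IN, DIM_OUT)
                 y: wp.array2d(dtype=float)):  # (BATCH, DIM_OUT)
    i, j = wp.tid()
    acc = wp.tile_zeros(shape=(TILE_B, TILE_O), dtype=float)
    for k in range(K_TILES):
        xt = wp.tile_load(x, shape=(TILE_B, TILE_K), offset=(i * TILE_B, k * TILE_K))
        wt = wp.tile_load(w, shape=(TILE_K, TILE_O), offset=(k * TILE_K, j * TILE_O))
        wp.tile_matmul(xt, wt, acc)  # acc += xt @ wt
    wp.tile_store(y, acc, offset=(i * TILE_B, j * TILE_O))


@wp.kernel
def bias_relu(y: wp.array2d(dtype=float), b: wp.array(dtype=float)):
    i, j = wp.tid()
    y[i, j] = wp.max(y[i, j] + b[j], 0.0)


x_np = np.random.randn(BATCH, DIM_IN).astype(np.float32)
w_np = np.random.randn(DIM_IN, DIM_OUT).astype(np.float32)
b_np = np.random.randn(DIM_OUT).astype(np.float32)

x, w, b = wp.array(x_np), wp.array(w_np), wp.array(b_np)
y = wp.zeros((BATCH, DIM_OUT), dtype=float)

# One cooperative block per (TILE_B x TILE_O) output tile.
wp.launch_tiled(linear_tiled, dim=(BATCH // TILE_B, DIM_OUT // TILE_O),
                inputs=[x, w, y], block_dim=64)
wp.launch(bias_relu, dim=(BATCH, DIM_OUT), inputs=[y, b])

ref = np.maximum(x_np @ w_np + b_np, 0.0)
print(np.max(np.abs(y.numpy() - ref)))  # small float32-level differences are expected
```

On the mismatch question: small discrepancies are generally expected, since the GPU result is computed in float32 with a different accumulation order (and possibly tensor-core math) than NumPy's; comparing against a float32 NumPy reference with a tolerance, e.g. `np.allclose(..., atol=1e-5)`, is usually the fairer check.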
The result