add backward of conv2d #365
base: master
Conversation
Thank you for your contribution. Could you provide performance data for our comparative analysis?
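A minimal timing sketch, not from this PR, that could produce such numbers by comparing eager conv2d forward+backward against the FlagGems path; the flag_gems import and the use_gems() context manager are assumptions about this repo's API, and the shapes are illustrative:

import torch
import torch.nn.functional as F
import flag_gems  # assumed import; routes aten ops to the Triton kernels

def bench_ms(fn, warmup=10, iters=100):
    # CUDA launches are asynchronous, so time with events and explicit syncs.
    for _ in range(warmup):
        fn()
    torch.cuda.synchronize()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        fn()
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters

x = torch.randn(32, 64, 56, 56, device="cuda", requires_grad=True)
w = torch.randn(128, 64, 3, 3, device="cuda", requires_grad=True)
g = torch.randn(32, 128, 54, 54, device="cuda")  # matches 3x3, stride 1, no padding

def fwd_bwd():
    return torch.autograd.grad(F.conv2d(x, w), (x, w), g)

print("eager:", bench_ms(fwd_bwd), "ms")
with flag_gems.use_gems():  # assumed FlagGems context manager
    print("gems :", bench_ms(fwd_bwd), "ms")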
src/flag_gems/ops/conv2d.py
Outdated
@@ -230,7 +228,7 @@ def conv2d_forward_kernel(
@triton.autotune(
    configs=[
        triton.Config(
Please configure these in FlagGems/src/flag_gems/runtime/backend/_nvidia/tune_configs.yaml.
There is an error in the benchmark; please fix it.
@pytest.mark.parametrize("dilation", [1, 2]) | ||
@pytest.mark.parametrize("bias", [True, False]) | ||
def test_accuracy_conv2d(shape, kernel, stride, padding, groups, dtype, dilation, bias): | ||
torch.manual_seed(0) |
Is manual_seed necessary here?
revert_weight = revert_weight.transpose(1, 2).contiguous()
revert_weight = revert_weight.reshape(
    groups * weight_c, out_c, weight_height, weight_width
).contiguous()
The redundant contiguous() call might waste resources.
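One possible cleanup, sketched under the assumption that nothing downstream relies on the intermediate layout: transpose() returns a non-contiguous view, so the first contiguous() performs the one necessary copy; reshape() of a contiguous tensor is then a contiguous view, which makes the trailing contiguous() a no-op that can be dropped (names as in the diff above):

# One materializing copy after transpose is enough; reshape of a
# contiguous tensor returns a contiguous view, so the second
# contiguous() call can be removed.
revert_weight = revert_weight.transpose(1, 2).contiguous()
revert_weight = revert_weight.reshape(
    groups * weight_c, out_c, weight_height, weight_width
)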
if stride_height > 1 or stride_width > 1:
    for i in range(out_grad.shape[2]):
        for j in range(out_grad.shape[3]):
            new_out[:, :, i * (stride_height), j * (stride_width)] = out_grad[
This element-by-element assignment will cost a lot of time. Is there a better way?
Maybe you can refer to the implementation of flip and use a copy_func to fill the elements of new_out.
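Besides a flip-style copy kernel, a plain PyTorch alternative is a single strided-slice assignment; this sketch assumes new_out is zero-initialized with spatial sizes (out_grad.shape[2] - 1) * stride_height + 1 and (out_grad.shape[3] - 1) * stride_width + 1, so the slice hits exactly the positions the nested loops wrote:

# Indices 0, stride, 2*stride, ... receive out_grad in one fused copy;
# all other positions keep their zero initialization.
if stride_height > 1 or stride_width > 1:
    new_out[:, :, ::stride_height, ::stride_width] = out_grad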
    device=device,
)

grid_weight = lambda meta: (
Since the weight is generally not large, I suggest not tiling it by BLOCK_CI_HK_WK.
Add backward passes for input, weight, and bias of conv2d.