Commit 3417b48

[dev.simd] simd: add carryless multiply
now with comments, and also a test. choice of data types, method names, etc, are all up for comment. It's NOT commutative, because of the immediate operand (unless we swap the bits of the immediate).

Change-Id: I730a6938c6803d0b93544445db65eadc51783e42
Reviewed-on: https://go-review.googlesource.com/c/go/+/726963
Reviewed-by: Junyang Shao <[email protected]>
LUCI-TryBot-Result: Go LUCI <[email protected]>
1 parent f51ee08 commit 3417b48
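
The non-commutativity noted in the commit message can be illustrated with a pure-Go software model. This is a sketch only: `clmul64` and `clmulSelect` are hypothetical helper names, not the API added by this commit, and the model assumes the usual PCLMULQDQ convention that bit 0 of the immediate picks the element of the first operand, bit 4 picks the element of the second, and the low 64 bits of the product land in element 0.

```go
package main

import "fmt"

// clmul64 returns the 128-bit carryless product of a and b,
// i.e. the product of two polynomials over GF(2).
func clmul64(a, b uint64) (hi, lo uint64) {
	for i := 0; i < 64; i++ {
		if b>>i&1 == 1 {
			lo ^= a << i
			if i > 0 {
				hi ^= a >> (64 - i)
			}
		}
	}
	return hi, lo
}

// clmulSelect is a hypothetical model of the 128-bit form:
// bit 0 of xyHiLo selects x[0] or x[1], bit 4 selects y[0] or y[1].
// The result holds the low half in element 0 and the high half in element 1.
func clmulSelect(x, y [2]uint64, xyHiLo uint8) [2]uint64 {
	hi, lo := clmul64(x[xyHiLo&1], y[xyHiLo>>4&1])
	return [2]uint64{lo, hi}
}

func main() {
	x := [2]uint64{2, 3}
	y := [2]uint64{5, 7}

	fmt.Println(clmulSelect(x, y, 0x01)) // high half of x times low half of y
	fmt.Println(clmulSelect(y, x, 0x01)) // same immediate, operands swapped: differs
	fmt.Println(clmulSelect(y, x, 0x10)) // operands and immediate bits swapped: matches the first
}
```

The underlying 64x64 product is itself symmetric; only the operand selection encoded in the immediate breaks commutativity, which is why swapping bits 0 and 4 restores the original result.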

15 files changed, +302 -7 lines changed

src/cmd/compile/internal/amd64/simdssa.go

Lines changed: 3 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

src/cmd/compile/internal/ssa/_gen/simdAMD64.rules

Lines changed: 3 additions & 0 deletions
@@ -1333,6 +1333,9 @@
 (blendMaskedInt16x32 x y mask) => (VPBLENDMWMasked512 x y (VPMOVVec16x32ToM <types.TypeMask> mask))
 (blendMaskedInt32x16 x y mask) => (VPBLENDMDMasked512 x y (VPMOVVec32x16ToM <types.TypeMask> mask))
 (blendMaskedInt64x8 x y mask) => (VPBLENDMQMasked512 x y (VPMOVVec64x8ToM <types.TypeMask> mask))
+(carrylessMultiplyUint64x2 ...) => (VPCLMULQDQ128 ...)
+(carrylessMultiplyUint64x4 ...) => (VPCLMULQDQ256 ...)
+(carrylessMultiplyUint64x8 ...) => (VPCLMULQDQ512 ...)
 (concatSelectedConstantFloat32x4 ...) => (VSHUFPS128 ...)
 (concatSelectedConstantFloat64x2 ...) => (VSHUFPD128 ...)
 (concatSelectedConstantInt32x4 ...) => (VSHUFPS128 ...)

src/cmd/compile/internal/ssa/_gen/simdAMD64ops.go

Lines changed: 3 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

src/cmd/compile/internal/ssa/_gen/simdgenericOps.go

Lines changed: 3 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

src/cmd/compile/internal/ssa/opGen.go

Lines changed: 69 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

src/cmd/compile/internal/ssa/rewriteAMD64.go

Lines changed: 9 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

src/cmd/compile/internal/ssagen/simdintrinsics.go

Lines changed: 3 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

src/simd/_gen/simdgen/ops/GaloisField/categories.yaml

Lines changed: 2 additions & 0 deletions
@@ -19,3 +19,5 @@
   documentation: !string |-
     // NAME computes element-wise GF(2^8) multiplication with
     // reduction polynomial x^8 + x^4 + x^3 + x + 1.
+- go: carrylessMultiply
+  commutative: false
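
For contrast between the two operations in this category, the following sketch compares a GF(2^8) multiplication reduced by x^8 + x^4 + x^3 + x + 1 (the polynomial named in the context lines above) with an unreduced carryless product of the same inputs. The helper names `gf256Mul` and `clmul8` are illustrative assumptions, not identifiers from the simd package, and the 8-bit carryless product is only a scaled-down stand-in for the 64-bit operation this commit adds.

```go
package main

import "fmt"

// gf256Mul multiplies a and b in GF(2^8), reducing modulo
// x^8 + x^4 + x^3 + x + 1 (0x11B) after every doubling step.
func gf256Mul(a, b uint8) uint8 {
	var p uint8
	for i := 0; i < 8; i++ {
		if b&1 == 1 {
			p ^= a
		}
		carry := a & 0x80
		a <<= 1
		if carry != 0 {
			a ^= 0x1B // low 8 bits of the reduction polynomial
		}
		b >>= 1
	}
	return p
}

// clmul8 returns the unreduced carryless product of a and b.
// The result can be up to 15 bits wide because nothing is reduced.
func clmul8(a, b uint8) uint16 {
	var p uint16
	for i := 0; i < 8; i++ {
		if b>>i&1 == 1 {
			p ^= uint16(a) << i
		}
	}
	return p
}

func main() {
	// 0x57 and 0x83 are the worked example from FIPS 197.
	fmt.Printf("reduced:   %#x\n", gf256Mul(0x57, 0x83))
	fmt.Printf("unreduced: %#x\n", clmul8(0x57, 0x83))
}
```

A carryless multiply leaves any reduction to the caller, which is why its result is twice as wide as its inputs.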

src/simd/_gen/simdgen/ops/GaloisField/go.yaml

Lines changed: 60 additions & 0 deletions
@@ -30,3 +30,63 @@
   - *uint8
   out:
   - *uint8
+
+- go: carrylessMultiply
+  documentation: !string |-
+    // NAME computes one of four possible Galois polynomial
+    // products of selected high and low halves of x and y,
+    // depending on the value of xyHiLo, returning the 128-bit
+    // product in the concatenated two elements of the result.
+    // Bit 0 selects the low (0) or high (1) element of x and
+    // bit 4 selects the low (0x00) or high (0x10) element of y.
+  asm: V?PCLMULQDQ
+  in:
+  - go: Uint64x2
+  - go: Uint64x2
+  - class: immediate
+    immOffset: 0
+    name: xyHiLo
+  out:
+  - go: Uint64x2
+    overwriteElementBits: 64
+  hideMaskMethods: true
+
+- go: carrylessMultiply
+  documentation: !string |-
+    // NAME computes one of two possible Galois polynomial
+    // products of selected high and low halves of each of the two
+    // 128-bit lanes of x and y, depending on the value of xyHiLo,
+    // and returns the four 128-bit products in the result's lanes.
+    // Bit 0 selects the low (0) or high (1) elements of x's lanes and
+    // bit 4 selects the low (0x00) or high (0x10) elements of y's lanes.
+  asm: V?PCLMULQDQ
+  in:
+  - go: Uint64x4
+  - go: Uint64x4
+  - class: immediate
+    immOffset: 0
+    name: xyHiLo
+  out:
+  - go: Uint64x4
+    overwriteElementBits: 64
+  hideMaskMethods: true
+
+- go: carrylessMultiply
+  documentation: !string |-
+    // NAME computes one of four possible Galois polynomial
+    // products of selected high and low halves of each of the four
+    // 128-bit lanes of x and y, depending on the value of xyHiLo,
+    // and returns the four 128-bit products in the result's lanes.
+    // Bit 0 selects the low (0) or high (1) elements of x's lanes and
+    // bit 4 selects the low (0x00) or high (0x10) elements of y's lanes.
+  asm: V?PCLMULQDQ
+  in:
+  - go: Uint64x8
+  - go: Uint64x8
+  - class: immediate
+    immOffset: 0
+    name: xyHiLo
+  out:
+  - go: Uint64x8
+    overwriteElementBits: 64
+  hideMaskMethods: true
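
The per-lane behaviour described in the documentation strings above can be modelled in plain Go. This is a rough sketch under stated assumptions: vectors are represented as `[]uint64` slices of length 2, 4, or 8, each adjacent pair of elements forms one 128-bit lane, and `clmul64` and `clmulLanes` are hypothetical names rather than the methods generated from this YAML.

```go
package main

import "fmt"

// clmul64 returns the 128-bit carryless product of a and b.
func clmul64(a, b uint64) (hi, lo uint64) {
	for i := 0; i < 64; i++ {
		if b>>i&1 == 1 {
			lo ^= a << i
			if i > 0 {
				hi ^= a >> (64 - i)
			}
		}
	}
	return hi, lo
}

// clmulLanes applies the same selection to every 128-bit lane.
// Bit 0 of xyHiLo picks the low (0) or high (1) element of each lane of x,
// bit 4 picks the low or high element of each lane of y.
// Each lane of the result holds one 128-bit product, low half first.
func clmulLanes(x, y []uint64, xyHiLo uint8) []uint64 {
	out := make([]uint64, len(x))
	xi := int(xyHiLo & 1)
	yi := int(xyHiLo >> 4 & 1)
	for lane := 0; lane+1 < len(x); lane += 2 {
		hi, lo := clmul64(x[lane+xi], y[lane+yi])
		out[lane], out[lane+1] = lo, hi
	}
	return out
}

func main() {
	// Two 128-bit lanes, as in the 256-bit form.
	x := []uint64{1, 2, 3, 4}
	y := []uint64{5, 6, 7, 8}

	fmt.Println(clmulLanes(x, y, 0x00)) // low elements of both operands
	fmt.Println(clmulLanes(x, y, 0x11)) // high elements of both operands
}
```

Under this model a 256-bit vector yields two products and a 512-bit vector yields four, one per 128-bit lane.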

src/simd/_gen/simdgen/types.yaml

Lines changed: 3 additions & 0 deletions
@@ -83,6 +83,9 @@ in: !repeat
 - {class: vreg, go: Int64x4, base: "int", elemBits: 128, bits: 256, lanes: 4}
 - {class: vreg, go: Uint64x4, base: "uint", elemBits: 128, bits: 256, lanes: 4}
 
+# Special for carryless multiply
+- {class: vreg, go: Uint64x8, base: "uint", elemBits: 128, bits: 512, lanes: 8}
+
 # Special shapes just to make VAES(ENC|DEC)(LAST)?512 work.
 # The elemBits field of these shapes are wrong, it would be overwritten by overwriteElemBits.
 - {class: vreg, go: Int8x32, base: "int", elemBits: 128, bits: 512, lanes: 32}
