Bootstrap the new SIMD module #255

slaperche-scality · 2018-10-16T01:29:07Z

This PR just lays the foundation for the new SIMD module; the biggest part will come in the next PR.

That being said, it defines the basic constants and helpers, as well as a custom allocator.

First, add some useful instruction set-specific definitions.

Two types:
- RegisterType: underlying representation of the future Register class.
- MaskType: same as RegisterType, but for the Mask class.

Three useful constants:
- INSTRUCTION_SET: useful for discrimination.
- ALIGNMENT: useful for memory allocation.
- REG_BITSZ: useful for iteration.

Refs: scality#222
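As a rough illustration of what such a definitions header could look like, here is a minimal sketch. Only the names RegisterType, MaskType, INSTRUCTION_SET, ALIGNMENT and REG_BITSZ come from the commit message; the `simd_sketch` namespace, the `InstructionSet` enum, the `lanes` helper, the scalar fallback and the concrete values are assumptions made for the example, not the module's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__AVX2__)
#include <immintrin.h>
#elif defined(__SSE4_1__)
#include <smmintrin.h>
#endif

namespace simd_sketch { // hypothetical namespace, to avoid implying this is the real header

// Hypothetical enum used to discriminate between code paths.
enum class InstructionSet { NONE, SSE4, AVX2 };

#if defined(__AVX2__)
using RegisterType = __m256i;
using MaskType = __m256i;
constexpr InstructionSet INSTRUCTION_SET = InstructionSet::AVX2;
#elif defined(__SSE4_1__)
using RegisterType = __m128i;
using MaskType = __m128i;
constexpr InstructionSet INSTRUCTION_SET = InstructionSet::SSE4;
#else
// Assumed scalar fallback so that the sketch compiles everywhere.
using RegisterType = std::uint64_t;
using MaskType = std::uint64_t;
constexpr InstructionSet INSTRUCTION_SET = InstructionSet::NONE;
#endif

// Alignment (in bytes) required for aligned loads and stores of a register.
constexpr std::size_t ALIGNMENT = alignof(RegisterType);

// Register width in bits, handy when iterating over a buffer register by register.
constexpr std::size_t REG_BITSZ = sizeof(RegisterType) * 8;

// Number of elements of type T that fit in one register.
template <typename T>
constexpr std::size_t lanes()
{
    return sizeof(RegisterType) / sizeof(T);
}

} // namespace simd_sketch
```

With these in place, a loop over `uint16_t` data can advance by `simd_sketch::lanes<std::uint16_t>()` elements per iteration, whichever instruction set was selected at compile time.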

lamphamsy

Nice PR. There are a few small things to change.

src/simd/allocator.h

lamphamsy · 2018-10-16T11:15:49Z

src/simd/definitions.h

+// We required AVX2 because we relies on some instructions (such as
+// `_mm256_add_epi16` and others) that aren't available in the first version of
+// AVX.
+#if defined(__AVX__) && defined(__AVX2__)


Is defined(__AVX2__) enough?

Yes, it's enough, but I prefer to be explicit: what we want here is AVX, but because of the way QuadIron uses SIMD we require AVX2.

So I think we could simplify it, and the comment as well, since we need AVX2.

We could, but I think explicit is better than implicit (and removing AVX doesn't really make the code more or less readable)
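For context on why both forms behave the same: with GCC and Clang, enabling AVX2 (for example with `-mavx2`) also enables AVX, so `__AVX__` is defined whenever `__AVX2__` is. A minimal sketch that makes this assumption checkable at compile time; `HAS_AVX`, `HAS_AVX2` and `simd_mode` are hypothetical names, not part of the PR:

```cpp
#include <cassert>
#include <string>

#if defined(__AVX__)
constexpr bool HAS_AVX = true;
#else
constexpr bool HAS_AVX = false;
#endif

#if defined(__AVX2__)
constexpr bool HAS_AVX2 = true;
#else
constexpr bool HAS_AVX2 = false;
#endif

// If this ever fires, the explicit `defined(__AVX__) && defined(__AVX2__)`
// test and the shorter `defined(__AVX2__)` test would no longer be equivalent.
static_assert(!HAS_AVX2 || HAS_AVX, "AVX2 enabled without AVX");

// Report which code path a build would select (hypothetical helper).
inline std::string simd_mode()
{
    if (HAS_AVX && HAS_AVX2) {
        return "avx2";
    }
    return "no-avx2";
}
```

Either spelling selects the same code path; the explicit one simply documents that the AVX family is what the code targets.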

src/simd/definitions.h

lamphamsy · 2018-10-16T11:21:40Z

src/simd_nf4.h

@@ -52,19 +54,18 @@ static inline aint128 m128i_to_uint128(m128i v)
 inline aint128 expand16(uint16_t* arr, int n)
 {
    // since n <= 4
-    uint16_t _arr[4] __attribute__((aligned(ALIGN_SIZE))) = {0, 0, 0, 0};
+    uint16_t _arr[4] __attribute__((aligned(simd::ALIGNMENT))) = {0, 0, 0, 0};


This code is already inside the simd namespace, so the simd:: prefix is not necessary.

Good catch! I'll fix this 😃
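The point about qualification can be sketched as follows. This is not the code from `simd_nf4.h`: the value of `ALIGNMENT`, the `Padded4` type and the `pad4` helper are assumptions for illustration, and standard `alignas` is used in place of the GCC-specific `__attribute__((aligned(...)))`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace simd {

constexpr std::size_t ALIGNMENT = 16; // assumed value for the example

// Four 16-bit values stored in a buffer aligned for SIMD loads.
struct Padded4 {
    // Unqualified name: lookup finds simd::ALIGNMENT because we are inside
    // the namespace, so writing simd::ALIGNMENT here would be redundant.
    alignas(ALIGNMENT) std::uint16_t v[4];
};

// Copy the first n (n <= 4) values of arr and zero-pad the rest.
inline Padded4 pad4(const std::uint16_t* arr, int n)
{
    Padded4 out{}; // value-initialization zeroes all four elements
    for (int i = 0; i < n && i < 4; ++i) {
        out.v[i] = arr[i];
    }
    return out;
}

} // namespace simd
```

Code outside the namespace still needs the qualified form, for example `simd::ALIGNMENT`.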

slaperche-scality added 3 commits October 12, 2018 16:18

use _mm_set_epi16

f642316

By using `_mm_set_epi16` we can set the value using a single instruction instead of three. With this change, we can get rid of the MMX instructions `_mm_setzero_si64` and `_mm_set_pi16`. Refs: scality#222
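To make the idea concrete, here is a minimal sketch, assuming SSE2 is available (as it is on any x86-64 target). `pack4_epi16` and `lane_epi16` are hypothetical helpers, not names from the PR, and the code being replaced is not shown in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h> // SSE2: __m128i, _mm_set_epi16, _mm_storeu_si128

// Hypothetical helper: place four 16-bit values in the low 64 bits of a
// 128-bit register and zero the high 64 bits, without touching MMX registers.
inline __m128i pack4_epi16(std::uint16_t a0,
                           std::uint16_t a1,
                           std::uint16_t a2,
                           std::uint16_t a3)
{
    // _mm_set_epi16 takes its arguments from the highest lane down to the
    // lowest, so a0 (lane 0) comes last.
    return _mm_set_epi16(0, 0, 0, 0,
                         static_cast<short>(a3), static_cast<short>(a2),
                         static_cast<short>(a1), static_cast<short>(a0));
}

// Read lane i (0..7) back out of the register, for checking.
inline std::uint16_t lane_epi16(__m128i v, int i)
{
    std::uint16_t out[8];
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), v);
    return out[i];
}
```

Staying within SSE2 also avoids mixing MMX state with the rest of the code, which is one less thing to keep track of.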

add tests for SIMD

0826471

This first test is pretty dumb, but it will be useful to bootstrap and set up the CircleCI pipeline. Refs: scality#222

lamphamsy requested changes Oct 16, 2018

slaperche-scality added 2 commits October 16, 2018 15:15

use the definitions from the new SIMD module

7ffdd9b

move benchmark into its own job

30582fd

That way, it can be run in parallel with the test suite. Refs: scality#251

slaperche-scality force-pushed the eh/bootstrap_new_simd_api branch from 7ce4d38 to 83e2bec October 16, 2018 13:15

run the unit tests in all SIMD mode

65e34d8

Now, we are sure that every codepath is covered and we won't silently break one of them. Refs: scality#222, scality#251

slaperche-scality force-pushed the eh/bootstrap_new_simd_api branch 3 times, most recently from d141979 to 35ad3a8 October 17, 2018 09:47

slaperche-scality added 3 commits October 17, 2018 12:01

add a custom allocator for aligned memory

ff4eef9

With this allocator we can use STL containers, such as std::vector, and be sure that we can use aligned load instructions when necessary. Refs: scality#222
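The PR's actual allocator (`src/simd/allocator.h`) is not shown in this thread, so the following is only a minimal sketch of the technique, assuming a POSIX platform (it relies on `posix_memalign`). The name `AlignedAllocator` and the default alignment of 32 bytes are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <new>
#include <stdlib.h> // posix_memalign, free (POSIX)
#include <vector>

// Hypothetical minimal allocator returning memory aligned on `Alignment` bytes.
template <typename T, std::size_t Alignment = 32>
struct AlignedAllocator {
    using value_type = T;

    static_assert((Alignment & (Alignment - 1)) == 0, "must be a power of two");
    static_assert(Alignment % sizeof(void*) == 0, "required by posix_memalign");
    static_assert(Alignment >= alignof(T), "must satisfy T's own alignment");

    AlignedAllocator() noexcept = default;

    template <typename U>
    AlignedAllocator(const AlignedAllocator<U, Alignment>&) noexcept
    {
    }

    // Needed explicitly because of the non-type template parameter.
    template <typename U>
    struct rebind {
        using other = AlignedAllocator<U, Alignment>;
    };

    T* allocate(std::size_t n)
    {
        // Guard against overflow when computing n * sizeof(T).
        if (n > std::numeric_limits<std::size_t>::max() / sizeof(T)) {
            throw std::bad_alloc();
        }
        void* p = nullptr;
        if (posix_memalign(&p, Alignment, n * sizeof(T)) != 0) {
            throw std::bad_alloc();
        }
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept
    {
        free(p);
    }
};

// Stateless allocator: all instances with the same alignment are interchangeable.
template <typename T, typename U, std::size_t A>
bool operator==(const AlignedAllocator<T, A>&, const AlignedAllocator<U, A>&) noexcept
{
    return true;
}

template <typename T, typename U, std::size_t A>
bool operator!=(const AlignedAllocator<T, A>&, const AlignedAllocator<U, A>&) noexcept
{
    return false;
}
```

A container declared as `std::vector<std::uint16_t, AlignedAllocator<std::uint16_t, 32>>` then has a `data()` pointer that is suitable for aligned 256-bit loads.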

add test for the custom allocator

635f83f

Refs: scality#222

use the new allocator for aligned allocations

ba0ec36

The new implementation is more C++-like (it can be used with STL containers) and more robust (overflow handling, strict aliasing rule, …). Refs: scality#222

slaperche-scality force-pushed the eh/bootstrap_new_simd_api branch from 35ad3a8 to ba0ec36 October 17, 2018 10:01

lamphamsy reviewed Oct 17, 2018

src/simd/definitions.h

lamphamsy approved these changes Oct 17, 2018

slaperche-scality merged commit 8b9cd06 into scality:master Oct 17, 2018

slaperche-scality deleted the eh/bootstrap_new_simd_api branch October 17, 2018 12:51
