Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bootstrap the new SIMD module #255

Merged

Conversation

slaperche-scality
Copy link
Contributor

This PR just lay the foundation for the new SIMD module, the biggest part will come in the next PR.

That being said, here we defines the basic constants and helpers, as well as a custom allocator.

By using `_mm_set_epi16` we can set the value using a single instruction
instead of three.
With this change, we can get rid of the MMX instructions
`_mm_setzero_si64` and `_mm_set_pi16`.

Refs: scality#222
First, add some useful instruction set-specific definitions.

Two types:
- RegisterType: underlying representation of the future Register class.
- MaskType: same as RegisterType, but for the Mask class.

Three useful constants:
- INSTRUCTION_SET: useful for discrimination.
- ALIGNMENT: useful for memory allocation.
- REG_BITSZ: useful for iteration.

Refs: scality#222
This first test is pretty dumb, but it will be useful to bootstrap and
setup the CircleCI pipeline.

Refs: scality#222
Copy link
Contributor

@lamphamsy lamphamsy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice PR. There are small things to change.

src/simd/allocator.h Show resolved Hide resolved
src/simd/allocator.h Show resolved Hide resolved
// We required AVX2 because we relies on some instructions (such as
// `_mm256_add_epi16` and others) that aren't available in the first version of
// AVX.
#if defined(__AVX__) && defined(__AVX2__)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is defined(__AVX2__) enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's enough but I prefer to be explicit: here we want AVX BUT because of QuadIron uses of SIMD we required AVX2.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think we could simplify it, also the comment as we need AVX2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but I think explicit is better than implicit (and removing AVX doesn't really make the code more or less readable)

src/simd/definitions.h Show resolved Hide resolved
src/simd/definitions.h Show resolved Hide resolved
src/simd_nf4.h Outdated
@@ -52,19 +54,18 @@ static inline aint128 m128i_to_uint128(m128i v)
inline aint128 expand16(uint16_t* arr, int n)
{
// since n <= 4
uint16_t _arr[4] __attribute__((aligned(ALIGN_SIZE))) = {0, 0, 0, 0};
uint16_t _arr[4] __attribute__((aligned(simd::ALIGNMENT))) = {0, 0, 0, 0};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is in simd namespace so simd:: is not necessary

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I'll fix this 😃

src/simd_nf4.h Show resolved Hide resolved
@slaperche-scality slaperche-scality force-pushed the eh/bootstrap_new_simd_api branch from 7ce4d38 to 83e2bec Compare October 16, 2018 13:15
Now, we are sure that every codepath is covered and we won't silently
break one of them.

Refs: scality#222, scality#251
@slaperche-scality slaperche-scality force-pushed the eh/bootstrap_new_simd_api branch 3 times, most recently from d141979 to 35ad3a8 Compare October 17, 2018 09:47
With this allocator we can use the STL container, such as std::vector,
and be sure that we can use aligned load instruction when necessary.

Refs: scality#222
The new implementation is more C++-like (can be used with STL container)
and more robust (overflow handling, strict aliasing rule, …).

Refs: scality#222
@slaperche-scality slaperche-scality force-pushed the eh/bootstrap_new_simd_api branch from 35ad3a8 to ba0ec36 Compare October 17, 2018 10:01
@slaperche-scality slaperche-scality merged commit 8b9cd06 into scality:master Oct 17, 2018
@slaperche-scality slaperche-scality deleted the eh/bootstrap_new_simd_api branch October 17, 2018 12:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants