Example failure - AVX512F/VL intrinsic support? #33
Comments
Ah, I see here that only a few commands are supported. Adding all of AVX512 I'm sure will be a beast.
It may be hard to support all of AVX512 on a short time horizon.
Are you manually adding each intrinsic? It would probably be worth programmatically parsing the
At a quick glance, I'm actually not sure how this tool works. I see that in src/components/ASMVisualizer.js you have four asm instructions hand-coded, which would indeed be very labor-intensive to extend to even all of AVX or AVX2. How/where is the intrinsic translated to the instruction? Are you using the acorn JS parser for C++? I'd be interested in taking a stab at an alternative approach -- something like using LLVM to get the actual assembly with optimization and whatever compiler flags, and then using the operation code from Intel to create the SIMD visualization of the vector instructions programmatically. Let me know if that's of interest to you. That would also make integration with Godbolt much easier.
@zbjornson That's an intriguing possibility. The operation elements are made of pseudocode, so we would need to write a parser for this pseudocode first. Do you know if they use a standard grammar? Then you need to map the parsed pseudocode to a graphical representation. Hmmm...
We call our technique "magic". It is only taught at the very best sorcery schools.
It is not that bad because a lot of instructions are going to fit into predetermined patterns. To code absolutely everything is going to be labor intensive, yes... but we think that coding many of the frequently used intrinsics ought to be fine.
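To make the "predetermined patterns" idea concrete, here is a minimal sketch of how a handful of packed-integer instructions could share one animation pattern. The `LanewisePattern` class, the `PATTERNS` table and the `describe` helper are hypothetical names invented for illustration; they are not taken from this project's source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LanewisePattern:
    """One reusable pattern: apply `op` to matching lanes of two inputs."""
    lane_bits: int
    op: str


# Hypothetical table mapping a mnemonic to a shared pattern. Only a few
# packed-integer instructions are listed; everything else is unsupported.
PATTERNS = {
    "vpaddb": LanewisePattern(8, "add"),
    "vpaddw": LanewisePattern(16, "add"),
    "vpaddd": LanewisePattern(32, "add"),
    "vpaddq": LanewisePattern(64, "add"),
    "vpsubd": LanewisePattern(32, "sub"),
}


def describe(asm_line: str) -> str:
    """Return a short description of the pattern for one line of assembly."""
    parts = asm_line.split()
    mnemonic = parts[0].lower() if parts else ""
    pattern = PATTERNS.get(mnemonic)
    if pattern is None:
        return "unsupported command"
    return f"lane-wise {pattern.op} on {pattern.lane_bits}-bit lanes"


print(describe("vpaddd ymm0, ymm1, ymm2"))
print(describe("vfixupimmsd xmm0, xmm1, xmm2, 0"))
```

Instructions that do not fit a shared pattern (shuffles, masked operations, fix-ups) would still need their own entries, which is where the labor goes.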
We tried like hell to parse C/C++ using JavaScript, but we did not manage to do it. I know, I know, it ought to be easy, right? I thought so, but apparently it is not. However you can send the C/C++ to a server, have it compile it down to assembler and then parse that. Told you: magic.
We already receive the assembly. The part that I am interested in is "SIMD visualization of the vector instructions programmatically". At this point, we are still discussing how to do it cheaply and correctly. If you have ideas, please share. One thing that we did not discuss is how to take the assembly instruction and do the corresponding computation... We want to do that in the browser, as part of the animation... and we want to make it so that the user can manipulate the data and see the effect. To this end, we are planning to go the WebAssembly route (got a prototype somewhere on GitHub).
Thanks for the info. I can take a shot at a parser for Intel's operation syntax that returns an AST. I think that will be relatively straightforward using something like PEG.js, and I think it will be a much faster, more accurate, and easier way of building this tool. Will try some evening this week or weekend.
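As a rough idea of what such a parser could look like, here is a minimal sketch. It is a hand-written recursive-descent parser in Python rather than a PEG.js grammar, and it only covers a tiny assumed subset of the operation syntax: bit-slice references, integer literals, `==`, the `?:` ternary, and `:=` assignment. The tuple-based AST shape is an assumption made for illustration.

```python
import re

# ":=" must be tried before ":" so that assignment is one token.
TOKEN = re.compile(r"\s*(:=|==|[A-Za-z_]\w*|\d+|[\[\]():?])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at {pos}: {text[pos]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.i += 1
        return tok

    def statement(self):  # ref ":=" expr
        target = self.ref()
        self.take(":=")
        value = self.expr()
        if self.peek() is not None:
            raise SyntaxError(f"trailing token {self.peek()!r}")
        return ("assign", target, value)

    def expr(self):  # cmp ("?" expr ":" expr)?
        cond = self.cmp()
        if self.peek() == "?":
            self.take("?")
            then = self.expr()
            self.take(":")
            return ("ternary", cond, then, self.expr())
        return cond

    def cmp(self):  # primary ("==" primary)?
        left = self.primary()
        if self.peek() == "==":
            self.take("==")
            return ("eq", left, self.primary())
        return left

    def primary(self):  # NUMBER | ref | "(" expr ")"
        tok = self.peek()
        if tok == "(":
            self.take("(")
            inner = self.expr()
            self.take(")")
            return inner
        if tok is not None and tok.isdigit():
            return ("num", int(self.take()))
        return self.ref()

    def ref(self):  # NAME ("[" NUMBER (":" NUMBER)? "]")?
        name = self.take()
        if not (name[0].isalpha() or name[0] == "_"):
            raise SyntaxError(f"expected a name, got {name!r}")
        if self.peek() != "[":
            return ("var", name)
        self.take("[")
        hi = lo = int(self.take())
        if self.peek() == ":":
            self.take(":")
            lo = int(self.take())
        self.take("]")
        return ("bits", name, hi, lo)


def parse(line):
    return Parser(tokenize(line)).statement()


print(parse("dst[63:0] := (imm8[0] == 0) ? a[63:0] : a[127:64]"))
```

The input line is written in the style of the operation pseudocode for a two-lane shuffle. Loops, `IF`/`FI` blocks, and function calls are left out here and would need additional rules.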
Given the size of the C++ spec, that definitely would not have been my guess. :) Can you point me to where you request/receive the assembly please? I looked for an exec/spawn call but couldn't find one.
With the parsed operation (what I said I'll work on above), I think you could translate that AST into JS operations fairly easily... handling signalings, things that read/write MXCSR and all the rounding ops will be a potential pain point.
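A minimal sketch of that translation step, written in Python instead of JS to keep it short. The AST here is a hand-built nested tuple whose shape is an assumption (it is not the output of any existing grammar), registers are modeled as plain Python integers, and floating-point behavior such as rounding, signaling, and MXCSR is not modeled at all.

```python
def bits(value, hi, lo):
    """Return value[hi:lo] (inclusive bounds) as an unsigned integer."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


def evaluate(node, env):
    """Evaluate an expression node against a dict of register values."""
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "bits":
        _, name, hi, lo = node
        return bits(env[name], hi, lo)
    if kind == "eq":
        return int(evaluate(node[1], env) == evaluate(node[2], env))
    if kind == "ternary":
        _, cond, then, other = node
        return evaluate(then if evaluate(cond, env) else other, env)
    raise ValueError(f"unknown node kind: {kind!r}")


def assign(node, env):
    """Execute ('assign', ('bits', name, hi, lo), expr), updating env in place."""
    target, expr = node[1], node[2]
    _, name, hi, lo = target
    width = hi - lo + 1
    field = (1 << width) - 1
    new_bits = evaluate(expr, env) & field
    # Clear the destination field, then write the new bits into it.
    env[name] = (env.get(name, 0) & ~(field << lo)) | (new_bits << lo)


# Hand-built AST for a two-lane, 64-bit shuffle selected by two bits of imm8.
program = [
    ("assign", ("bits", "dst", 63, 0),
     ("ternary", ("eq", ("bits", "imm8", 0, 0), ("num", 0)),
      ("bits", "a", 63, 0), ("bits", "a", 127, 64))),
    ("assign", ("bits", "dst", 127, 64),
     ("ternary", ("eq", ("bits", "imm8", 1, 1), ("num", 0)),
      ("bits", "b", 63, 0), ("bits", "b", 127, 64))),
]

env = {
    "a": (0x1111111111111111 << 64) | 0x2222222222222222,
    "b": (0x3333333333333333 << 64) | 0x4444444444444444,
    "imm8": 0b01,
}
for statement in program:
    assign(statement, env)
print(f"{env['dst']:032x}")
```

Because each statement records which source bits end up in which destination bits, the same walk over the AST could in principle drive the animation as well as the computation.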
The C spec is smaller... but even just supporting a limited subset proved to be a headache.
The assembly is requested via a REST call... Were you expecting us to launch a compiler from the browser? :-) I think most people don't have a compiler along with their browser.
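For readers wondering what such a REST call might look like, here is a minimal sketch that only builds the request and does not send it. Later in this thread the endpoint is identified as the godbolt API; the URL path and the JSON field names below are my recollection of Compiler Explorer's public API and should be checked against its documentation. The function name and the compiler id are hypothetical placeholders, and this is not the code this project uses.

```python
import json
import urllib.request


def build_compile_request(source, compiler_id, flags,
                          base_url="https://godbolt.org"):
    """Build (but do not send) a POST request asking a server to compile source."""
    # Assumed payload shape: source text plus user-supplied compiler flags.
    payload = {"source": source, "options": {"userArguments": flags}}
    return urllib.request.Request(
        url=f"{base_url}/api/compiler/{compiler_id}/compile",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Accept": "application/json"},
        method="POST",
    )


request = build_compile_request(
    source="int square(int x) { return x * x; }",
    compiler_id="example-compiler-id",  # placeholder, not a real compiler id
    flags="-O2 -mavx2",
)
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` and parsing the response; that part is omitted because the response format is not described in this thread.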
The flags are the least of our worries... JavaScript does not support unsigned ints, 32-bit floats, 64-bit ints... to say nothing of converting between those. This can all be done in JavaScript, but it risks becoming a giant wall of code, especially if you add the necessary testing. My current proposal is to represent all values as hexadecimal strings, and push them to a separate WebAssembly library for processing... https://github.com/lemire/jstypes But the computation itself is a relatively modest problem compared to the visualization. This being said, I do think that your approach has potentially great merit compared with our current struggle to scale this up.
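To illustrate the "hexadecimal strings" idea, here is a minimal sketch in Python (not WebAssembly, and not the jstypes API). It treats a register as a hex string, splits it into fixed-width unsigned lanes, and does a wrapping 32-bit add per lane. All function names are hypothetical.

```python
import struct


def lanes_from_hex(hex_value, lane_bits):
    """Split a hex string into unsigned lanes, lowest lane first."""
    total_bits = len(hex_value) * 4
    value = int(hex_value, 16)
    mask = (1 << lane_bits) - 1
    return [(value >> shift) & mask
            for shift in range(0, total_bits, lane_bits)]


def hex_from_lanes(lanes, lane_bits):
    """Pack unsigned lanes (lowest first) back into a zero-padded hex string."""
    value = 0
    for index, lane in enumerate(lanes):
        value |= (lane & ((1 << lane_bits) - 1)) << (index * lane_bits)
    return format(value, f"0{len(lanes) * lane_bits // 4}x")


def add_epi32(a_hex, b_hex):
    """Lane-wise 32-bit addition with wraparound on two hex-string registers."""
    a = lanes_from_hex(a_hex, 32)
    b = lanes_from_hex(b_hex, 32)
    return hex_from_lanes([(x + y) & 0xFFFFFFFF for x, y in zip(a, b)], 32)


def lane_as_float32(lane):
    """Reinterpret a 32-bit unsigned lane as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", lane))[0]


result = add_epi32("ffffffff000000010000000200000003",
                   "00000001000000010000000100000001")
print(result)
print(lane_as_float32(0x3F800000))
```

Keeping the register as a string sidesteps the host language's number types entirely: the same 128 bits can be shown as four 32-bit integers, two 64-bit integers, or four floats just by choosing a different lane view.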
I see now it goes to the godbolt API; I was expecting the compiler to be run by this module's server. I have a grammar mostly done (I have two more expression types to add). Two examples below, including a behemoth from AVX-512:
_mm_shuffle_pd
_mm_mask_fixupimm_sd
I'm not entirely positive that this will ultimately be useful, but it'll be available.
Have you considered recording the process executed on a real processor and just replaying it in the browser? See [1-4]. I'm not sure if it's possible to record on AWS since most perf counters are locked down, but that would eliminate the need to emulate in JS. [1] https://software.intel.com/en-us/blogs/2013/09/18/processor-tracing
Looking forward to your grammar.
It certainly is possible to run the function through a debugger. You can control lldb via Python. Visual Studio has a great C++ debugger.
Hi folks, cool project!
I was trying out a few of my code snippets but can't seem to get them to work. Here's an example:
Removing the `&` in the parameter list and making it return `prodv` gets it a bit further, but not much. All of the instructions are reported as "unsupported command".