Support x86_64h builds in macOS, improving performance for nearly all intel macs #11150
Comments
I implemented the slice (without changing defaults) as In godot-cpp, the option works without problems. Check out my PR. In Godot itself (check out my branch), I've got issues with embree: On the x86_64h slice, it fails to find the expected If anybody has some embree experience, please help me out :) Error Log
It may be useful to tie this to the target macOS version (which we don't have a setting for right now, and which would probably be a good idea to add...). If a developer is targeting macOS 12.0 or later, which does not run on pre-Haswell CPUs, then we should use
As for "Add x86_64h as a target", I'm not sure about this. The specific name isn't something recognized on all platforms. One of the goals we had with standardizing the architecture names for Godot 4.0 is to make them consistent across platforms, so for example, x86_64 is x86_64 everywhere: on Windows it's x86_64 and not x64, on Linux it's x86_64 and not amd64. I would look for some OS-agnostic way of specifying this, for example, setting a minimum architecture version to some value on Godot's side (
Describe the project you are working on
Something Godot related.
Describe the problem or limitation you are having in your project
I'm proposing a performance improvement on nearly all Intel Macs (2013 and newer), at the cost of binary size (or compatibility).
Describe the feature / enhancement and how it helps to overcome the problem or limitation
On macOS, Godot currently builds for x86_64 and arm64. macOS supports another relevant type of binary slice, x86_64h. This binary slice is preferred over x86_64 on all Haswell and newer CPUs. This includes all post-2013 Macs. x86_64, for comparison, was adopted by Apple in 2006, and its supported instruction set was likely finalized then.
Among others, the x86_64h slice implies Haswell or newer and therefore enables the SIMD instruction sets SSE3, SSSE3, SSE4.1, SSE4.2, AVX and AVX2 (and auto-vectorization for them), improving performance. @Calinou has recently, for another GOP, benchmarked the performance gains from just SSE4.2 (and the sets it implies) at ~10%. It is reasonable to assume the other implied flags from this architecture change will improve performance further. I have found an up to 2x throughput increase (high-volume float addition, 50k floats, 600ms vs 300ms), though this improvement is limited to specific applications of AVX2.
Here's a complete list of instruction set changes (reference):
diff <(g++ -Q --help=target) <(g++ -Q -march=haswell --help=target)
(Note that passing -march=haswell is not exactly the same as building for the x86_64h binary slice. For example, mtune may be set to something different than core2, such as generic. I don't know a good way to test the exact command though.)
Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams
Fortunately it's pretty simple: detect.py would either:
- Add x86_64h as a target, increasing the universal fat binary size (by up to 50%).
- […] x86_64h again from released games, for projects where binary size is more important than speed.
- Replace x86_64 with x86_64h, cutting support for Macs older than 2013.
- […] march=haswell (or similar but lower, such as nehalem) on other x86_64 targets too, so that Linux and Windows can also benefit from the performance changes.
- […] x86_64h to the build, but do not modify any defaults.
Additionally, for non-universal GDExtensions, Godot would need to expose an x86_64h feature tag.
If this enhancement will not be used often, can it be worked around with a few lines of script?
It cannot.
Is there a reason why this should be core and not an add-on in the asset library?
It's core.
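To make the detect.py options from the proposal above more concrete, here is a minimal Python sketch of how the choice between them might be modelled. Everything in it is hypothetical: the function names, the strategy strings and the enable_x86_64h switch are illustrative assumptions and do not correspond to existing Godot build options. It also assumes the compiler accepts one -arch flag per slice, as Apple's clang does for universal builds.

```python
# Hypothetical sketch only: these names are NOT part of Godot's detect.py.

DEFAULT_ARCHES = ["x86_64", "arm64"]  # slices the proposal says are built today


def select_macos_arches(strategy, enable_x86_64h=False):
    """Return the list of binary slices to build for a given strategy.

    strategy is one of (assumed names):
      "add"     - build x86_64h in addition to the defaults (larger binary)
      "replace" - build x86_64h instead of x86_64 (drops pre-2013 Macs)
      "opt_in"  - keep the defaults unless enable_x86_64h is set
    """
    if strategy == "add":
        return ["x86_64", "x86_64h", "arm64"]
    if strategy == "replace":
        return ["x86_64h", "arm64"]
    if strategy == "opt_in":
        if enable_x86_64h:
            return DEFAULT_ARCHES + ["x86_64h"]
        return list(DEFAULT_ARCHES)
    raise ValueError(f"unknown strategy: {strategy!r}")


def arch_flags(arches):
    """Build the compiler flag list, one '-arch <name>' pair per slice."""
    flags = []
    for arch in arches:
        flags += ["-arch", arch]
    return flags


if __name__ == "__main__":
    for strategy in ("add", "replace", "opt_in"):
        print(strategy, arch_flags(select_macos_arches(strategy)))
```

With the "opt_in" strategy the default output stays identical to the current build, which matches the variant that adds the slice without modifying any defaults.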