
Support x86_64h builds in macOS, improving performance for nearly all intel macs #11150

Open
Ivorforce opened this issue Nov 13, 2024 · 2 comments


Ivorforce commented Nov 13, 2024

Describe the project you are working on

Something Godot related.

Describe the problem or limitation you are having in your project

I'm proposing a performance improvement for nearly all Intel Macs (2013 and newer), at the cost of binary size (or compatibility).

Describe the feature / enhancement and how it helps to overcome the problem or limitation

On macOS, Godot currently builds for x86_64 and arm64.

macOS supports another relevant binary slice, x86_64h. This slice is preferred over x86_64 on all Haswell and newer CPUs, which includes all post-2013 Macs. x86_64, for comparison, was adopted by Apple in 2006, and its supported instruction set was likely fixed at that point.

Among other things, the x86_64h slice implies Haswell or newer, and therefore enables the SIMD instruction sets SSE3, SSSE3, SSE4.1, SSE4.2, AVX and AVX2 (and auto-vectorization for them), improving performance. @Calinou recently benchmarked, for another GOP, the gains from enabling just SSE4.2 (and the instruction sets it implies) at ~10%. It is reasonable to assume that the other flags implied by this architecture change will improve performance further. I have measured up to a 2x throughput increase (high-volume float addition, 50k floats, 600 ms vs. 300 ms), though this improvement is limited to specific applications of AVX2.

Here's a complete list of instruction set changes (reference):

diff <(g++ -Q --help=target) <(g++ -Q -march=haswell --help=target)

(Note that passing -march=haswell is not exactly the same as building for the x86_64h binary slice. For example, -mtune may be set to something other than core2, such as generic. I don't know a good way to test the exact command, though.)

 ❯ diff <(g++ -Q --help=target) <(g++ -Q -march=haswell --help=target)
31c31
<   -march=                                     x86-64
---
>   -march=                                     haswell
34,35c34,35
<   -mavx                                       [disabled]
<   -mavx2                                      [disabled]
---
>   -mavx                                       [enabled]
>   -mavx2                                      [enabled]
60,61c60,61
<   -mbmi                                       [disabled]
<   -mbmi2                                      [disabled]
---
>   -mbmi                                       [enabled]
>   -mbmi2                                      [enabled]
74,75c74,75
<   -mcrc32                                     [disabled]
<   -mcx16                                      [disabled]
---
>   -mcrc32                                     [enabled]
>   -mcx16                                      [enabled]
82c82
<   -mf16c                                      [disabled]
---
>   -mf16c                                      [enabled]
88c88
<   -mfma                                       [disabled]
---
>   -mfma                                       [enabled]
94c94
<   -mfsgsbase                                  [disabled]
---
>   -mfsgsbase                                  [enabled]
103c103
<   -mhle                                       [disabled]
---
>   -mhle                                       [enabled]
123c123
<   -mlzcnt                                     [disabled]
---
>   -mlzcnt                                     [enabled]
130c130
<   -mmovbe                                     [disabled]
---
>   -mmovbe                                     [enabled]
133c133
<   -mmove-max=                                 128
---
>   -mmove-max=                                 256
144c144
<   -mno-sse4                                   [enabled]
---
>   -mno-sse4                                   [disabled]
151c151
<   -mpclmul                                    [disabled]
---
>   -mpclmul                                    [enabled]
155c155
<   -mpopcnt                                    [disabled]
---
>   -mpopcnt                                    [enabled]
166c166
<   -mrdrnd                                     [disabled]
---
>   -mrdrnd                                     [enabled]
177c177
<   -msahf                                      [disabled]
---
>   -msahf                                      [enabled]
189,191c189,191
<   -msse4                                      [disabled]
<   -msse4.1                                    [disabled]
<   -msse4.2                                    [disabled]
---
>   -msse4                                      [enabled]
>   -msse4.1                                    [enabled]
>   -msse4.2                                    [enabled]
195c195
<   -mssse3                                     [disabled]
---
>   -mssse3                                     [enabled]
202c202
<   -mstore-max=                                128
---
>   -mstore-max=                                256
213c213
<   -mtune=                                     core2
---
>   -mtune=                                     haswell
226c226
<   -mxsave                                     [disabled]
---
>   -mxsave                                     [enabled]
228c228
<   -mxsaveopt                                  [disabled]
---
>   -mxsaveopt                                  [enabled]
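
For readers who want a compact summary of a listing like the one above, here is a minimal sketch that parses normal-format `diff` output and collects the flags whose value changed. The function name `changed_flags` and the shortened `sample` text are illustrative assumptions; the sample lines are copied from the diff above.

```python
import re

def changed_flags(diff_text):
    """Return {flag: (old_value, new_value)} from normal-format `diff` output
    comparing two `g++ -Q --help=target` listings."""
    old, new = {}, {}
    for line in diff_text.splitlines():
        # Lines look like "<   -mavx      [disabled]" or ">   -march=   haswell".
        m = re.match(r"^([<>])\s+(-m\S+)\s+(\S+)\s*$", line)
        if not m:
            continue  # skip hunk headers such as "31c31" and "---" separators
        side, flag, value = m.groups()
        (old if side == "<" else new)[flag] = value
    return {f: (old[f], new[f]) for f in old if f in new and old[f] != new[f]}

# Shortened excerpt of the diff shown above.
sample = """\
31c31
<   -march=                                     x86-64
---
>   -march=                                     haswell
34,35c34,35
<   -mavx                                       [disabled]
<   -mavx2                                      [disabled]
---
>   -mavx                                       [enabled]
>   -mavx2                                      [enabled]
"""

changes = changed_flags(sample)
newly_enabled = sorted(f for f, (_, n) in changes.items() if n == "[enabled]")
print(newly_enabled)  # ['-mavx', '-mavx2']
```

Feeding it the full output instead of the excerpt should list every flag that flips from `[disabled]` to `[enabled]`.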

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

Fortunately it's pretty simple: detect.py would do one of the following:

  • Add x86_64h as a target, increasing the universal fat binary size (by up to 50%).
    • It may be possible to add an export template option to strip x86_64h from released games again, for projects where binary size matters more than speed.
  • Replace x86_64 with x86_64h, cutting support for Macs older than 2013.
    • macOS itself cut support for pre-Haswell computers in 2021 with Monterey.
    • If we do this, it's worth asking whether we should also pass -march=haswell (or something similar but lower, such as nehalem) on other x86_64 targets, so that Linux and Windows can benefit from the performance changes too.
  • Add an option to include x86_64h in the build, without modifying any defaults.
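
As a rough illustration of the three options above, the sketch below maps a target architecture and an x86_64h mode to the `-arch` flags that would be passed to Apple's clang. This is not Godot's actual detect.py code: the function name `macos_arch_flags`, the `x86_64h_mode` parameter and its values (`"off"`, `"add"`, `"replace"`) are hypothetical names chosen for this example.

```python
# Hypothetical helper, not taken from Godot's detect.py.
SLICES_BY_ARCH = {
    "x86_64": ["x86_64"],
    "arm64": ["arm64"],
    "universal": ["x86_64", "arm64"],
}

def macos_arch_flags(arch, x86_64h_mode="off"):
    """Return the clang `-arch` flags for the requested slices.

    x86_64h_mode:
      "off"     - keep the current behaviour (no x86_64h slice)
      "add"     - build x86_64h in addition to x86_64
      "replace" - build x86_64h instead of x86_64
    """
    if x86_64h_mode not in ("off", "add", "replace"):
        raise ValueError(f"unknown x86_64h_mode: {x86_64h_mode!r}")
    slices = []
    for name in SLICES_BY_ARCH[arch]:
        if name == "x86_64" and x86_64h_mode == "add":
            slices += ["x86_64", "x86_64h"]
        elif name == "x86_64" and x86_64h_mode == "replace":
            slices.append("x86_64h")
        else:
            slices.append(name)
    flags = []
    for name in slices:
        flags += ["-arch", name]
    return flags

print(macos_arch_flags("universal", "add"))
```

With `"add"`, a universal build gains a third slice, which is where the binary size increase mentioned above comes from; with `"replace"`, the slice count stays the same.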

Additionally, for non-universal GDExtensions, Godot would need to expose an x86_64h feature tag.

If this enhancement will not be used often, can it be worked around with a few lines of script?

It cannot.

Is there a reason why this should be core and not an add-on in the asset library?

It's core.

Author

Ivorforce commented Nov 17, 2024

I implemented the slice (without changing defaults) as with_x86_64h. This would not allow compiling without x86_64 as a base slice, but it is unopinionated and allows users to benefit from the option if they need to squeeze out some more speed without sacrificing compatibility.

In godot-cpp, the option works no problem. Check out my PR.

In Godot itself (check out my branch), I've got issues with Embree: on the x86_64h slice, the linker fails to find the expected AVX2 symbols. I spent some time looking, and I think it may have to do with the flags EMBREE_TARGET_AVX2 and EMBREE_TARGET_AVX. But since the build fails outright, I think it may be an Embree bug: instead of defaulting to either using AVX or not using it, it simply fails unless additional flags are set.

If anybody has some embree experience, please help me out :)

Error Log
[100%] Linking Program bin/godot.macos.editor.x86_64 ...
ld: Undefined symbols:
  embree::BVHN<4>::clearBarrier(embree::NodeRefPtr<4>&), referenced from:
      embree::avx2::BVHNMeshBuilderMorton<4, embree::TriangleMesh, embree::TriangleM<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][39](bvh_builder_morton.macos.editor.x86_64.o)
      embree::avx2::BVHNMeshBuilderMorton<4, embree::TriangleMesh, embree::TriangleMv<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][39](bvh_builder_morton.macos.editor.x86_64.o)
      embree::avx2::BVHNMeshBuilderMorton<4, embree::TriangleMesh, embree::TriangleMi<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][39](bvh_builder_morton.macos.editor.x86_64.o)
  embree::BVHN<4>::layoutLargeNodes(unsigned long), referenced from:
      embree::avx2::BVHNBuilderSAH<4, embree::TriangleM<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][40](bvh_builder_sah.macos.editor.x86_64.o)
      embree::avx2::BVHNBuilderSAH<4, embree::TriangleMv<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][40](bvh_builder_sah.macos.editor.x86_64.o)
      embree::avx2::BVHNBuilderSAH<4, embree::TriangleMi<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][40](bvh_builder_sah.macos.editor.x86_64.o)
      embree::avx2::BVHNBuilderFastSpatialSAH<4, embree::TriangleMesh, embree::TriangleM<4>, embree::avx2::TriangleSplitterFactory>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][41](bvh_builder_sah_spatial.macos.editor.x86_64.o)
      embree::avx2::BVHNBuilderFastSpatialSAH<4, embree::TriangleMesh, embree::TriangleMv<4>, embree::avx2::TriangleSplitterFactory>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][41](bvh_builder_sah_spatial.macos.editor.x86_64.o)
      embree::avx2::BVHNBuilderFastSpatialSAH<4, embree::TriangleMesh, embree::TriangleMi<4>, embree::avx2::TriangleSplitterFactory>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][41](bvh_builder_sah_spatial.macos.editor.x86_64.o)
  embree::BVHN<4>::set(embree::NodeRefPtr<4>, embree::LBBox<embree::Vec3fa> const&, unsigned long), referenced from:
      embree::avx2::BVHNMeshBuilderMorton<4, embree::TriangleMesh, embree::TriangleM<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][39](bvh_builder_morton.macos.editor.x86_64.o)
      embree::avx2::BVHNMeshBuilderMorton<4, embree::TriangleMesh, embree::TriangleM<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][39](bvh_builder_morton.macos.editor.x86_64.o)
      embree::avx2::BVHNMeshBuilderMorton<4, embree::TriangleMesh, embree::TriangleMv<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][39](bvh_builder_morton.macos.editor.x86_64.o)
      embree::avx2::BVHNMeshBuilderMorton<4, embree::TriangleMesh, embree::TriangleMv<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][39](bvh_builder_morton.macos.editor.x86_64.o)
      embree::avx2::BVHNMeshBuilderMorton<4, embree::TriangleMesh, embree::TriangleMi<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][39](bvh_builder_morton.macos.editor.x86_64.o)
      embree::avx2::BVHNMeshBuilderMorton<4, embree::TriangleMesh, embree::TriangleMi<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][39](bvh_builder_morton.macos.editor.x86_64.o)
      embree::avx2::BVHNBuilderSAH<4, embree::TriangleM<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][40](bvh_builder_sah.macos.editor.x86_64.o)
      ...
  embree::BVHN<4>::preBuild(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&), referenced from:
      embree::avx2::BVHNBuilderSAH<4, embree::TriangleM<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][40](bvh_builder_sah.macos.editor.x86_64.o)
      embree::avx2::BVHNBuilderSAH<4, embree::TriangleMv<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][40](bvh_builder_sah.macos.editor.x86_64.o)
      embree::avx2::BVHNBuilderSAH<4, embree::TriangleMi<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][40](bvh_builder_sah.macos.editor.x86_64.o)
      embree::avx2::BVHNBuilderSAHQuantized<4, embree::TriangleMi<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][40](bvh_builder_sah.macos.editor.x86_64.o)
      embree::avx2::BVHNBuilderFastSpatialSAH<4, embree::TriangleMesh, embree::TriangleM<4>, embree::avx2::TriangleSplitterFactory>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][41](bvh_builder_sah_spatial.macos.editor.x86_64.o)
      embree::avx2::BVHNBuilderFastSpatialSAH<4, embree::TriangleMesh, embree::TriangleMv<4>, embree::avx2::TriangleSplitterFactory>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][41](bvh_builder_sah_spatial.macos.editor.x86_64.o)
      embree::avx2::BVHNBuilderFastSpatialSAH<4, embree::TriangleMesh, embree::TriangleMi<4>, embree::avx2::TriangleSplitterFactory>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][41](bvh_builder_sah_spatial.macos.editor.x86_64.o)
      ...
  embree::BVHN<4>::postBuild(double), referenced from:
      embree::avx2::BVHNBuilderSAH<4, embree::TriangleM<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][40](bvh_builder_sah.macos.editor.x86_64.o)
      embree::avx2::BVHNBuilderSAH<4, embree::TriangleMv<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][40](bvh_builder_sah.macos.editor.x86_64.o)
      embree::avx2::BVHNBuilderSAH<4, embree::TriangleMi<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][40](bvh_builder_sah.macos.editor.x86_64.o)
      embree::avx2::BVHNBuilderSAHQuantized<4, embree::TriangleMi<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][40](bvh_builder_sah.macos.editor.x86_64.o)
      embree::avx2::BVHNBuilderFastSpatialSAH<4, embree::TriangleMesh, embree::TriangleM<4>, embree::avx2::TriangleSplitterFactory>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][41](bvh_builder_sah_spatial.macos.editor.x86_64.o)
      embree::avx2::BVHNBuilderFastSpatialSAH<4, embree::TriangleMesh, embree::TriangleMv<4>, embree::avx2::TriangleSplitterFactory>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][41](bvh_builder_sah_spatial.macos.editor.x86_64.o)
      embree::avx2::BVHNBuilderFastSpatialSAH<4, embree::TriangleMesh, embree::TriangleMi<4>, embree::avx2::TriangleSplitterFactory>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][41](bvh_builder_sah_spatial.macos.editor.x86_64.o)
      ...
  embree::BVHN<4>::BVHN(embree::PrimitiveType const&, embree::Scene*), referenced from:
      embree::BVH4Factory::BVH4Triangle4(embree::Scene*, embree::BVHFactory::BuildVariant, embree::BVHFactory::IntersectVariant) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][33](bvh4_factory.macos.editor.x86_64.o)
      embree::BVH4Factory::BVH4Triangle4v(embree::Scene*, embree::BVHFactory::BuildVariant, embree::BVHFactory::IntersectVariant) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][33](bvh4_factory.macos.editor.x86_64.o)
      embree::BVH4Factory::BVH4Triangle4i(embree::Scene*, embree::BVHFactory::BuildVariant, embree::BVHFactory::IntersectVariant) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][33](bvh4_factory.macos.editor.x86_64.o)
      embree::BVH4Factory::BVH4Triangle4iMB(embree::Scene*, embree::BVHFactory::BuildVariant, embree::BVHFactory::IntersectVariant) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][33](bvh4_factory.macos.editor.x86_64.o)
      embree::BVH4Factory::BVH4Triangle4vMB(embree::Scene*, embree::BVHFactory::BuildVariant, embree::BVHFactory::IntersectVariant) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][33](bvh4_factory.macos.editor.x86_64.o)
      embree::BVH4Factory::BVH4QuantizedTriangle4i(embree::Scene*) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][33](bvh4_factory.macos.editor.x86_64.o)
      embree::avx2::BVHNBuilderTwoLevel<4, embree::TriangleMesh, embree::TriangleM<4>>::createMeshAccel(unsigned long, embree::Builder*&) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][43](bvh_builder_twolevel.macos.editor.x86_64.o)
      ...
  embree::avx2::BVH8Triangle4vIntersector1Woop(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4Intersector1Moeller(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4iIntersector1Moeller(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::QBVH8Triangle4Intersector1Moeller(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4iIntersector1Pluecker(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4vIntersector1Pluecker(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4iMBIntersector1Moeller(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4vMBIntersector1Moeller(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::QBVH8Triangle4iIntersector1Pluecker(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4iMBIntersector1Pluecker(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4vMBIntersector1Pluecker(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH4Triangle4Intersector8HybridMoeller(), referenced from:
      embree::BVH4Factory::BVH4Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][33](bvh4_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4Intersector4HybridMoeller(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4Intersector8HybridMoeller(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH4Triangle4iIntersector8HybridMoeller(), referenced from:
      embree::BVH4Factory::BVH4Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][33](bvh4_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4iIntersector4HybridMoeller(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4iIntersector8HybridMoeller(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH4Triangle4iIntersector8HybridPluecker(), referenced from:
      embree::BVH4Factory::BVH4Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][33](bvh4_factory.macos.editor.x86_64.o)
  embree::avx2::BVH4Triangle4vIntersector8HybridPluecker(), referenced from:
      embree::BVH4Factory::BVH4Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][33](bvh4_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4iIntersector4HybridPluecker(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4iIntersector8HybridPluecker(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4vIntersector4HybridPluecker(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4vIntersector8HybridPluecker(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH4Triangle4iMBIntersector8HybridMoeller(), referenced from:
      embree::BVH4Factory::BVH4Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][33](bvh4_factory.macos.editor.x86_64.o)
  embree::avx2::BVH4Triangle4vMBIntersector8HybridMoeller(), referenced from:
      embree::BVH4Factory::BVH4Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][33](bvh4_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4iMBIntersector4HybridMoeller(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4iMBIntersector8HybridMoeller(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4vMBIntersector4HybridMoeller(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4vMBIntersector8HybridMoeller(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH4Triangle4iMBIntersector8HybridPluecker(), referenced from:
      embree::BVH4Factory::BVH4Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][33](bvh4_factory.macos.editor.x86_64.o)
  embree::avx2::BVH4Triangle4vMBIntersector8HybridPluecker(), referenced from:
      embree::BVH4Factory::BVH4Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][33](bvh4_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4iMBIntersector4HybridPluecker(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4iMBIntersector8HybridPluecker(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4vMBIntersector4HybridPluecker(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4vMBIntersector8HybridPluecker(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH4Triangle4Intersector8HybridMoellerNoFilter(), referenced from:
      embree::BVH4Factory::BVH4Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][33](bvh4_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4Intersector4HybridMoellerNoFilter(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4Intersector8HybridMoellerNoFilter(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
clang: error: linker command failed with exit code 1 (use -v to see invocation)
scons: *** [bin/godot.macos.editor.x86_64] Error 1
scons: building terminated because of errors.
[Time elapsed: 00:00:33.97]
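
A log of this size is easier to reason about when the undefined symbols are grouped. The sketch below pulls the symbol names out of an `ld` "Undefined symbols" listing and splits them into those inside the `embree::avx2::` namespace and those outside it. The helper names and the shortened `sample_log` are assumptions made for illustration; the sample lines are copied from the log above.

```python
def undefined_symbols(log_text):
    """Collect the symbol names from lines ending in ', referenced from:'."""
    marker = ", referenced from:"
    symbols = []
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.endswith(marker):
            symbols.append(stripped[: -len(marker)])
    return symbols

def split_by_namespace(symbols, namespace="embree::avx2::"):
    """Split symbols into (inside namespace, outside namespace)."""
    inside = [s for s in symbols if s.startswith(namespace)]
    outside = [s for s in symbols if not s.startswith(namespace)]
    return inside, outside

# Shortened excerpt of the linker output shown above.
sample_log = """\
ld: Undefined symbols:
  embree::BVHN<4>::postBuild(double), referenced from:
      embree::avx2::BVHNBuilderSAH<4, embree::TriangleM<4>>::build() in libmodule_raycast.macos.editor.x86_64.a[x86_64h][40](bvh_builder_sah.macos.editor.x86_64.o)
  embree::avx2::BVH8Triangle4vIntersector1Woop(), referenced from:
      embree::BVH8Factory::BVH8Factory(int, int) in libmodule_raycast.macos.editor.x86_64.a[x86_64h][34](bvh8_factory.macos.editor.x86_64.o)
"""

avx2_syms, other_syms = split_by_namespace(undefined_symbols(sample_log))
print(len(avx2_syms), "avx2 symbols;", len(other_syms), "other symbols")
```

In the full log, the missing symbols fall into both groups: the `embree::avx2::` intersectors as well as plain `embree::BVHN<4>` members referenced from AVX2 builders.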

Member

aaronfranke commented Nov 20, 2024

It may be useful to tie this to the target macOS version (which we don't have a setting for right now, and probably should...). If a developer is targeting macOS 12.0 or later, which does not run on pre-Haswell CPUs, then we should use x86_64h automatically. If a developer is targeting an older macOS version, though, they probably expect their app to run on old Macs by default. However, we could also add an option to use x86_64h when targeting an older macOS version.
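
The rule described in this comment could look roughly like the sketch below. It assumes a minimum macOS version setting exists (the comment notes there is none today); the names `parse_version`, `default_x86_slice` and `force_x86_64h` are hypothetical, and the 12.0 threshold is taken from the comment.

```python
def parse_version(text):
    """Turn '12', '10.13' or '13.1.2' into a comparable tuple of ints."""
    parts = [int(p) for p in text.split(".")]
    while len(parts) < 2:  # pad so that '12' compares equal to '12.0'
        parts.append(0)
    return tuple(parts)

def default_x86_slice(min_macos_version, force_x86_64h=False):
    """Pick the Intel slice for a given minimum macOS version.

    Targets at or above 12.0 get x86_64h automatically; older targets keep
    x86_64 unless the (hypothetical) opt-in flag is set.
    """
    if force_x86_64h or parse_version(min_macos_version) >= (12, 0):
        return "x86_64h"
    return "x86_64"

print(default_x86_slice("10.13"))                      # x86_64
print(default_x86_slice("12.0"))                       # x86_64h
print(default_x86_slice("10.13", force_x86_64h=True))  # x86_64h
```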

As for "Add x86_64h as a target", I'm not sure about this. The specific name isn't something recognized on all platforms. One of the goals we had with standardizing the architecture names for Godot 4.0 was to make them consistent across platforms, so for example, x86_64 is x86_64 everywhere: on Windows it's x86_64 and not x64, on Linux it's x86_64 and not amd64. I would look for some OS-agnostic way of specifying this, for example, setting a minimum architecture version to some value on Godot's side (haswell?), which can then map to x86_64h on macOS and to other compiler flags on Windows/Linux. We would then also allow other values in this field (e.g. skylake could behave differently on Windows/Linux, but still use x86_64h on macOS). However, there are other architectures with standardized names where it makes more sense to put it all in one, such as RISC-V, where rv64 refers to the 64-bit edition of the architecture and rv64gcv includes various extensions.
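
One way to picture the OS-agnostic setting suggested here is a lookup table from a minimum architecture level to per-toolchain flags. The table below is only a sketch: the level names, the dictionary keys (`macos`, `gcc_clang`, `msvc`) and the function `flags_for` are assumptions, and the MSVC entries are approximations, since `/arch:AVX2` does not correspond exactly to a specific CPU generation.

```python
# Hypothetical mapping; neither the keys nor the level names come from Godot.
MIN_ARCH_FLAGS = {
    "baseline": {
        "macos": ["-arch", "x86_64"],
        "gcc_clang": ["-march=x86-64"],
        "msvc": [],
    },
    "haswell": {
        "macos": ["-arch", "x86_64h"],
        "gcc_clang": ["-march=haswell"],
        "msvc": ["/arch:AVX2"],  # approximation: closest MSVC switch
    },
    "skylake": {
        "macos": ["-arch", "x86_64h"],  # macOS has no finer-grained slice
        "gcc_clang": ["-march=skylake"],
        "msvc": ["/arch:AVX2"],  # approximation: closest MSVC switch
    },
}

def flags_for(min_arch, toolchain):
    """Look up the compiler flags for a minimum architecture level."""
    try:
        return list(MIN_ARCH_FLAGS[min_arch][toolchain])
    except KeyError:
        raise ValueError(f"unsupported combination: {min_arch!r} / {toolchain!r}")

print(flags_for("haswell", "macos"))
print(flags_for("skylake", "gcc_clang"))
```

Keeping the user-facing value generic (`haswell`, `skylake`) and resolving it per toolchain means the `x86_64h` name never has to appear outside the macOS code path.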
