gh-117578: Fix inlining regression in PyType_GetModuleByDef() #123100
Hm, it doesn't sound right to override profile-guided optimization, especially since |
It is mentioned on the faster-cpython repo that the telco test has slowed down a lot. According to MSVC, the module state access counts were:
Tested with the
This patch would be needed to match the speed of global state access on Windows, although on its own it has little effect (1%) for some reason.
Windows PGO: |
Is it acceptable that
@requires_cdecimal
class CArithmeticOperatorsTest(ArithmeticOperatorsTest, unittest.TestCase):
    ...
    @unittest.skipIf(not test.support.PGO, 'PGO training only')
    def test_excecise_binop(self):
        Decimal = self.decimal.Decimal
        d = Decimal('11.1')
        for i in range(500000):
            1 + d  # at least 300000 times
I'll try |
Closing in favor of proposing the |
On main and 3.13, there are cases where the get_module_by_def() function in typeobject.c is not inlined into its wrapper functions:

get_module_by_def() in PyType_GetModuleByDef(): called with /Ob2, inlined with /Ob3
get_module_by_def() in _PyType_GetModuleByDef2(): called with /Ob2, inlined with /Ob3

Non-builtin modules can incur extra function-call overhead, since the wrappers themselves cannot be inlined.

This PR applies Py_ALWAYS_INLINE to the callee.

cc @encukou
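
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of forcing a shared static callee to be inlined into its exported wrappers. It is not the code in typeobject.c: every `sketch_` name, the struct layouts, and the `SKETCH_ALWAYS_INLINE` macro are hypothetical stand-ins, and the macro only approximates what a force-inline macro such as Py_ALWAYS_INLINE typically expands to (`__attribute__((always_inline))` on GCC/Clang, `__forceinline` on MSVC). The left-then-right lookup in the second wrapper is likewise an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a force-inline macro; an assumption,
 * not CPython's actual definition. */
#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_ALWAYS_INLINE __attribute__((always_inline))
#elif defined(_MSC_VER)
#  define SKETCH_ALWAYS_INLINE __forceinline
#else
#  define SKETCH_ALWAYS_INLINE
#endif

/* Simplified, hypothetical models of a module definition and a type. */
typedef struct sketch_def { int id; } sketch_def;

typedef struct sketch_type {
    const struct sketch_type *base;  /* single-inheritance chain, NULL at the root */
    const sketch_def *def;           /* definition of the module that owns the type */
    void *module;                    /* the owning module object */
} sketch_type;

/* Shared callee. Without the attribute, the compiler's heuristics decide
 * whether this is inlined into each wrapper; with it, the request is forced. */
static inline SKETCH_ALWAYS_INLINE void *
sketch_get_module_by_def(const sketch_type *type, const sketch_def *def)
{
    assert(def != NULL);
    /* Walk the base chain until a type created from `def` is found. */
    for (const sketch_type *t = type; t != NULL; t = t->base) {
        if (t->def == def) {
            return t->module;
        }
    }
    return NULL;
}

/* Exported wrapper: callers in other binaries still pay for this call,
 * so the nested call to the helper is the one worth eliminating. */
void *
sketch_GetModuleByDef(const sketch_type *type, const sketch_def *def)
{
    return sketch_get_module_by_def(type, def);
}

/* Second wrapper that tries two types in turn (assumed semantics). */
void *
sketch_GetModuleByDef2(const sketch_type *left, const sketch_type *right,
                       const sketch_def *def)
{
    void *module = sketch_get_module_by_def(left, def);
    return module != NULL ? module : sketch_get_module_by_def(right, def);
}
```

The attribute changes only code generation, not behavior, so whether it helps has to be confirmed by inspecting the generated code or by benchmarking, as was done in this thread for the /Ob2 and /Ob3 builds.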