gh-123471: Make itertools.chain thread-safe #135689
Conversation
ISTM that most of the "make thread-safe" patches going in everywhere from heapq to itertools add overhead to the builds that don't get any benefit from it. The code is getting harder to read and maintain. AFAICT most of these edits are going in without tests, so it's hard to tell whether they are achieving their goal. Also, there seem to be differing notions of what "thread-safe" means, from just not segfaulting to strict guarantees that underlying iterators are not called concurrently. Also, have you been testing for performance impacts on the standard (non-nogil) build that everyone uses in production? Almost the entire point of itertools is to add minimal overhead to the underlying data producers.
Misc/NEWS.d/next/Library/2025-06-18-19-25-32.gh-issue-123471.lx1Xbt.rst
barrier.wait()
while True:
    try:
        _ = next(it)
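The diff lines above appear to come from a free-threading stress test in which several threads consume a shared `itertools.chain` object at once. A self-contained sketch of that pattern (the function name `run_stress_test` and the parameter values are assumptions, not the PR's actual test) might look like:

```python
import itertools
import threading

def run_stress_test(number_of_threads=8, items_per_iterable=1000):
    # A single chain object iterated concurrently from several threads.
    it = itertools.chain(range(items_per_iterable), range(items_per_iterable))
    barrier = threading.Barrier(number_of_threads)
    counts = [0] * number_of_threads  # per-thread tally, no sharing between threads

    def work(index):
        barrier.wait()  # release all threads at the same moment
        while True:
            try:
                next(it)  # the `_ =` assignment in the diff is not needed
            except StopIteration:
                break
            counts[index] += 1

    threads = [threading.Thread(target=work, args=(i,))
               for i in range(number_of_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Every item is consumed exactly once, so the totals must add up.
    return sum(counts)
```

On a thread-safe build, the threads together consume exactly `2 * items_per_iterable` items; on the free-threaded build without this PR's fix, this kind of concurrent iteration is what triggers the reported crashes.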
Why the extra assignment to _?
Some linters complain if the value is not assigned. I can remove it though; it is not needed here.
Linters tend to complain about a lot of things in the source. I'd just remove it.
The lock is a critical section. It's completely compiled away on the normal build, so there shouldn't be any overhead. If you're worried about the extra nested function call, I'm pretty sure that the compiler will inline it.
Yeah, there's a bit of a double-standard here. Basically:
It's not totally clear to me where
Co-authored-by: Peter Bierma <[email protected]>
The refactoring with
The fact that the code is getting more complex is imo unavoidable with PEP 703. The (conditional) acceptance of the PEP is a trade-off between more complexity and the benefit of FT performance. In this PR the goal is to avoid crashes in the FT build (see #124397). The test from the PR segfaults on my system on current main (the probability of the segfault depends on the
As Peter commented above, there is no performance impact to be expected from this PR. I tested this with the script below and the performance variation is within the noise. In the script I also compared the current C implementation with the pure Python implementation from the itertools documentation. The pure Python implementation is about 18% slower. Based on the 18% performance difference, I would say the main reason for
Benchmark script
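The benchmark script itself is collapsed in this view. A minimal sketch of such a comparison (loop sizes and repetition counts here are arbitrary choices, not the author's) could be:

```python
import itertools
import timeit

def chain_pure(*iterables):
    # Pure-Python equivalent of chain(), as in the itertools documentation.
    for iterable in iterables:
        yield from iterable

# Ten lists of a thousand ints each, chained together per iteration.
data = [list(range(1000)) for _ in range(10)]

def consume_c():
    for _ in itertools.chain(*data):
        pass

def consume_py():
    for _ in chain_pure(*data):
        pass

t_c = timeit.timeit(consume_c, number=200)
t_py = timeit.timeit(consume_py, number=200)
print(f"C: {t_c:.4f}s  pure Python: {t_py:.4f}s  ratio: {t_py / t_c:.2f}")
```

The observed ratio will vary by machine and build; the comment above reports the pure Python version as roughly 18% slower on the author's system.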
The itertools.chain has two attributes that are mutated during iteration, making it non-thread-safe.

Options to make it thread-safe:
i) Use a lock (simple, but adds a performance penalty for the single-threaded case)
ii) Make the iterator thread-safe using atomics.

For the second option we can stop clearing lz->source on exhaustion (this is a standard mitigation to avoid setting lz->source to zero (and doing a decref) while another thread is still using lz->source). To signal the iterator that lz->source is exhausted we can either add a new attribute, or replace the while (lz->source != NULL) with while (true). This creates a slower path when the iterator is exhausted, but typically once the iterator is exhausted performance is not relevant any longer. The lz->active is more problematic. Once lz->active is exhausted, we need to update it with a new one (e.g. PyIter_Next(lz->source)). But lz->active only has a single reference count, so updating it in one thread means having to decref the old lz->active. Possible mitigations: a) atomically load-and-incref the lz->active in chain_next; b) add a new attribute that contains all the exhausted values of lz->active and clear it once the chainobject itself is deallocated.

The second option is complex and has some performance issues (e.g. more increfs/decrefs, keeping references to exhausted iterators longer in memory), so in this PR we pick the first option.