PEP 709: inline comps in all scopes, mention tracing/profiling (#3046)
carljm committed Mar 8, 2023
1 parent e39be94 commit a44796e
Showing 1 changed file with 36 additions and 36 deletions.
72 changes: 36 additions & 36 deletions pep-0709.rst
@@ -17,12 +17,11 @@ Abstract
Comprehensions are currently compiled as nested functions, which provides
isolation of the comprehension's iteration variable, but is inefficient at
runtime. This PEP proposes to inline list, dictionary, and set comprehensions
into the function where they are defined, and provide the expected isolation by
into the code where they are defined, and provide the expected isolation by
pushing/popping clashing locals on the stack. This change makes comprehensions
much faster: up to 2x faster for a microbenchmark of a comprehension alone,
translating to an 11% speedup for one sample benchmark derived from real-world
code that makes heavy use of comprehensions in the context of doing actual
work.
code that makes heavy use of comprehensions in the context of doing actual work.


Motivation
@@ -131,28 +130,31 @@ the need to place these variables in a cell, allowing the comprehension (and all
other code in the outer function) to access them as normal fast locals instead.
This provides further performance gains.

Only comprehensions occurring inside functions, where fast-locals
(``LOAD_FAST/STORE_FAST``) are used, will be inlined. Module-level
comprehensions will continue to create and call a function.

Generator expressions are currently never inlined in the reference
implementation of this PEP. In the future, some generator expressions may be
inlined, where the returned generator object does not leak.

In more complex cases, the comprehension iteration variable may be a global or
cellvar or freevar in the outer function scope. In these cases, the compiler
also internally pushes and pops the scope information for the variable when
entering/leaving the comprehension, so that semantics are maintained. For
example, if the variable is a global outside the comprehension, ``LOAD_GLOBAL``
will still be used where it is referenced. If it is a cellvar/freevar outside
the comprehension, the ``LOAD_FAST_AND_CLEAR`` / ``STORE_FAST`` used to
save/restore it do not change (there is no ``LOAD_DEREF_AND_CLEAR``), meaning
that the entire cell (not just the value within it) is saved/restored, so the
comprehension does not write to the cell.

In effect, comprehensions introduce a sub-function scope where local variables
are fully isolated, but without the performance cost or stack frame entry of a
call.
In some cases, the comprehension iteration variable may be a global or cellvar
or freevar, rather than a simple function local, in the outer scope. In these
cases, the compiler also internally pushes and pops the scope information for
the variable when entering/leaving the comprehension, so that semantics are
maintained. For example, if the variable is a global outside the comprehension,
``LOAD_GLOBAL`` will still be used where it is referenced outside the
comprehension, but ``LOAD_FAST`` / ``STORE_FAST`` will be used within the
comprehension. If it is a cellvar/freevar outside the comprehension, the
``LOAD_FAST_AND_CLEAR`` / ``STORE_FAST`` used to save/restore it do not change
(there is no ``LOAD_DEREF_AND_CLEAR``), meaning that the entire cell (not just
the value within it) is saved/restored, so the comprehension does not write to
the outer cell.
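
The cellvar case described above can be checked from plain Python. The following is a minimal sketch, not taken from the PEP: the names ``outer`` and ``read_x`` are hypothetical. It makes ``x`` a cell variable of the outer function (because a nested function closes over it) and reuses the same name as the comprehension iteration variable. If the cell is saved and restored as described, neither the closure nor the outer function should observe the comprehension's values.

```python
def outer():
    # "x" becomes a cellvar of outer() because read_x() closes over it.
    x = "outer value"

    def read_x():
        return x

    # The comprehension reuses the name "x" as its iteration variable.
    items = [x for x in range(3)]

    # Calling the closure from inside a comprehension that iterates over "x":
    # the closure should still see the outer cell, not the iteration variable.
    seen_by_closure = [read_x() for x in range(2)]

    return items, seen_by_closure, x, read_x()


items, seen_by_closure, x_after, closure_after = outer()
print(items)            # values produced by the comprehension
print(seen_by_closure)  # what the closure saw while the comprehension ran
print(x_after)          # outer "x" after the comprehension finished
print(closure_after)    # closure's view of "x" after the comprehension
```

The same results are expected whether comprehensions are compiled as nested functions or inlined, since the point of the save/restore is that the observable semantics do not change.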

Comprehensions occurring in module or class scope are also inlined. In this
case, the comprehension will introduce usage of fast-locals (``LOAD_FAST`` /
``STORE_FAST``) for the comprehension iteration variable within the
comprehension only, in a scope where otherwise only ``LOAD_NAME`` /
``STORE_NAME`` would be used, maintaining isolation.
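
As an illustration of the isolation described above, the sketch below executes a small, hypothetical module body and class body in which the comprehension iteration variable shares its name with a module-level or class-level name. The helper names (``run_as_module``, ``has_nested_code_object``) are assumptions made for this example. The last line only reports whether the compiled module body still carries a separate code object; the answer depends on whether the interpreter in use implements this PEP, so no particular value is assumed.

```python
import types

# Hypothetical module body: "x" is both a module-level name and the
# comprehension's iteration variable.
MODULE_SOURCE = '''
x = "module-level value"
squares = [x * x for x in range(4)]
'''

# Hypothetical class body: "y" is both a class attribute and the
# comprehension's iteration variable.
CLASS_SOURCE = '''
class Example:
    y = "class-level value"
    doubled = [y * 2 for y in range(3)]
'''


def run_as_module(source):
    """Compile and execute source the way a module body is executed."""
    code = compile(source, "<example>", "exec")
    namespace = {}
    exec(code, namespace)
    return code, namespace


def has_nested_code_object(code):
    """Return True if any constant of the code object is itself a code object."""
    return any(isinstance(const, types.CodeType) for const in code.co_consts)


module_code, module_ns = run_as_module(MODULE_SOURCE)
_, class_ns = run_as_module(CLASS_SOURCE)

# The iteration variable must neither leak into nor overwrite the outer name.
print(module_ns["x"], module_ns["squares"])
print(class_ns["Example"].y, class_ns["Example"].doubled)

# Interpreter-dependent: reports whether the module body still contains a
# separate code object for the comprehension.
print("separate code object:", has_nested_code_object(module_code))
```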

In effect, comprehensions introduce a sub-scope where local variables are fully
isolated, but without the performance cost or stack frame entry of a call.

Generator expressions are currently not inlined in the reference implementation
of this PEP. In the future, some generator expressions may be inlined, where the
returned generator object does not leak.


Backwards Compatibility
@@ -239,6 +241,15 @@ comprehension and its containing function and point to a calling frame outside
the library. In such a scenario it would usually be simpler and more reliable
to raise the warning closer to the calling code and bypass fewer frames.

Tracing/profiling will no longer show a call/return for the comprehension
-------------------------------------------------------------------------

Naturally, since list/dict/set comprehensions will no longer be implemented as a
call to a nested function, tracing/profiling using ``sys.settrace`` or
``sys.setprofile`` will also no longer reflect that a call and return have
occurred.
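
The effect on profiling can be observed with a small experiment. This is a sketch under stated assumptions rather than part of the PEP: ``record_python_calls`` and ``build_squares`` are hypothetical names, and the version check assumes that inlining is present in CPython 3.12 and later. The profiler records the code-object name for every Python-level ``call`` event. With comprehensions compiled as nested functions, a ``<listcomp>`` entry appears; with inlining, only the enclosing function is recorded.

```python
import sys


def record_python_calls(func):
    """Run func() under sys.setprofile and return the names of Python calls."""
    names = []

    def profiler(frame, event, arg):
        # "call" is emitted for Python-level calls only; calls into C
        # functions are reported as "c_call" and are ignored here.
        if event == "call":
            names.append(frame.f_code.co_name)

    sys.setprofile(profiler)
    try:
        func()
    finally:
        sys.setprofile(None)
    return names


def build_squares():
    return [n * n for n in range(3)]


names = record_python_calls(build_squares)
print(names)

# Assumption: comprehension inlining is present from CPython 3.12 onward.
inlined = sys.version_info >= (3, 12)
print("comprehension call visible to profiler:", "<listcomp>" in names)
print("inlining expected on this interpreter:", inlined)
```

Tools that count or attribute time to ``<listcomp>``, ``<dictcomp>``, or ``<setcomp>`` frames would therefore see that time attributed to the enclosing function instead.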


Impact on other Python implementations
======================================

@@ -310,17 +321,6 @@ for all code. It also provides less scope for future optimizations.
This PEP takes the position that full inlining offers sufficient additional
performance to more than justify the behavior changes.

Inlining module-level comprehensions
------------------------------------

Module-level comprehensions are generally called only once (when the module is
imported), so optimizing their performance is low priority. Inlining them would
require separate code paths in the compiler to handle a module global namespace
dictionary instead of fast-locals. It would be difficult or impossible to avoid
breaking semantics, since the comprehension iteration variable itself would be
a module global which might be referenced inside other functions that in turn
could be called within the comprehension.


Copyright
=========