Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proof of concept: Add timestamps to tracebacks. #129337

Draft
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

gpshead
Copy link
Member

@gpshead gpshead commented Jan 27, 2025

The idea for this came up at work with @njsmith tossing the idea out as "dumb" yet still maybe useful... Why? to make debugging what happened in async servers with lots of exception groups and exceptions easier. Event timestamps reported on everything when emitting exception groups containing tracebacks often with their own nested causes would allow some semblance of order to be more readily understood and correlated with other parts of the distributed system.

This draft PR is a demo. If we want such a feature, we should settle on semantics in a Discuss thread and write it up as a PEP. This should be simpler than exception notes (PEP-678) was. One thought was just to store the timestamp as a note; but that'd involve string and list creation on every exception.

Demo

± PYTHON_TRACEBACK_TIMESTAMPS=iso ../b/python
Python 3.14.0a4+ (heads/traceback-timestamps-dirty:b06539429ef, Feb  1 2025, 17:54:53) [Clang 18.1.3 (1ubuntu1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> def california_raisin():
...     try:
...         raise RuntimeError("not enough sunshine")
...     except Exception:
...         raise OSError(2, "on a cloudy day")
...
>>> california_raisin()
Traceback (most recent call last):
  File "<python-input-0>", line 3, in california_raisin
    raise RuntimeError("not enough sunshine")
RuntimeError: not enough sunshine <@2025-02-01T20:43:01.026169>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<python-input-2>", line 1, in <module>
    california_raisin()
    ~~~~~~~~~~~~~~~~~^^
  File "<python-input-0>", line 5, in california_raisin
    raise OSError(2, "on a cloudy day")
FileNotFoundError: [Errno 2] on a cloudy day <@2025-02-01T20:43:01.026176>
>>>

A real world demo involving exception groups in an async application would be more interesting but isn't something I can just tap out in a repl...

Updates: [edits]

I've done performance testing. It now appears to be a no-op (no meaningful regression) as desired performance wise on Linux. For that to be true, I had to add the one special case I suspected might matter: Don't collect the timestamp on StopIteration and AsyncStopIteration as those are not infrequent error exceptions, but a normal part of application control flow. Without that, one of the async pyperformance benchmarks showed a significant performance hit of over 10%.

I've changed it to not emit the timestamps by default. If you set the PYTHON_TRACEBACK_TIMESTAMPS= environment variable to one of either us or 1, ns, or iso, timestamps will be enabled in traceback module display output. See the documentation in the PR.

TODO

  • macOS performance regression testing, on both architectures.
  • Windows performance regression testing.
  • Fix the 32-bit build test_sys exception struct size failures.
  • Tests needed:
    • I doubt the existing test suite passes if that environment variable is set?
    • Specifically test pickle and unpickle of all existing exception types in our heirarchy.
    • Test user derived class exception types from each of BaseException, Exception, OSError, ImportError, AttributeError to have the expected attributes and to survive pickle dump and load with the timestamp intact.
    • Test that the timestamp shows up when enabled on everything but StopIteration and AsyncStopIteration.
    • Can I have a PONY?

This came up at work as a suggestion to make debugging what happened in
big async servers with lots of exception groups and exceptions easier.
Timestamps when emitting exception groups containing tracebacks often
with their own nested causes would allow some semblance of order to be
understood.

This is a demo.  If we want such a feature, we should settle on
semantics in a Discuss thread and write it up as a PEP.  This should be
simpler than exception notes (PEP-678) was.  One thought was just to
store the timestamp as a note; but that'd involve string and list
creation on every exception.

Performance testing needs to be done. This is the kind of thing that is
visually distracting, so not all applications want to _see_ the
timestamps.  A knob to turn that on for those who do seems more useful
rather than making that the default.  But the performance impact of
merely collecting the timestamps is worth knowing.
Avoids a 15% regression in the pyperformance async_generators suite.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant