Description
How do you use Sentry?
Self-hosted/on-premise
Version
"2.25.1"
Steps to Reproduce
We noticed that many of our FastAPI transactions in Sentry were showing an unexpectedly long delay after the typical request-response cycle for a route. On further investigation, we realized this added latency was coming from how Sentry captures the error event in those cases. After setting up spans around Sentry's own internal event capture and serialization functions, we saw that while sentry-sdk only sends an error event to the Sentry server once, as expected, it still serializes the stack frames for subsequent captures of the same event before checking whether the event was already sent. This wastes CPU cycles and needlessly adds a significant amount to the response latency. Can we update the logic to first check whether the event needs to be sent before serializing stack frames?
The issue is even more stark in a test/prod environment running a real app, which can easily have 25-30 stack frames to serialize; there we see a latency increase of 30 ms or more from these unnecessary extra serializations.
I have a simple setup below to reproduce the issue with fastapi, though it is not limited to fastapi:
```python
# main.py
import logging

import fastapi
import sentry_sdk
from sentry_sdk import serializer
from sentry_sdk.integrations import asgi

# Wrap the SDK's own internals in spans so their cost shows up in the trace.
sentry_sdk.capture_event = sentry_sdk.trace(sentry_sdk.capture_event)
serializer.serialize = sentry_sdk.trace(serializer.serialize)
asgi._capture_exception = sentry_sdk.trace(asgi._capture_exception)

sentry_sdk.init("<your sentry dsn>", traces_sample_rate=1)

app = fastapi.FastAPI()


@app.get("/test")
def test() -> str:
    raise ValueError("Check the linked trace")


@app.exception_handler(ValueError)
def exception_handler(request: fastapi.Request, exc: Exception):
    # Logging the error here so that sentry-sdk serializes the stack frames,
    # captures the exception, sends it to Sentry, and marks the error as
    # already "handled".
    logging.exception("Log Exception")
    # Re-raise so it bubbles up to outer middleware/layers.
    raise exc
```
Run with uvicorn:

```shell
$ uvicorn main:app --port 8000
```
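To trigger the reproduction, hit the endpoint once; a small standard-library helper, assuming the app is running locally on port 8000 as above:

```python
# request_test.py -- call the /test route once; since the endpoint raises,
# a 500 response from the outer error middleware is expected.
import urllib.error
import urllib.request

try:
    urllib.request.urlopen("http://localhost:8000/test")
except urllib.error.HTTPError as exc:
    print("got expected error response:", exc.code)
```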
Expected Result
We expect that stack frame serialization happens only once and that `capture_event()` sends the exception to Sentry only once for the raised exception.
Actual Result
`capture_event()` does only send the event to Sentry once, as we expect. But stack frame serialization happens 3 separate times; the last two happen for captures where `capture_event` ultimately realizes it has already sent the exception earlier.
In this example trace, we can see that while the normal request-response cycle, including the first error capture event, would normally have taken 8-9 ms, the overall transaction ended up taking 21 ms, mostly from the 2 subsequent stack frame serializations for the same error event.
The following snapshot expands one of the three auto-grouped spans, to show that these spans are indeed coming from the serialization step for every stack frame in the exception.
The transaction is only linked to one exception, so these are not three separate error events either.
sentrivana commented on May 6, 2025
Hey @jvineet, thanks for raising this. There are a couple of things going on here.
First of all, transactions and errors are serialized separately. In the sample app you're capturing an exception as well as a transaction, so the expected number of full serialization calls would be two, not one.
For some background, the logic responsible for determining whether an exception is a duplicate and should be dropped lives in its own integration and is applied via a global event processor. Event processors run before serialization, so events that are going to be dropped don't go through the whole serialization routine. However -- and this is most likely what you're seeing -- we do pre-serialize parts of the event (namely, the stacktrace) early. This was a conscious decision in order to avoid potentially tampering with actual userland objects (see #3409). This happens before the dedupe process takes place.
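To make that ordering concrete, here is a rough sketch of the flow described above; every name in it is an illustrative placeholder, not an actual `sentry_sdk` internal:

```python
# Placeholder sketch of the capture flow described above (not real SDK code).
_SEEN_EXCEPTIONS = set()


def build_event_with_preserialized_stacktrace(exc):
    # Stand-in for event creation: in the real SDK, frame data is serialized
    # at this point, before any dedupe check runs. This is the expensive step
    # this issue is about.
    return {"exception": repr(exc), "stacktrace": "<pre-serialized frames>"}


def run_event_processors(event):
    # Stand-in for the dedupe integration's global event processor:
    # returning None drops the event.
    if event["exception"] in _SEEN_EXCEPTIONS:
        return None
    _SEEN_EXCEPTIONS.add(event["exception"])
    return event


def capture_error(exc):
    event = build_event_with_preserialized_stacktrace(exc)  # cost already paid
    event = run_event_processors(event)  # duplicates are dropped only here
    if event is None:
        return
    print("serializing and sending", event)  # only surviving events are sent
```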
There's improvement potential here and a few things we could consider.
I'll leave this open, but I don't believe there's a quick fix for this, so it might take some time. (PRs welcome though!)
jvineet commented on May 6, 2025
Hi @sentrivana, thanks for the quick reply. I am almost certain that all three serialization groups I mentioned above are coming from stack frame serialization (which makes sense, since sentry-sdk pre-serializes stack traces like you mentioned). I also see three `logging.exception` logs per request that should correspond to each of the three serialization groups. Those three span groups do not represent a transaction serialization.

I think this can also be easily confirmed by changing the FastAPI exception handler code into this:
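A minimal reconstruction of the modified handler, assuming it simply calls `logging.exception` in a loop, as a drop-in replacement for the handler in the `main.py` above:

```python
# Reconstructed modified handler (the exact shape is assumed): log the
# exception N_LOGS times so each call triggers another capture attempt and
# another round of stack frame serialization.
N_LOGS = 4  # the "range value" referred to below


@app.exception_handler(ValueError)
def exception_handler(request: fastapi.Request, exc: Exception):
    for _ in range(N_LOGS):
        logging.exception("Log Exception")
    raise exc
```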
With this change in place, the number of "serializing" span groups in the transaction (and the number of `logging.exception` logs per request) is the same as the `range` value.

Here is what it looks like for a range value of 4.

For a range value of 2.

For a range value of 10.
The other reason I know these huge serialization groups are not from transaction serialization is that we don't see such giant span groups in our production traces when no exception is linked to the transaction. In fact, in those cases we can hardly tell whether any time was spent on serializing spans (it is possible that a Sentry transaction does not capture its own serializing time). I can mimic that behavior by setting a range value of 0 in our sample case.
I think that adding a ~10 ms penalty to response latency every time the app calls `logging.exception()` is pretty brutal (I am saying ~10 ms because real stack traces from production are rarely as small as our example above), especially when we incur this serialization penalty for stack frames that we know will never be sent over. I am also assuming that any team that relies on Sentry errors and transactions as much as we do has API performance near the top of their mind, so this feels a bit counterproductive (we do really love Sentry as a tool btw, so amazing job there!). Therefore, I am really hoping we could bump the priority on this particular issue.

Another thing to consider is optionally reducing the size of the stack traces we ship to Sentry, for example by giving users an option to omit stack frames from certain libraries. In FastAPI, we often see 17-18 stack frames from starlette/fastapi before we even reach the frames from our app. While those frames might be nice to have, we personally don't look at them ~99% of the time, and would much rather shave off the ~15-20 ms that skipping their serialization would give us back, turning them on only optionally, under some rare circumstances.
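For illustration, a sketch of what dropping framework frames could look like today via `before_send` (the module prefixes are examples). Note that because `before_send` runs after the frames have already been serialized, this only trims the payload; it does not recover the serialization latency this issue is about:

```python
# Illustrative sketch only: remove framework frames from the already
# serialized event in before_send. This reduces payload size/noise but does
# NOT avoid the frame serialization cost discussed in this issue.
import sentry_sdk

NOISY_MODULE_PREFIXES = ("starlette.", "fastapi.", "uvicorn.")  # example prefixes


def drop_framework_frames(event, hint):
    for exc in event.get("exception", {}).get("values", []):
        frames = exc.get("stacktrace", {}).get("frames")
        if frames:
            exc["stacktrace"]["frames"] = [
                f for f in frames
                if not (f.get("module") or "").startswith(NOISY_MODULE_PREFIXES)
            ]
    return event


sentry_sdk.init("<your sentry dsn>", before_send=drop_framework_frames)
```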
sl0thentr0py commented on May 7, 2025
@jvineet agree largely with what you said and we need to fix this in the long term, but the way it is now is due to a non-trivial interaction between holding references to frames and other parts of the SDK.
For now, I can recommend you use `LoggingIntegration` with `event_level=None`, so we will only capture actual exceptions and not the ones from `logging.exception`. I'm sorry I can't give you a cleaner solution right now.

For @sentrivana, note that PEP 667 solves some of this on the Python side, but we're stuck with what to do for older versions. The theoretically proper solution is to build a read-only proxy/view of `frame.f_locals` that doesn't allow overwriting any of them. Then we can remove a lot of the cruft we have.
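For reference, a minimal sketch of the suggested configuration; the `event_level=None` part is from the comment above, while keeping `level=logging.INFO` so log records still become breadcrumbs is an assumed default:

```python
import logging

import sentry_sdk
from sentry_sdk.integrations.logging import LoggingIntegration

sentry_sdk.init(
    "<your sentry dsn>",
    traces_sample_rate=1,
    integrations=[
        # Keep log records as breadcrumbs, but never turn a log call
        # (e.g. logging.exception) into its own Sentry error event.
        LoggingIntegration(level=logging.INFO, event_level=None),
    ],
)
```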
jvineet commented on May 7, 2025

@sl0thentr0py We rely on Sentry quite a bit, which includes relying on `logging.exception` and `logging.error` in many places (and even if we weren't doing it, a lot of third-party libs we use surface exceptions with `logging.exception`). So disabling the logging integration would probably be bad. Do you recall if there is an easy way to reduce the size of the stack traces that are serialized (maybe as part of `before_send` or something like that), along the lines of the second point I tried to make earlier? Maybe we could look into that for the time being.

Then again, maybe `before_send` is not going to work, since all of the frames are probably already serialized by that point.

antonpirker commented on May 8, 2025
Hey @jvineet, you are right with your assumption. The serialization of the stack traces is done before the `before_send` hook, so there is no easy, clean solution unfortunately.

The only thing I can think of is this constant that limits how many stack frames are serialized: https://github.com/getsentry/sentry-python/blob/potel-base/sentry_sdk/utils.py#L81

We do not have an option to change this value, so you could only monkey patch the SDK to change it. But that is a pretty hacky "solution" to your problem...
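A sketch of what that monkey patch could look like; the constant name `MAX_STACK_FRAMES` is assumed from the linked line and may differ between SDK versions, so treat this as a hypothetical, internals-touching workaround:

```python
import sentry_sdk
import sentry_sdk.utils

# The name MAX_STACK_FRAMES is assumed from the linked utils.py line and is
# not a public API -- verify it against your installed sentry_sdk version.
# If other SDK modules import the constant by value, patching it here may not
# be enough and you would need to patch it where it is used.
sentry_sdk.utils.MAX_STACK_FRAMES = 20  # limit serialized frames (example value)

sentry_sdk.init("<your sentry dsn>", traces_sample_rate=1)
```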