observability: annotate Session+SessionPool events #1207
Kindly cc-ing you @harshachinta.
Kindly please take another look @harshachinta, feedback addressed, thank you for the code review!
google/cloud/spanner_v1/pool.py
)

if requested_session_count > 0:
    current_span.add_event(
current_span is not initialized.
Sheesh, sorry, I had a bunch of benchmarks running locally and they interrupted my nox -s unit-3.8 tests, so I didn't see those failures, but it's all fixed now.
if self._transaction_id is None and len(self._mutations) > 0:
    self.begin()
Why is this added here? This is already executed at the beginning of the commit() call, so it is not needed.
google/cloud/spanner_v1/pool.py
    )
    session = self._sessions.get(block=True, timeout=timeout)
except queue.Empty as e:
    add_span_event(current_span, "No session available", span_event_attributes)
Suggested change:
- add_span_event(current_span, "No session available", span_event_attributes)
+ add_span_event(current_span, "No session available in the pool", span_event_attributes)
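For context, the pattern under review here (blocking on the pool's queue and annotating the current span when the wait times out) can be sketched as follows. This is a minimal, standard-library-only sketch and not the PR's code: `RecordingSpan`, `get_session`, the body of `add_span_event`, and the `kind` attribute are hypothetical stand-ins. Only the `add_span_event(span, name, attributes)` call shape and the event name come from the diff above.

```python
import queue


class RecordingSpan:
    """Hypothetical stand-in for a tracing span; it only stores events."""

    def __init__(self):
        self.events = []

    def add_event(self, name, attributes=None):
        self.events.append((name, dict(attributes or {})))


def add_span_event(span, name, attributes=None):
    """Add an event only if a span is present (assumed helper behavior)."""
    if span is None:
        return
    span.add_event(name, attributes)


def get_session(sessions, current_span, timeout=0.01):
    """Take a session from the queue, annotating the span on timeout."""
    span_event_attributes = {"kind": "FixedSizePool"}  # illustrative attribute
    try:
        return sessions.get(block=True, timeout=timeout)
    except queue.Empty:
        add_span_event(
            current_span, "No session available in the pool", span_event_attributes
        )
        raise  # the caller still sees the timeout


if __name__ == "__main__":
    span = RecordingSpan()
    try:
        get_session(queue.Queue(), span)  # empty pool, so the wait times out
    except queue.Empty:
        pass
    print(span.events)
```

Passing the span in explicitly, and tolerating `None`, keeps the helper safe to call whether or not a span has been initialized, which is the failure mode raised earlier in this review.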
tests/unit/test_pool.py
"exception",
"exception",
Why are there 2 exception events?
It is a quirk of opentelemetry-python: it adds an "exception" event on every exit call whether or not you record your own. I'll file a bug later on, but for now I'll just remove our explicit invocation of "span.record_exception" and add a comment in its place.
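To illustrate the duplication described in this comment, here is a toy model using only the standard library. It is not OpenTelemetry's implementation: `ToySpan` and `run` are hypothetical, and `ToySpan.__exit__` simply mimics the behavior the comment attributes to opentelemetry-python, namely recording any in-flight exception when the span exits.

```python
class ToySpan:
    """Hypothetical span that auto-records exceptions on exit."""

    def __init__(self):
        self.events = []

    def record_exception(self, error):
        self.events.append("exception")

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc is not None:
            # Automatic recording on exit, independent of any explicit call.
            self.record_exception(exc)
        return False  # do not swallow the exception


def run(explicit):
    """Raise inside a span; optionally record the exception by hand too."""
    span = ToySpan()
    try:
        with span:
            try:
                raise ValueError("boom")
            except ValueError as error:
                if explicit:
                    span.record_exception(error)
                raise
    except ValueError:
        pass
    return span.events


if __name__ == "__main__":
    print(run(explicit=True))   # ['exception', 'exception']
    print(run(explicit=False))  # ['exception']
```

With the explicit call the span carries two "exception" events, and without it only one, which matches the pair of "exception" entries questioned in the test above.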
Thank you @harshachinta for the review, I've addressed the feedback. Please take a look again.
@harshachinta kindly help me run the bots.
Please fix lint
@@ -102,7 +103,21 @@ def trace_call(name, session, extra_attributes=None, observability_options=None)
        yield span
    except Exception as error:
        span.set_status(Status(StatusCode.ERROR, str(error)))
        span.record_exception(error)
Let's not remove this. We are not sure whether there are cases where the span would end up not recording an exception.
I would suggest adding it back here, and we can discuss it further during our demo.
Also, I was wondering why this behavior of the exception being added twice was not seen earlier, since this code has existed for a long time.
It's because OpenTelemetry was upgraded only recently.
We do have locked-in tests that check for the exception events, to ensure that they are in there and that they come from Span.enter. I had to dive back into OpenTelemetry-Python's code, as this isn't even documented, and in our demos it was very distracting to have both errors show up mysteriously. I think for the sake of our sanity and project stability let's leave that comment in; if anything happens, it is trivial to add the call back, @harshachinta.
Hmm. But the OpenTelemetry documentation for Python guides instrumentation libraries to record exceptions:
https://opentelemetry.io/docs/languages/python/instrumentation/#record-exceptions-in-spans
Can you share a code pointer to where OpenTelemetry records the exception by default when exiting?
@harshachinta it was the cause of us seeing 2 exceptions, and it took a lot of confusion and time for me to debug; they don't seem to document this condition.
@harshachinta kindly help me re-run the bots; all unit tests pass locally.
This change adds annotations for session and session pool events, to help customers debug latency issues caused by session pool misbehavior and to help maintainers figure out which session pool type is the most appropriate.
Updates #1170
BEGIN_COMMIT_OVERRIDE
feat: add additional opentelemetry span events for session pool
END_COMMIT_OVERRIDE