-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Session leaks #957
base: main
Are you sure you want to change the base?
feat: Session leaks #957
Conversation
self._database._pool.put(self._session) | ||
if self._batch._session is not None: | ||
self._database._pool.put(self._session) | ||
self._session._transaction = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Query - I think we are doing this to flush the transaction context which we newly created. Is this the best place to flush. Are there any other place where existing properties are flushed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are also deleting this in commit and rollback methods, apart from all these methods. https://github.com/googleapis/python-spanner/blob/main/google/cloud/spanner_v1/transaction.py#L181
https://github.com/googleapis/python-spanner/blob/main/google/cloud/spanner_v1/transaction.py#L246
Also In case of retry due to abort errors. https://github.com/googleapis/python-spanner/blob/main/google/cloud/spanner_v1/session.py#L388
google/cloud/spanner_v1/pool.py
Outdated
for session in self._borrowed_sessions | ||
if (datetime.datetime.utcnow() - session.checkout_time) | ||
> datetime.timedelta(seconds=longRunningTransactionThreshold_sec) | ||
and not session.long_running |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Should we re-name
session.long_running
tosession.eligible_for_long_running
? What we actually mean to say is if the session is allowed to be long running or not. If not (PDML & partitioned reads) then we clean it. While I fully understand the background behind this naming, a new reader will thinksession.long_running
means that a session is long running. And then this check looks confusing that we are cleaning it when session is "not long running" - Suggestion : It will be better to have
not session.long_running
as the first check so that we short circuit for a few times (PDML/partitioned reads)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the not session.long_running
condition in beginning.
Will be keeping property name as long_running
as it gives a more clear indication that session in use currently is for long running transaction or not.
google/cloud/spanner_v1/pool.py
Outdated
if not session.transaction_logged: | ||
self._database.logger.warning( | ||
"Transaction has been running for longer than 60 minutes and might be causing a leak. " | ||
+ "Enable closeInactiveTransactions in Session Pool Options to automatically clean such transactions. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think yes a constant message is better, and agreed with arpan. We can suggest switching to pdml or batch
) and len( | ||
self._borrowed_sessions | ||
) / self.target_size < self._used_sessions_ratio_threshold: | ||
self.stopCleaningLongRunningSessions() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should the setting of session.transaction=None happen here so that it doesn't get missed anywhere?
@@ -38,6 +38,12 @@ | |||
+ "numeric has a whole component with precision {}" | |||
) | |||
|
|||
LONG_RUNNING_TRANSACTION_ERR_MSG = "Transaction has been closed as it was running for more than 60 minutes. If transaction is expected to run long, run as batch or partitioned DML." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I thought we were removing the exact 60 minute thing here?
@@ -113,6 +113,11 @@ class Database(object): | |||
is `True` to log commit statistics. If not passed, a logger | |||
will be created when needed that will log the commit statistics | |||
to stdout. | |||
|
|||
:type close_inactive_transactions: boolean | |||
:param close_inactive_transactions: (Optional) If set to True, the database will automatically close inactive transactions that have been running for longer than 60 minutes which may cause session leaks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: same here for the 60 minutes
Adding a couple of option in Database and Instance for auto-detecting long running transactions, logging the stack-trace and removing such transactions from consuming resources.
close_inactive_transactions - Default value is False. Setting this to True will close the faulty transactions that are consuming the resources. We will also be generating the warning logs which can be later referred to know the transactions that were closed through this option.
logging_enabled - Default value is True. Setting this to False will stop generating warning logs which helps in identifying faulty code which is leading to incorrect consumption of sessions.
Batch and Partitioned Transactions are long running and will not be auto cleaned