Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Session leaks #957

Open
wants to merge 32 commits into
base: main
Choose a base branch
from

Conversation

surbhigarg92
Copy link
Contributor

@surbhigarg92 surbhigarg92 commented Jun 7, 2023

Adding a couple of option in Database and Instance for auto-detecting long running transactions, logging the stack-trace and removing such transactions from consuming resources.

close_inactive_transactions - Default value is False. Setting this to True will close the faulty transactions that are consuming the resources. We will also be generating the warning logs which can be later referred to know the transactions that were closed through this option.
logging_enabled - Default value is True. Setting this to False will stop generating warning logs which helps in identifying faulty code which is leading to incorrect consumption of sessions.

Batch and Partitioned Transactions are long running and will not be auto cleaned

@product-auto-label product-auto-label bot added size: l Pull request size is large. api: spanner Issues related to the googleapis/python-spanner API. labels Jun 7, 2023
@surbhigarg92 surbhigarg92 changed the title Session leaks feat: Session leaks Jun 15, 2023
@surbhigarg92 surbhigarg92 marked this pull request as ready for review June 15, 2023 12:38
@surbhigarg92 surbhigarg92 requested review from a team as code owners June 15, 2023 12:38
@surbhigarg92 surbhigarg92 added the do not merge Indicates a pull request not ready for merge, due to either quality or timing. label Jun 15, 2023
@surbhigarg92 surbhigarg92 added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Jun 20, 2023
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Jun 20, 2023
@product-auto-label product-auto-label bot added size: xl Pull request size is extra large. and removed size: l Pull request size is large. labels Jun 21, 2023
google/cloud/spanner_dbapi/connection.py Outdated Show resolved Hide resolved
google/cloud/spanner_v1/database.py Outdated Show resolved Hide resolved
google/cloud/spanner_v1/database.py Outdated Show resolved Hide resolved
self._database._pool.put(self._session)
if self._batch._session is not None:
self._database._pool.put(self._session)
self._session._transaction = None
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Query - I think we are doing this to flush the transaction context which we newly created. Is this the best place to flush. Are there any other place where existing properties are flushed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

google/cloud/spanner_v1/database.py Show resolved Hide resolved
google/cloud/spanner_v1/pool.py Outdated Show resolved Hide resolved
for session in self._borrowed_sessions
if (datetime.datetime.utcnow() - session.checkout_time)
> datetime.timedelta(seconds=longRunningTransactionThreshold_sec)
and not session.long_running
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Should we re-name session.long_running to session.eligible_for_long_running? What we actually mean to say is if the session is allowed to be long running or not. If not (PDML & partitioned reads) then we clean it. While I fully understand the background behind this naming, a new reader will think session.long_running means that a session is long running. And then this check looks confusing that we are cleaning it when session is "not long running"
  2. Suggestion : It will be better to have not session.long_running as the first check so that we short circuit for a few times (PDML/partitioned reads)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the not session.long_running condition in beginning.

Will be keeping property name as long_running as it gives a more clear indication that session in use currently is for long running transaction or not.

tests/system/test_session_api.py Show resolved Hide resolved
tests/unit/test_snapshot.py Outdated Show resolved Hide resolved
tests/unit/test_snapshot.py Outdated Show resolved Hide resolved
if not session.transaction_logged:
self._database.logger.warning(
"Transaction has been running for longer than 60 minutes and might be causing a leak. "
+ "Enable closeInactiveTransactions in Session Pool Options to automatically clean such transactions. "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think yes a constant message is better, and agreed with arpan. We can suggest switching to pdml or batch

) and len(
self._borrowed_sessions
) / self.target_size < self._used_sessions_ratio_threshold:
self.stopCleaningLongRunningSessions()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should the setting of session.transaction=None happen here so that it doesn't get missed anywhere?

@@ -38,6 +38,12 @@
+ "numeric has a whole component with precision {}"
)

LONG_RUNNING_TRANSACTION_ERR_MSG = "Transaction has been closed as it was running for more than 60 minutes. If transaction is expected to run long, run as batch or partitioned DML."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I thought we were removing the exact 60 minute thing here?

@@ -113,6 +113,11 @@ class Database(object):
is `True` to log commit statistics. If not passed, a logger
will be created when needed that will log the commit statistics
to stdout.

:type close_inactive_transactions: boolean
:param close_inactive_transactions: (Optional) If set to True, the database will automatically close inactive transactions that have been running for longer than 60 minutes which may cause session leaks.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: same here for the 60 minutes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: spanner Issues related to the googleapis/python-spanner API. do not merge Indicates a pull request not ready for merge, due to either quality or timing. size: xl Pull request size is extra large.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants