@RPalmr RPalmr commented Nov 19, 2025

Add close() method to BaseAlgorithm to prevent memory leaks #1966

Description

Introduces a close() method in BaseAlgorithm to explicitly clean up resources. This method closes the environment, deletes the policy and rollout buffer objects, and calls torch.cuda.empty_cache() and gc.collect().

This prevents memory usage from growing across runs, and the resulting Out-of-Memory (OOM) errors, in sequential training loops. Related to #1966.
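As a rough illustration of the cleanup described above, here is a minimal sketch. It is not the code from this PR: `SketchAlgorithm` and `DummyEnv` are hypothetical stand-ins for `BaseAlgorithm` and a real environment, the attribute names are assumed, and references are set to `None` rather than removed with `del` so that `close()` can safely be called twice. The `torch` import is optional here only so the sketch runs without PyTorch installed.

```python
import gc

try:
    import torch  # optional in this sketch; the real library always depends on it
except ImportError:
    torch = None


class DummyEnv:
    """Hypothetical stand-in for an environment; only records that close() ran."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class SketchAlgorithm:
    """Hypothetical stand-in for BaseAlgorithm (attribute names are assumptions)."""

    def __init__(self, env):
        self.env = env
        self.policy = object()              # placeholder for the policy network
        self.rollout_buffer = [0.0] * 1024  # placeholder for the rollout buffer

    def close(self):
        # Close the environment if one is still attached.
        if self.env is not None:
            self.env.close()
            self.env = None
        # Drop the references so the objects become collectable.
        self.policy = None
        self.rollout_buffer = None
        # Collect unreachable objects, then release cached CUDA memory if any.
        gc.collect()
        if torch is not None and torch.cuda.is_available():
            torch.cuda.empty_cache()


# Sequential training loop: each model is closed before the next one is built.
envs = [DummyEnv() for _ in range(3)]
for env in envs:
    model = SketchAlgorithm(env)
    # ... training would happen here ...
    model.close()
```

Running `gc.collect()` before `torch.cuda.empty_cache()` is a choice made in this sketch, on the assumption that tensors freed by the collector can then be returned from the CUDA cache; the PR description does not specify an order.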

Motivation and Context

  • I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)

Checklist

  • I've read the CONTRIBUTION guide (required)
  • I have updated the changelog accordingly (required).
  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.
  • I have opened an associated PR on the SB3-Contrib repository (if necessary)
  • I have opened an associated PR on the RL-Zoo3 repository (if necessary)
  • I have reformatted the code using make format (required)
  • I have checked the codestyle using make check-codestyle and make lint (required)
  • I have ensured make pytest and make type both pass. (required)
  • I have checked that the documentation builds using make doc (required)

Note: You can run most of the checks using make commit-checks.

Note: we are using a maximum length of 127 characters per line

Reece Palmer added 2 commits November 19, 2025 14:31
Introduces a `close()` method in BaseAlgorithm to explicitly clean up resources. This method closes the environment, deletes the policy and rollout buffer objects, and calls `torch.cuda.empty_cache()` and `gc.collect()`.

This prevents increasing memory usage and Out-of-Memory (OOM) errors in sequential training loops, related to DLR-RM#1966.
Introduces a `close()` method in BaseAlgorithm to explicitly clean up resources. This method closes the environment, deletes the policy and rollout buffer objects, and calls `torch.cuda.empty_cache()` and `gc.collect()`.

This prevents increasing memory usage and Out-of-Memory (OOM) errors in sequential training loops, related to DLR-RM#1966.
@araffin araffin added the "Maintainers on vacation" label Nov 19, 2025