Skip to content

fix: ensure PII is cleared from historical certificate records during retirement#38671

Open
ttak-apphelix wants to merge 7 commits into
openedx:masterfrom
ttak-apphelix:ttak-apphelix/BOMS-590
Open

fix: ensure PII is cleared from historical certificate records during retirement#38671
ttak-apphelix wants to merge 7 commits into
openedx:masterfrom
ttak-apphelix:ttak-apphelix/BOMS-590

Conversation

@ttak-apphelix
Copy link
Copy Markdown
Contributor

@ttak-apphelix ttak-apphelix commented May 28, 2026

Description

GeneratedCertificate uses django-simple-history, which mirrors every row change into certificates_historicalgeneratedcertificate. This table stores the learner's full name in every snapshot. The existing retirement pipeline only blanked name in the live table — all prior audit snapshots retained the real name indefinitely.

This PR closes that gap:

  • clear_pii_from_certificate_records_for_user() now also blanks name in the history table via GeneratedCertificate.history.filter(...).update(name="")
  • purge_pii_from_generatedcertificates management command extended to also backfill the history table for all previously retired users(flag-gated)
  • [config.py] — added [ENABLE_REDACT_HISTORICAL_PII_RETIREMENT] waffle flag
  • pii_retirement annotation on GeneratedCertificate corrected from retained to local_api
  • Tests updated to assert history table rows are cleared in both paths

Testing:
pytest lms/djangoapps/certificates/tests/test_api.py::CertificatesLearnerRetirementFunctionality
pytest lms/djangoapps/certificates/management/commands/tests/test_purge_pii_from_generatedcertificates.py

Jira ticket:
https://2u-internal.atlassian.net/browse/BOMS-590

@ttak-apphelix ttak-apphelix requested a review from a team as a code owner May 28, 2026 10:21
Comment thread lms/djangoapps/certificates/config.py Outdated
# retiring users' certificate records.
# .. toggle_use_cases: open_edx
# .. toggle_creation_date: 2026-05-29
ENABLE_HISTORICAL_PII_RETIREMENT = WaffleFlag(
Copy link
Copy Markdown
Contributor

@vgulati-apphelix vgulati-apphelix May 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ttak-apphelix If the flag is about retaining PII vs redacting then preferably rename this to "RETAIN_HISTORICAL_PII_RETIREMENT" OR "REDACT_HISTORICAL_PII_RETIREMENT"

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated the name with ENABLE_REDACT_HISTORICAL_PII_RETIREMENT

Copy link
Copy Markdown
Contributor

@robrap robrap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

.. pii: PII can exist in the generated certificate linked to in this model.
.. pii_types: name, username
.. pii_retirement: retained
.. pii_retirement: local_api
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this because we changed something, or just because we think this didn't match the code? When did they get out of sync? Should we discuss with Axim? Note, if different deployments might want something different, we should discuss.

cert_for_retired_user = GeneratedCertificate.objects.get(user_id=self.user_retired)
assert cert_for_retired_user.name == ""

active_history_names = list(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe assert on the expected size of active_history_names and retired_history_names? I'm not clear on what is happening, and empty lists would probably pass most assert alls?


with LogCapture() as logger:
call_command("purge_pii_from_generatedcertificates", "--dry-run")
with override_waffle_flag(ENABLE_REDACT_HISTORICAL_PII_RETIREMENT, active=True):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a personal preference for patch decorators, rather than deeply nested withs, but this is an optional change.

Base model for generated course certificates
.. pii: PII can exist in the generated certificate linked to in this model. Certificate data is currently retained.
.. pii: PII can exist in the generated certificate linked to in this model.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some way that makes sense for us to annotate the history table? Do you want to ask Axim about this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants