fix: ensure PII is cleared from historical certificate records during retirement#38671
fix: ensure PII is cleared from historical certificate records during retirement#38671ttak-apphelix wants to merge 7 commits into
Conversation
| # retiring users' certificate records. | ||
| # .. toggle_use_cases: open_edx | ||
| # .. toggle_creation_date: 2026-05-29 | ||
| ENABLE_HISTORICAL_PII_RETIREMENT = WaffleFlag( |
There was a problem hiding this comment.
@ttak-apphelix If the flag is about retaining PII vs redacting then preferably rename this to "RETAIN_HISTORICAL_PII_RETIREMENT" OR "REDACT_HISTORICAL_PII_RETIREMENT"
There was a problem hiding this comment.
updated the name with ENABLE_REDACT_HISTORICAL_PII_RETIREMENT
| .. pii: PII can exist in the generated certificate linked to in this model. | ||
| .. pii_types: name, username | ||
| .. pii_retirement: retained | ||
| .. pii_retirement: local_api |
There was a problem hiding this comment.
Is this because we changed something, or just because we think this didn't match the code? When did they get out of sync? Should we discuss with Axim? Note, if different deployments might want something different, we should discuss.
| cert_for_retired_user = GeneratedCertificate.objects.get(user_id=self.user_retired) | ||
| assert cert_for_retired_user.name == "" | ||
|
|
||
| active_history_names = list( |
There was a problem hiding this comment.
Maybe assert on the expected size of active_history_names and retired_history_names? I'm not clear on what is happening, and empty lists would probably pass most assert alls?
|
|
||
| with LogCapture() as logger: | ||
| call_command("purge_pii_from_generatedcertificates", "--dry-run") | ||
| with override_waffle_flag(ENABLE_REDACT_HISTORICAL_PII_RETIREMENT, active=True): |
There was a problem hiding this comment.
I have a personal preference for patch decorators, rather than deeply nested withs, but this is an optional change.
| Base model for generated course certificates | ||
| .. pii: PII can exist in the generated certificate linked to in this model. Certificate data is currently retained. | ||
| .. pii: PII can exist in the generated certificate linked to in this model. |
There was a problem hiding this comment.
Is there some way that makes sense for us to annotate the history table? Do you want to ask Axim about this?
Description
GeneratedCertificateusesdjango-simple-history, which mirrors every row change intocertificates_historicalgeneratedcertificate. This table stores the learner's full name in every snapshot. The existing retirement pipeline only blankednamein the live table — all prior audit snapshots retained the real name indefinitely.This PR closes that gap:
clear_pii_from_certificate_records_for_user()now also blanksnamein the history table viaGeneratedCertificate.history.filter(...).update(name="")purge_pii_from_generatedcertificatesmanagement command extended to also backfill the history table for all previously retired users(flag-gated)pii_retirementannotation onGeneratedCertificatecorrected fromretainedtolocal_apiTesting:
pytest lms/djangoapps/certificates/tests/test_api.py::CertificatesLearnerRetirementFunctionality
pytest lms/djangoapps/certificates/management/commands/tests/test_purge_pii_from_generatedcertificates.py
Jira ticket:
https://2u-internal.atlassian.net/browse/BOMS-590