Upsert Application Submissions #603

Merged
merged 3 commits into from
Jan 6, 2024

Conversation

rm03
Member

@rm03 rm03 commented Nov 28, 2023

Previously, we would create new ApplicationSubmission and ApplicationQuestionResponse objects every time a user submitted their application; this change ensures that each user will have at most one submission per application per committee.
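
To illustrate the "at most one submission per application per committee" behaviour, here is a minimal sketch that uses the standard-library `sqlite3` module as a stand-in for the ORM. The table name, the column names, and the `submit` helper are assumptions made for illustration; they are not taken from this PR, which relies on Django's `update_or_create` instead.

```python
import sqlite3

# Hypothetical schema: one row per (user, application, committee).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE submission (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        application_id INTEGER NOT NULL,
        committee_id INTEGER NOT NULL,
        answer TEXT NOT NULL,
        UNIQUE (user_id, application_id, committee_id)
    )
    """
)


def submit(conn, user_id, application_id, committee_id, answer):
    """Insert a submission, or overwrite the existing row for the same key."""
    conn.execute(
        """
        INSERT INTO submission (user_id, application_id, committee_id, answer)
        VALUES (?, ?, ?, ?)
        ON CONFLICT (user_id, application_id, committee_id)
        DO UPDATE SET answer = excluded.answer
        """,
        (user_id, application_id, committee_id, answer),
    )


# Resubmitting for the same committee updates the row instead of adding one.
submit(conn, 1, 10, 100, "first draft")
submit(conn, 1, 10, 100, "final answer")
submit(conn, 1, 10, 101, "other committee")

rows = conn.execute(
    "SELECT committee_id, answer FROM submission ORDER BY committee_id"
).fetchall()
print(rows)
```

The second call replaces the first draft rather than creating a second row, which is the property the change is after.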

Going forward, we should be able to remove raw SQL queries like this, since users won't have multiple submissions associated with an application. Doing so could break past applications, though, unless we first run a script that drops every submission other than the most recent one.

Member

@rohangpta rohangpta left a comment

Somehow I missed this! Looks great. I'm going to try to run some throughput benchmarks with this code before merging.

Curious whether moving from create to update_or_create will give us the performance benefits that we hypothesised.

@rohangpta
Member

RE: Raw SQL queries -- it's slightly tricky to figure out how exactly to migrate this logic. An approach we can consider:

  • Compute the inverse of the queryset returned under RawSQL (everything but the most recent for each app) and delete them. Do this by hand via SSH (do you have access to this?).
  • Migrate logic to omit the RawSQL cruft.
  • For sanity, add a unique_together check on "application", "committee" and "user" for the ApplicationSubmission model (and maybe something of the sort to ApplicationQuestionResponse).
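
The steps above could be sketched roughly as follows, again with `sqlite3` standing in for the production database. The schema, the `created_at` column, and the index name are hypothetical, and the sketch only covers submissions; any responses pointing at the deleted rows would need the same treatment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE submission (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        application_id INTEGER NOT NULL,
        committee_id INTEGER NOT NULL,
        created_at TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO submission (user_id, application_id, committee_id, created_at)"
    " VALUES (?, ?, ?, ?)",
    [
        (1, 10, 100, "2023-09-01"),
        (1, 10, 100, "2023-09-03"),
        (1, 10, 101, "2023-09-02"),
        (2, 10, 100, "2023-09-01"),
        (2, 10, 100, "2023-09-02"),
        (2, 10, 100, "2023-09-04"),
    ],
)

# Step 1: everything but the most recent row per (user, application, committee).
# A row is stale if a newer row exists for the same key; ties break on id.
stale_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT s.id FROM submission AS s
        WHERE EXISTS (
            SELECT 1 FROM submission AS newer
            WHERE newer.user_id = s.user_id
              AND newer.application_id = s.application_id
              AND newer.committee_id = s.committee_id
              AND (newer.created_at > s.created_at
                   OR (newer.created_at = s.created_at AND newer.id > s.id))
        )
        ORDER BY s.id
        """
    )
]
conn.executemany("DELETE FROM submission WHERE id = ?", [(i,) for i in stale_ids])

# Step 3: enforce uniqueness so duplicates cannot come back.
conn.execute(
    "CREATE UNIQUE INDEX one_submission_per_key"
    " ON submission (user_id, application_id, committee_id)"
)

remaining_ids = [r[0] for r in conn.execute("SELECT id FROM submission ORDER BY id")]

try:
    conn.execute(
        "INSERT INTO submission (user_id, application_id, committee_id, created_at)"
        " VALUES (1, 10, 100, '2023-09-05')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(stale_ids, remaining_ids, duplicate_rejected)
```

Collecting `stale_ids` before deleting also gives the count of redundant rows for free, which helps when gathering stats.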

If we're doing this, let's also get stats on how much extra data we were storing. I'd imagine we package all these changes + other optimisations into a nice blog post by end of this month.

@rm03
Member Author

rm03 commented Jan 6, 2024

As discussed:

  • We had a total of 40,457 ApplicationSubmission objects stored in the database, and 24,112 of them were not the most recent submission.
  • Additionally, there were 165,269 ApplicationQuestionResponse objects associated with these submissions (out of a total of 274,445), which means that we've been storing quite a bit of extra data.
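
For a sense of scale, the figures above work out as follows. This is plain arithmetic on the counts quoted in this comment; the variable names are only for illustration.

```python
# Counts quoted in the comment above.
total_submissions = 40_457
stale_submissions = 24_112
total_responses = 274_445
stale_responses = 165_269

# Rows left after dropping everything but the most recent submission.
kept_submissions = total_submissions - stale_submissions
kept_responses = total_responses - stale_responses

# Share of each table that was redundant.
stale_submission_share = stale_submissions / total_submissions
stale_response_share = stale_responses / total_responses

print(f"submissions kept: {kept_submissions} ({stale_submission_share:.1%} removed)")
print(f"responses kept:   {kept_responses} ({stale_response_share:.1%} removed)")
```

Roughly 60% of both tables consisted of superseded rows.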

I've removed these objects from the database so the last commit shouldn't be problematic.

@rm03 rm03 merged commit 6c26b15 into master Jan 6, 2024
7 checks passed
@rm03 rm03 deleted the upsert-app-submissions branch January 6, 2024 07:27
2 participants