recalculate statistics use sql #696
Open
arily wants to merge 4 commits into osuAkatsuki:master from ppy-sb:akat-master-0428-update-stat-sql
While I agree this is objectively better from a performance and atomicity perspective, I feel that most developers will have more trouble maintaining it. It's relatively complex SQL, and I suspect procedural Python is a more common skillset among the devs who work on b.py (and maintainability is the #1 goal for b.py).
We could have a scaled-down version (which only calculates pp and pp acc), or split the query into smaller ones.
If this is acceptable, we could strike a balance between procedural code and performance.
Our server ran into performance and memory issues recently. With this update, score submission time went from ~5s for a heavy user to ~600ms for a single mode, and improved by a few orders of magnitude when users play multiplayer (don't forget that we have a global submission thread lock). Not to mention that we also saved a bunch of IO and memory.
This also comes with added benefits. For example, when a map changes state, rather than setting up complex triggers and handlers and taking care of every potential edge case, the statistics will be corrected the next time the user submits a score.
my feedback is more that i don't think many devs are familiar with sql, and this uses more advanced features like window functions, CTEs, etc. -- i suspect having it as pure-python is more familiar for most
i don't disagree w/ this, but it can be implemented in pure py as well
i think the most important issue in recalc to fix would be recalculation of score statuses when scores of a user shift around
imo it's too cluttered for something which can be implemented in like 20-50 lines of python, and yeah sql can be confusing at times
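To make the "20-50 lines of python" estimate concrete, here is a minimal sketch of a pure-Python recalculation of total pp and weighted accuracy. It is not the code from this PR or from bancho.py. The `recalc_stats` name, the `(pp, acc)` tuple shape, the 0.95 decay factor and the top-100 cutoff are all assumptions for illustration, and bonus pp is left out.

```python
from typing import Sequence, Tuple


def recalc_stats(
    scores: Sequence[Tuple[float, float]],
    top_n: int = 100,    # assumed cutoff, not taken from the PR
    decay: float = 0.95, # assumed osu!-style weighting factor
) -> Tuple[float, float]:
    """Return (total_pp, weighted_acc) for one user in one mode.

    `scores` is a hypothetical list of (pp, acc) pairs, one per map,
    already filtered to the user's best score on each ranked map.
    """
    # Highest-pp scores carry the largest weights.
    best = sorted(scores, key=lambda s: s[0], reverse=True)[:top_n]
    if not best:
        return 0.0, 0.0

    weights = [decay ** i for i in range(len(best))]

    total_pp = sum(pp * w for (pp, _), w in zip(best, weights))
    # Normalise by the weight sum so the result stays within 0..100.
    weighted_acc = sum(acc * w for (_, acc), w in zip(best, weights)) / sum(weights)
    return total_pp, weighted_acc


if __name__ == "__main__":
    print(recalc_stats([(100.0, 90.0), (200.0, 100.0)]))
```

The trade-off discussed in this thread is visible here: the loop is easy to read, but every score row has to be loaded into Python memory first, which is the cost the SQL version avoids.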
it's a good point, let me consider it & review more deeply soon
pagination is a good idea, as time-to-execute is less of a problem than resource utilization at large scale imo (from akatsuki experience) -- i don't entirely remember the technical considerations of this process (e.g. row/table locking w.r.t. transactions, requirements of upfront processing, etc.), so i'll have to refresh my memory
A PR for that has already existed for quite some time: #573
I believe the simplicity of the current pure-Python approach easily outweighs any performance gains from this. As someone who ran bancho.py at its limits, score submission was rather the least of the issues.
There are several technical difficulties with this approach, if I understand the OFFSET correctly.
There is, however, a better way imo: chunk by user id, since users don't interfere with each other.
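A rough sketch of what chunking by user id could look like, assuming the recalculation can be run per batch of users. The `chunk_user_ids` helper and the batch size are hypothetical and not part of this PR or of #573.

```python
from typing import Iterable, Iterator, List


def chunk_user_ids(user_ids: Iterable[int], size: int) -> Iterator[List[int]]:
    """Yield successive batches of at most `size` user ids.

    Each batch can be recalculated independently because one user's
    statistics do not depend on another user's scores.
    """
    if size <= 0:
        raise ValueError("size must be a positive integer")

    batch: List[int] = []
    for user_id in user_ids:
        batch.append(user_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


if __name__ == "__main__":
    for ids in chunk_user_ids(range(1, 8), 3):
        # A real job would recalculate stats for `ids` here (hypothetical step).
        print(ids)
```

Unlike OFFSET-based paging over score rows, each batch here is a closed set of users, so rows shifting between pages while the job runs should not cause a user to be skipped or processed twice.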