Improve species rematching process #288
We have noticed that the process slows down when it handles a large number of species. For the list dr18679, which has 166,350 records, the time to complete a batch of 500 records starts at ~2 seconds and rises to ~14 seconds (~7 seconds of which is spent reading those 500 records) by the time 147,500 of the 166,350 have been processed. CPU hits > 85% and memory sits at 6.2G/7.6G; CPU drops back to 2% about 15 seconds after the process completes. We believe the offset used by Hibernate pagination is the likely cause. The following query was used in the rematching process:
```groovy
def c = SpeciesListItem.createCriteria()
```
I built a test case to check whether this assumption is correct. I used two different queries: the first reads from the first available species id with offsets of 50,000, 50,500 and 100,000; if our assumption were correct, the reading speed should clearly drop at the larger offsets. I also recorded the ids of the first available species record and of records No. 50,000 and No. 100,000, and used them in the second query. The results show no obvious difference in reading speed between the two approaches; each read finished within 1 second. Here is the result:
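The offset-vs-id comparison above can be sketched with a small, self-contained simulation. This is plain sqlite3 in Python, not the actual GORM criteria query, and the table/column names are invented for illustration; the point is that keyset pagination (`WHERE id > ?`) returns the same page as `LIMIT/OFFSET` without the database having to skip over all the preceding rows.

```python
import sqlite3

# Hypothetical stand-in for the species list item table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO item (name) VALUES (?)",
                 [(f"species-{i}",) for i in range(10_000)])

def page_by_offset(offset, size=500):
    # OFFSET forces the engine to walk and discard `offset` rows first.
    return conn.execute(
        "SELECT id, name FROM item ORDER BY id LIMIT ? OFFSET ?",
        (size, offset)).fetchall()

def page_by_keyset(last_id, size=500):
    # Keyset pagination seeks directly to the first id after `last_id`.
    return conn.execute(
        "SELECT id, name FROM item WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, size)).fetchall()

# Both strategies return identical pages.
offset_page = page_by_offset(5_000)
keyset_page = page_by_keyset(offset_page[0][0] - 1)  # id just before the page
assert offset_page == keyset_page
```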
A few attempts that DID NOT work: we batch-process 500 species (at the moment) in a single transaction, so when we find a newly matched species and save it to the DB, the record is not actually pushed to the DB until the transaction completes.
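The transaction-visibility point above can be reproduced outside Grails with a minimal two-connection sketch (plain sqlite3 in Python, not the actual Hibernate session; all names are invented): rows saved inside an open transaction are not visible to any other connection until that transaction commits.

```python
import sqlite3, tempfile, os

# Two connections to the same database file stand in for the rematch job
# and an outside observer; the path is a throwaway temp file.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path)
reader = sqlite3.connect(path)
writer.execute("CREATE TABLE matched (id INTEGER PRIMARY KEY, guid TEXT)")
writer.commit()

# Saves made inside the batch transaction...
writer.executemany("INSERT INTO matched (guid) VALUES (?)",
                   [(f"guid-{i}",) for i in range(500)])

# ...are invisible to other connections until the transaction commits.
before = reader.execute("SELECT COUNT(*) FROM matched").fetchone()[0]
writer.commit()
after = reader.execute("SELECT COUNT(*) FROM matched").fetchone()[0]
assert (before, after) == (0, 500)
```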
Performance records: for each 500-species rematching batch, the elapsed time gradually increased from ~2.6s to ~8.5s. The system then started rematching the 151,219 species in list dr18685 after the previous list completed. We believed that querying a large dataset with OFFSET slows the query down and that using ScrollableResults would be faster; however, when we used ScrollableResults in the reading code, it ended up ~1s slower than OFFSET.
**To improve the reading speed of ScrollableResults, we need to keep the Session open, and the ScrollableResults must remain open after, for example, 500 records have been loaded and are being processed. The benefit is that the reading speed stays stable at ~0.8s; unlike the OFFSET solution, the time does not increase as more records are read. To do this, we need to create a new Session with a long timeout, ~4 hours for the largest list; otherwise the ScrollableResults may "expire" if the cursor is left open too long without being used.**
If we use it, the reading process does not improve much: the reading time starts at ~0.8s and increases slightly to ~1.2s (without the CPU burden caused by rematching).
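As an analogy for the long-lived Session + ScrollableResults approach, here is an illustrative Python/sqlite3 sketch (not the Grails code): a single cursor is opened once and drained in batches of 500, so each batch resumes from the cursor position instead of re-issuing the query with a growing OFFSET.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO item (name) VALUES (?)",
                 [(f"species-{i}",) for i in range(2_000)])

# One cursor opened once and kept alive across batches, analogous to a
# long-lived Hibernate Session + ScrollableResults: each batch continues
# from where the previous one stopped.
cursor = conn.execute("SELECT id, name FROM item ORDER BY id")
batches = []
while True:
    batch = cursor.fetchmany(500)   # process 500 records at a time
    if not batch:
        break
    batches.append(batch)

assert [len(b) for b in batches] == [500, 500, 500, 500]
```

The trade-off noted above applies here too: the cursor (and the session owning it) must stay open for the whole run, which is why a long timeout is needed for the largest lists.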
Using a new session with a long-timeout connection, the reading/writing speeds while processing a large list are very stable: reading/writing starts at ~0.025s/0.838s and ends at ~0.024s/0.98s. Total time to complete list dr18679 [166,350 records]: 18 minutes, 26.660 seconds.
Full rematch testing:
Rematching on lists-test was completed. There is still a small issue in rematchLog: https://lists-test.ala.org.au/ws/rematchLog/1. [Deleted]
This is probably also why the status of the rematching does not change and remains at RUNNING.
- Using a long-lived session to process a large list
- Rematching based on lists, not individual species
Another round of rematching was tested; it completed in 2 hours: https://lists-test.ala.org.au/ws/rematchLog/2. @hamzajaved-csiro and I figured out a workaround for the issue where the last rematching log was not written to the DB. However, we did not understand the exact reason why the last log was not saved.
Including updates from:
- 5.2.1 snapshot updates included in 5.2.0 snapshot #292
- Grails 6 upgrade #295
- Support the latest AuthService Plugin #299
- Fixed the broken list upload from Spatial #302
- Fixed download list csv issue #306
- Fixed 'family(matched)' facade
- Improved rematching process performance #288
Once the namematching service is reindexed, we need a way to rematch all existing lists.
Add a function to trace the progress, so we can resume an aborted run if necessary.
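Resumability can be sketched by persisting the last processed id as a checkpoint and combining it with the id-based paging discussed earlier. This is an illustrative Python/sqlite3 sketch with invented table and column names, not the actual SpeciesList schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE progress (job TEXT PRIMARY KEY, last_id INTEGER);
""")
conn.executemany("INSERT INTO item (name) VALUES (?)",
                 [(f"species-{i}",) for i in range(1_200)])

def rematch_batch(job, size=500):
    """Process one batch starting after the checkpointed id; return rows done."""
    row = conn.execute("SELECT last_id FROM progress WHERE job = ?",
                       (job,)).fetchone()
    last_id = row[0] if row else 0
    batch = conn.execute(
        "SELECT id FROM item WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, size)).fetchall()
    if batch:
        # Commit the checkpoint with the batch, so a restart resumes here.
        conn.execute("INSERT OR REPLACE INTO progress (job, last_id) VALUES (?, ?)",
                     (job, batch[-1][0]))
        conn.commit()
    return len(batch)

done = 0
while (n := rematch_batch("dr-demo")) > 0:
    done += n
assert done == 1200
```

Because the checkpoint lives in the database and is committed with each batch, an aborted run simply picks up after the last committed id on restart.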
Memo:
Rematching after > 100,000 species starts to significantly affect CPU load.