Inconsistent scheduling & speed of compaction tasks on EC2 #3329
Just read through all the involved code and the very detailed (thank you!) description above. There are a few things to unpack here. The three classes we need to care about are:
Short and not entirely helpful answer to this issue
More detailed answer
The scaler was designed (based on several earlier iterations and previous experience) around the idea that it's cheaper, both in terms of code maintenance and $ cost, to do the simple action and fail sometimes rather than strive to do the optimal action all of the time and avoid failure at all costs. It doesn't know how many containers per instance it can fit until some instances have been created. When there aren't any instances, it therefore makes the safe assumption of 1 container per instance: See here. The EC2Scaler algorithm is:
* If the scaler tries to scale down then SafeTerminationLambda prevents terminating instances with running containers.
Yep, at the beginning when it is waiting for the EC2 AutoScaler to commission the extra instances this will happen.
See here. The "few failures" are it waiting for the AutoScaler to finish creating/commissioning the EC2s it previously asked for. Once this happens, we can interrogate an instance and get a proper idea of how many containers will fit per instance. See here. That's why it "changes its mind": it suddenly has proper information to work with.
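To make that concrete, here's a minimal sketch of the estimation behaviour as described above. The class, record and method names are illustrative only, not the actual Sleeper code:

```java
import java.util.List;

// Illustrative sketch only, not the actual Sleeper code.
public class ContainersPerInstanceEstimate {

    /** Hypothetical view of a commissioned EC2, holding the resources ECS reports for it. */
    public record RunningInstance(int availableCpuUnits, long availableMemoryMiB) {
    }

    public static int estimate(List<RunningInstance> runningInstances, int taskCpuUnits, long taskMemoryMiB) {
        if (runningInstances.isEmpty()) {
            // No instance to interrogate yet, so make the safe assumption of 1 container per instance.
            return 1;
        }
        // Once the AutoScaler has commissioned instances we can read real capacity from one of
        // them, which is why later invocations suddenly arrive at a larger figure.
        RunningInstance instance = runningInstances.get(0);
        long byCpu = instance.availableCpuUnits() / taskCpuUnits;
        // A ~5% hedge on the task's memory requirement keeps ECS from refusing to place containers.
        long byMemory = instance.availableMemoryMiB() / Math.round(taskMemoryMiB * 1.05);
        return (int) Math.max(1, Math.min(byCpu, byMemory));
    }
}
```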
It aims for high resource utilisation on the instances to minimise the number we need. We already add a 5% "hedge" on the memory usage (Explanation here) to prevent ECS complaining it can't fit containers in. This does appear to be the cause of the issue: it should try to pack fewer containers on to an EC2.
Analysis
I definitely wouldn't do this. As you say, it will run again in a minute anyhow. But there's more to it than this. Options:
I considered making it try to "look up" the CPU and memory based upon the instance type from AWS. Not sure why I decided against this now. Just looking it up from running instances seemed better at the time. This might be a viable alternative if having an inconsistent containers-per-instance figure is seen as bad enough to fix, but I'm not convinced it is.
It could be worth allowing a configurable "default" number of containers per instance to be set in the instance properties. The risk of this is that you have two dependent properties: the EC2 instance type (COMPACTION_EC2_TYPE) and this new "containers per instance". Just changed one? Hope you didn't forget to change the other! I can almost see the bug report: "I changed the compaction instance type to a [super high spec machine] and Sleeper is STILL only scheduling 2 tasks per instance!" How about a "minimum containers per instance" to replace step 1 in the algorithm above? If it can't read the actual number, it assumes this minimum rather than 1. Then you avoid the risk above.
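A minimal sketch of how such a minimum could slot into step 1 of the algorithm. The property name and plumbing are invented for the example:

```java
import java.util.Properties;

// Hypothetical sketch only: the property name below is not a real Sleeper instance property.
public class MinimumContainersFallback {

    public static final String MINIMUM_CONTAINERS_PER_INSTANCE =
            "sleeper.compaction.ec2.minimum.containers.per.instance";

    /** Returns the containers-per-instance figure to assume before any instance can be interrogated. */
    public static int fallbackContainersPerInstance(Properties instanceProperties) {
        // If the operator hasn't set a minimum, behave exactly as today and assume 1.
        String configured = instanceProperties.getProperty(MINIMUM_CONTAINERS_PER_INSTANCE, "1");
        return Math.max(1, Integer.parseInt(configured));
    }
}
```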
This points to needing to relax how the scaler computes the number of containers per instance. Rather than trying to cram on as many as we have the CPU/RAM for, should we add in some more headroom?
Recommendation
If we make the estimated containers per EC2 configurable, that would add an extra configuration property someone would need to find, but it would also be misleading since it wouldn't actually determine how many containers AWS will assign to each instance. We have two separate problems:
These are related, but I think we need to solve both. A few things to note:
If we use the AWS EC2 describe-instance-types API that you mentioned to find out, at runtime, the CPU/memory available on our chosen instance type, then we get a deterministic value for the number of containers per instance (see the sketch below). This should stop it requesting more instances than needed and also eliminate bullet point 1. I think this should be done at runtime to account for when the "instance type" is changed post-deployment. To prevent over-provisioning, I think increasing the memory requirements for each task is the easiest way to achieve this. CPU is too coarse for this. E.g. for a task that gets 1 vCPU and 4 GiB of memory, if we increase the requirements to 2 vCPU, then we've halved the number of tasks that can be provisioned on a single instance, which is probably too conservative. Whereas if we increase the memory to 5 or 6 GiB, we can get finer control to prevent over-provisioning. This looks like 2 PRs:
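A rough sketch of what that runtime lookup could look like with the AWS SDK for Java v2. The wrapper method and the per-task figures are illustrative; only the describe-instance-types call itself is the real API mentioned above:

```java
import software.amazon.awssdk.services.ec2.Ec2Client;
import software.amazon.awssdk.services.ec2.model.DescribeInstanceTypesRequest;
import software.amazon.awssdk.services.ec2.model.InstanceTypeInfo;

public class DeterministicContainersPerInstance {

    /**
     * Looks up the CPU/memory of the configured instance type at runtime, so every lambda
     * invocation arrives at the same containers-per-instance figure, even before any
     * instance has been commissioned.
     */
    public static int containersPerInstance(Ec2Client ec2, String instanceType, int taskVCpus, long taskMemoryMiB) {
        InstanceTypeInfo info = ec2.describeInstanceTypes(DescribeInstanceTypesRequest.builder()
                        .instanceTypesWithStrings(instanceType)
                        .build())
                .instanceTypes().get(0);
        long byCpu = info.vCpuInfo().defaultVCpus() / taskVCpus;
        long byMemory = info.memoryInfo().sizeInMiB() / taskMemoryMiB;
        return (int) Math.max(1, Math.min(byCpu, byMemory));
    }

    public static void main(String[] args) {
        try (Ec2Client ec2 = Ec2Client.create()) { // default region/credentials
            // t3.xlarge has 4 vCPUs and 16 GiB: 1 vCPU / 4 GiB per task packs 4 tasks per instance.
            System.out.println(containersPerInstance(ec2, "t3.xlarge", 1, 4 * 1024));
            // Raising the task memory requirement to 5 GiB drops that to 3, finer-grained headroom
            // than bumping CPU to 2 vCPUs, which would halve it to 2.
            System.out.println(containersPerInstance(ec2, "t3.xlarge", 1, 5 * 1024));
        }
    }
}
```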
Description
In the system test CompactionPerformanceST, we've noticed that sometimes compaction tasks are scheduled quite inconsistently between the EC2s that are created, and when they're scheduled to the same one they run much slower than expected. More EC2s seem to be created than we expect, and some EC2s are set up to run just a single compaction task.
The test repeatedly runs the compaction task creator until it finds 10 tasks are up, then creates 10 compaction jobs to run on them.
It looks like the compaction task creator lambda fails halfway through a number of times, and it runs about 9 times overall. It starts off thinking it needs 1 task per instance, then changes its mind after a few failures trying to actually create the tasks, and for the rest of the lambda invocations it decides it should have 4 containers per instance instead.
In one example of the test, we ended up with 6 compaction jobs each on their own dedicated EC2, and 4 compaction jobs all on the same EC2. The 4 that ran on the same instance ran at about 180,000 records per second, whereas the other 6 ran at about 300,000 records per second. The EC2 that had 4 compaction jobs ran at over 99.8% CPU utilization the whole time, whereas the other 6 EC2s ran at just over 25%.
Steps to reproduce
Expected behaviour
The compaction task starter could wait for the autoscaler to apply the desired number of instances before it starts trying to run tasks. It's not clear how worthwhile this is though, as it will run again in a minute anyway.
The compaction task starter should choose the same number of containers per instance given the same settings. This can be calculated from the instance details as it is now, but it should be consistent. It might also be worth configuring the number of containers per instance directly in an instance property.
We need to decide how much of a problem it is that we see this level of performance degradation at full utilization. If 3 tasks on one EC2 maintain the 300,000 records per second, that would be 900,000/s for the whole EC2, compared to the 720,000/s we saw for 4 tasks on one EC2. If we make this more configurable this may be less of a problem. It might be worth noting in the documentation or the property descriptions that high utilization can result in this degradation, at least if we can confirm that is the cause.
Since the CPU is the bottleneck, we could also consider moving to CPU-optimised EC2 instances. At time of writing it's running on t3.xlarge.