-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(v2): Use AWS CRT as the default SDK client #1541
Conversation
Thanks @jreijn, looks great! Thanks to the new sync API, it's not so much change, that's a good news. Question/ask: could you do a load test on the parameter sample (maybe the easiest to do) before and after CRT? Feel free to say no and I'll do it later. |
Build is failing, problem in |
@jeromevdl thanks! Yes the change is quite simple now. I'll look into the issues. I've just setup a dev linux box to see if I can run the tests locally as it seems to fail on my MacBook book due to some docker issue. |
It's the test in the CDK example, which is autogenerated and doesn't really test anything. I'd like to remove it in #1542 |
@jeromevdl Do we already have something in place for doing the load test? Or should I just run something on my own? |
Quality Gate passedKudos, no new issues were introduced! 0 New issues |
@jeromevdl I've ran a load test with artillery on the parameters module and the results are mentioned below. I've used the parameters example with 1 slight improvement (tiered compilation) against the current v1 (main) branch with the connection client and this PR branch with the aws crt client. It's using 512MB. I've created two reports:
I used CW Logs insights with the following 2 queries:
DurationFor the main branch (v1) while using the url-connection-client I get the following results:
For the v2 branch while using the aws-crt-client I get the following results:
Memoryv1 with url-connection-client
v2 with aws-crt-client
I expected different results and I noticed that the example was not functioning due to some changes in the v2 branch which I fixed in a commit. There are still some messages during a coldstart for the LambdaJsonLayout of the log4j configuration and a mention about a missing version, which I will try to look at somewhere over the next couple of days. I also see that the parameters example is using an amazon API call to https://checkip.amazonaws.com. Not sure if that impacts the results, but will try to analyse further. As the crt sync client should also work with the v1 branch (by providing a client) I'll also try and test that to see if changes the v2 impact the results. |
Thanks for the tests. I didn't expect such results... It's disappointing. @msailes @scottgerring any thoughts? Do you think the asynchronous version would be better? |
I don't think this is expected, one to ask Zoe Wang, when she's next working. |
I've also tested with the sync crt client in the v1 branch and I see similar behavior. It makes me wonder about the impact compared to the url connection client. In the announcement post on the aws blog for the crt client we see the comparison to the netty client which is a more heavy weight client. Looking forward to some insights from the team that worked on the aws crt client. |
@jreijn thanks for the extensive testing! |
@scottgerring great suggestion for code guru. I'll see if I can find some over the next week to hook it up and find out what I can see about it. |
It seems like this PR will also resolve #1544 ! |
I've used this as an opportunity to have a play with CodeGuru. V2 current branch - No CRT Client - Java 11 Runtime CRT version @jreijn branch - Java 11 Runtime I don't think the above ⏫ is that useful - 1/ you can see in the CRT chart, the frames are all to do with fetching the checkmyip website, and 2/ the CRT usage is performed once, in the class constructor, not per-request. I am putting a branch together to make the runtime depend more on AWS SDK usage. |
Yes. I wasn't sure if the issue was known so solved it in this PR for now. |
Thanks @scottgerring . I'll keep an eye on the issue. I'm still enjoying some time off so if something changes I'll chime in again. |
@scottgerring i just noticed I missed this comment. With regards to: 1 I fully agree. The remote call makes the test less precise due to the latency from the remote call. 2 sounds like a plan. If you want me to incorporate the changes and try again let me know! |
Hi @jreijn, I'm on the AWS SDK for Java team 🙂. We performed tests to compare HTTP clients coldstart latency on Lambda and from our tests, switching to AWS CRT sync HTTP client from UrlConnectionHttpClient reduced 35% startup latency. AWS CRT sync has slight higher startup latency (6%) compared to AWS CRT async HTTP client and we'll look into that. The Lambda memory setting was 512MB and we were just sending one single ListBuckets request with S3 SDK. LMK if you have any questions and I'm interested to see the new test result! |
Thanks @zoewangg ! I'll try to make some time today or tomorrow to take it for another test run and see if we can see similar results. Was there a specific java version you used in your tests? |
I was using Java 8. |
Hey @jreijn sorry for the slow reply - took some time off over the break with the family! If you have the time and the inclination that would of course be great! I'm aware this one is getting a bit more fiddly than initially expected - @jeromevdl and I can also jump in put some time into it if you'd like |
@scottgerring i hope you had some wonderful time with the family. I don't mind spending some more time, but it's good to share experiences. I did a test 2 days ago where I incorporated most of your changes (from the branch) and ran a new test. The results did not differ that much and I'm planning to do another run tomorrow before I post the results here just to check for any mistakes I might have made. I'll send an update tomorrow! |
Hey @jreijn sounds good (although a bit surprised about the results). Looking forward to seeing the numbers. |
…eliable during testing.
I did a retest today to see the difference. I've incorporated some your changes @scottgerring into this PR for clarity. I'm a bit surprised by the memory results. Duration v2-with-crt-client (AWS SDK 2.22.0)
Memory usage (AWS SDK 2.22.0)
Duration (AWS SDK 2.23.9)
Memory (AWS SDK 2.23.9)
I'll try to create a heap dump of the JVM to analyse the what's actually in the heap and post an update later. Update I've analyzed a heap dump of about (28 MB). The amount of live objects is very small. Not sure how this relates to the 169 MB reported by Lambda itself. @scottgerring @jeromevdl any ideas? |
Hey @jreijn great - thanks for the dive deep!
So .... we can see the remote call latency is down, but the overall picture hasn't really changed - the CRT is still slower both with and without cold starts. Memory usage seems similar. We'll reach out to the team again on our side, it still feels like we are missing something! Would you be able to give us the exact lambda configuration you are using - region, runtime, size?
I believe the lambda-reported usage will be the "OS view" and include the java runtime as well, not just the JVM-allocated heap. I would guess this accounts for the discrepancy. |
Quality Gate passedKudos, no new issues were introduced! 0 New issues |
I was using the sam template for deployment. region: eu-west-1 |
Hey @jreijn , Once again apologies for the high turnaround here. @jeromevdl , @zoewangg and I had an impromptu chat about this to see if there was anything obvious to point at, and came up empty-handed. I'd like to take some time to sit with this myself and dive into it - i've set time aside tomorrow for this. I will loop back with findings and hopefully some reproducible unit of testing we can use to measure these and other perf-impacting decisions for PT. |
@jreijn , @jeromevdl : I think i've gotten to the bottom of it! It's the size of the output JAR; using the CRT client by default includes a pile of Detailed numbers here, testing bits here (i'll clean this up a bit so others can use it) Stripping back (removing all the shared objects for unneeded platforms) the JAR to the minimum reduces cold start time below the URLHttpClient:
CRT-normal and URL client are very close to the same. This is close to what what you measured, @jreijn - yours are a little higher, mine are a little lower. I note that in the maven repo, there are jar variants for all the different combos of arch/runtime. Turns out - you can use the "classifier" to get the right one - details here. I haven't played with this yet, but it seems like there will be an onus on us to document this clearly in Powertools as part of the "getting started" guide - otherwise simply including the CRT as the default means that |
@scottgerring thanks for your great tests. Looking at the jar sizes in maven, it's clear that the generic crt is quite big compared to the others 17 Mb vs ~2 Mb for the specific versions. Did you test one of these specific versions ? If I'm not mistaken, we need either: <dependency>
<groupId>software.amazon.awssdk.crt</groupId>
<artifactId>aws-crt</artifactId>
<version>0.29.10</version>
<classifier>linux-aarch_64</classifier> <!-- arm64 -->
</dependency> Or: <dependency>
<groupId>software.amazon.awssdk.crt</groupId>
<artifactId>aws-crt</artifactId>
<version>0.29.10</version>
<classifier>linux-x86_64</classifier> <!-- x86_64 -->
</dependency> Problem is that we can't set this in Powertools, so we need to document it... In PT, we can use <dependency>
<groupId>software.amazon.awssdk.crt</groupId>
<artifactId>aws-crt</artifactId>
<version>0.29.10</version>
<scope>provided</scope>
</dependency> I don't really like the developer experience, but if we can get better performance, that's probably worth it. |
not directly; i recreated a specific variant by hand by stripping the "full" JAR, before I realised you can use the classifier. We need to test that and make sure it works. I think the only trick is going to be ensuring that maven does what we think it will! |
Maven will do what it's supposed to do if you specify the classifier... Test it but it should work. Wondering if we could simplify this somehow, rather than just documenting... We'll build a powertools-maven-plugin that reads in your IaC to evaluate the architecture, the target java runtime and choose the best dependencies ;) |
@scottgerring wow nice find! I was curious if the package size had any impact after reading this post, but I did not get to it yet. In my local branch I made some changes and marked the This is mainly because the following observations:
otherwise you can/will end up with both versions on the class path (I noticed this after the
It makes me wonder if we should use the CRT client as the default client, as overall the developer experience will be less. Perhaps it's better to leave the url connection client as the default as it offers quite a good performance over dev experience . We could add documentation on how to use the CRT client for the local/target runtime for the users that want to try and squeeze the most out of their function, but it will be an explicit decision made by the developer to take on the extra burden of the local development vs remote and the dependency synch. Now for some good news.I did another run with the classifier and I see some very promising results compared to the previous benchmark(s). The following table is for the v2 crt client with classifier and latest version of the sdk (2.23.21) and crt library (0.29.9) :
|
Thanks @jreijn. I agree the devX is quite bad and we probably should not do this by default but document it so that people who want better performance can do it... |
Yeah, the developer experience isn't great. I didn't even consider this:
... plus you often want flexibility to mix ARM64/AMD64 on the AWS side. Considering our tenets 1/ Keep it lean and 2/ Work backwards from the community ... I think it is clear we should document how to use the CRT client (including what we've learnt here in terms of JAR-size minimization) and 2/ not use the CRT by default. The extra complexity it pushes onto the developer at the start of the journey outweighs the benefit in our case IMHO. Open question: Is |
I agree that we should leave CRT for now. At best we document how to use it as an optimization. |
@jreijn - FYI - we have passed over our collective findings to the CRT team, who mentioned they have been thinking about this. We will loop back when we have any further interesting information to share! In the meantime, would you like to work on the documentation improvements? I appreciate this isn't as glamorous as code, and we can certainly pick it up on our side if not :) |
@scottgerring great! Yes, I was already looking into adding some documentation. I will probably do that in a separate PR if that's ok. |
@jreijn sounds good! There's a bit of accumulated history on this one. I'll close this and link the findings on the issue. |
#1092
Description of changes:
This PR changes the default SDK HTTP client from UrlConnectionClient to the AWS CRT client. The CRT client is a lightweight client which should reduce the cold start and latency of lambda functions.
Checklist
Breaking change checklist
RFC issue #:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.