fix(aws-lambda): added performance improvements #1315

Draft
wants to merge 17 commits into
base: main

kirrg001
Contributor

@kirrg001 kirrg001 commented Sep 5, 2024

refs https://jsw.ibm.com/browse/INSTA-13498

  • live test with lambda layer
  • live test without lambda layer
  • can the backend still connect the spans if we send 10 each?
    • backend confirmed: not a problem as long as they arrive within 20s.
    • add a protection for the 20s limit! -> we cannot add logic for that; we have to tell customers to disable the feature for calls longer than 20s.
  • layer too slow with AND without span buffering - https://github.ibm.com/instana/lambda-extension/pull/18
  • Span buffer
  • fix not waiting for the current invocation's requests to finish (problem: the lambda gets frozen and, on the next invocation, the data gets sent out but also goes to the awaitnext iteration!) -> Fixed -> reproduce and prove it's no longer happening
  • create multiple PRs
  • 250 span iteration -> bug???
  • add a load test with different scenarios (automate?)
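
The batching and 20s notes in the list above can be sketched roughly as follows. This is only an illustration, assuming "10 each" means at most 10 spans per request; `chunk`, `fitsCorrelationWindow` and `CORRELATION_WINDOW_MS` are hypothetical names and not part of the tracer or the extension.

```typescript
// Hypothetical constant: the 20s figure confirmed by the backend in the list above.
const CORRELATION_WINDOW_MS = 20000;

// Splits a list of spans (or anything else) into batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Returns true if an invocation of the given duration stays within the window
// in which the backend can still connect spans that arrive in separate requests.
function fitsCorrelationWindow(invocationDurationMs: number): boolean {
  return invocationDurationMs <= CORRELATION_WINDOW_MS;
}
```

Since no protection can be added in code, a check like `fitsCorrelationWindow` is only useful for documentation or load-test reporting, for example to flag scenarios where customers would need to disable the feature.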

CASES

The results always differ slightly. Factors such as the speed of AWS, the network, and the backend's reply times play a role.

Heartbeat performance fix

This fix was extracted from this PR and has already been released: 0c001c8

We have seen too many random errors regarding failed heartbeat requests to the layer.

Backend down / Host not found

  • 2 spans
  • host not found triggers retries and runs longer than the actual lambda execution
  • fixed bug in extension
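
One way to keep a "host not found" failure from outliving the lambda execution is to treat it as non-retryable and to cap the total time spent on sending. The sketch below is not the actual fix in the extension; `shouldRetry`, `RetryDecisionInput` and the error-code list are assumptions for illustration. It uses `ENOTFOUND`, the code Node.js reports for failed DNS lookups.

```typescript
// Hypothetical input for a retry decision.
interface RetryDecisionInput {
  errorCode: string;   // e.g. "ENOTFOUND" (DNS lookup failed) or "ETIMEDOUT"
  attempt: number;     // 1-based number of the attempt that just failed
  maxAttempts: number; // upper bound on attempts, including the first one
  elapsedMs: number;   // time already spent on sending
  budgetMs: number;    // total time we are willing to spend on sending
}

// Assumption: an unresolvable host will not start resolving within one invocation.
const NON_RETRYABLE_CODES = new Set<string>(["ENOTFOUND"]);

function shouldRetry(input: RetryDecisionInput): boolean {
  if (NON_RETRYABLE_CODES.has(input.errorCode)) {
    return false; // retrying would only extend the billed duration
  }
  if (input.attempt >= input.maxAttempts) {
    return false; // attempts exhausted
  }
  return input.elapsedMs < input.budgetMs; // stop once the time budget is used up
}
```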

Current layer

Average Response Time: 367.958ms
Average Billed Duration: 1502.000ms

Local branch

Average Response Time: 350.958ms
Average Billed Duration: 295.500ms

Backend slow / timeout (8 spans, no span batching)

  • lambda fn in US -> serverless endpoint in Asia
  • layer tries twice to send the spans - both failing (this adds a delay on top)

Current layer

Average Response Time: 962.348ms
Average Billed Duration: 1567.000ms

Local branch

Average Response Time: 974.408ms
Average Billed Duration: 1356.900ms

Backend performs (8 spans, no span batching)

  • lambda fn in US -> serverless endpoint in US

Current layer

Average Response Time: 949.076ms
Average Billed Duration: 904.267ms

Local branch

Average Response Time: 957.742ms
Average Billed Duration: 882.929ms
Average Backend Response Time: 65.178ms

Very similar, BUT there was a bug that caused work from one invocation to be executed during the next invocation. That's why the current layer shows similar results.
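
The general idea behind avoiding this kind of spill-over is to track every in-flight request and wait for all of them, within a time budget, before the invocation ends. The following is a minimal sketch of that idea, not the code in this PR; `PendingRequests` and its methods are hypothetical names.

```typescript
// Hypothetical helper: tracks in-flight send operations of the current invocation.
class PendingRequests {
  private pending = new Set<Promise<void>>();

  // Registers a request. Failures are swallowed so that one failed send
  // does not reject the flush of all the others.
  track(request: Promise<void>): void {
    const settled: Promise<void> = request
      .catch(() => undefined)
      .then(() => {
        this.pending.delete(settled);
      });
    this.pending.add(settled);
  }

  get size(): number {
    return this.pending.size;
  }

  // Resolves to true once all tracked requests have settled,
  // or to false if the time budget ran out first.
  async flush(timeoutMs: number): Promise<boolean> {
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<boolean>((resolve) => {
      timer = setTimeout(() => resolve(false), timeoutMs);
    });
    const all = Promise.all(Array.from(this.pending)).then(() => true);
    const finished = await Promise.race([all, timeout]);
    if (timer !== undefined) {
      clearTimeout(timer);
    }
    return finished;
  }
}
```

A handler wrapper would call `flush` right before returning, so that no request is left pending when the lambda is frozen.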

10 spans (one /traces, one /bundle)

Backend slow / timeout (100 spans, span batching enabled)

  • lambda fn in US -> serverless endpoint in Asia

Backend performs (100 spans, span batching enabled)

Current layer

Average Response Time: 10280.682ms
Average Billed Duration: 10231.250ms

Local branch

Average Response Time: 10270.244ms
Average Billed Duration: 10195.500ms
Average Backend Response Time: 59.806ms
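
To compare the cases at a glance, the relative reduction in billed duration can be computed from the averages listed above. The snippet is only a convenience sketch; the numbers are copied from the measurements in this description, and the names `CaseResult` and `reductionPercent` are made up for illustration.

```typescript
interface CaseResult {
  name: string;
  currentLayerMs: number; // average billed duration, current layer
  localBranchMs: number;  // average billed duration, local branch
}

// Average billed durations as reported in the cases above.
const billedDurations: CaseResult[] = [
  { name: "Backend down / host not found", currentLayerMs: 1502.0, localBranchMs: 295.5 },
  { name: "Backend slow / timeout (8 spans)", currentLayerMs: 1567.0, localBranchMs: 1356.9 },
  { name: "Backend performs (8 spans)", currentLayerMs: 904.267, localBranchMs: 882.929 },
  { name: "Backend performs (100 spans, batching)", currentLayerMs: 10231.25, localBranchMs: 10195.5 },
];

// Relative reduction in percent; positive means the local branch is cheaper.
function reductionPercent(beforeMs: number, afterMs: number): number {
  return ((beforeMs - afterMs) / beforeMs) * 100;
}

for (const c of billedDurations) {
  const reduction = reductionPercent(c.currentLayerMs, c.localBranchMs);
  console.log(`${c.name}: ${reduction.toFixed(1)}% lower billed duration`);
}
```

The "host not found" case shows by far the largest reduction, roughly 80%, while the cases with a healthy backend change by only a few percent or less.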

@kirrg001 kirrg001 changed the title from "fix(aws-lambda): reduced lambda runtimes for large number of spans" to "fix(aws-lambda): reduced lambda latency for large number of spans" Sep 5, 2024
@kirrg001 kirrg001 changed the title from "fix(aws-lambda): reduced lambda latency for large number of spans" to "fix(serverless): reduced lambda latency for large number of spans" Sep 10, 2024
@kirrg001 kirrg001 changed the title from "fix(serverless): reduced lambda latency for large number of spans" to "fix(aws-lambda): reduced lambda latency for large number of spans" Sep 10, 2024
@kirrg001 kirrg001 changed the title from "fix(aws-lambda): reduced lambda latency for large number of spans" to "fix(aws-lambda): reduced latency for large number of spans" Sep 10, 2024
@kirrg001 kirrg001 force-pushed the short-fix-lambda branch 3 times, most recently from dad39ed to 83ae371 Compare November 19, 2024 16:07
@kirrg001 kirrg001 changed the title from "fix(aws-lambda): reduced latency for large number of spans" to "fix(aws-lambda): added performance improvements" Nov 20, 2024