instrumentation-mongodb: performance degradation due to _scrubStatement #2516

Open
DanStory opened this issue Nov 4, 2024 · 2 comments
Labels: bug (Something isn't working), pkg:instrumentation-mongodb, priority:p3 (Bugs which cause problems in user apps not related to correctness: performance, memory use, etc.)

Comments

DanStory commented Nov 4, 2024

What version of OpenTelemetry are you using?

@opentelemetry/api: 1.9.0
@opentelemetry/sdk-trace-base: 1.26.0
@opentelemetry/sdk-trace-node: 1.26.0
@opentelemetry/instrumentation-mongodb: 0.48.0

What version of Node are you using?

node: v20.17.0

What did you do?

Insert/update mongo documents that contain large data (BinData, Arrays, etc.).
Specifically, using GridFS to upload large files (which are chunked into 255 KiB documents).

What did you expect to see?

GridFS upload of a 430MiB file takes around a minute (as of instrumentation-mongodb: 0.44.0)

What did you see instead?

GridFS upload of a 430MiB file takes over 5 minutes (as of instrumentation-mongodb: 0.45.0 and after)
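
For a rough sense of scale, the figures above can be worked through as follows. This is back-of-the-envelope arithmetic only: it assumes the GridFS default chunk size of 255 KiB (261120 bytes) and treats "around a minute" and "over 5 minutes" as 60 s and 300 s. The variable names are illustrative.

```typescript
// Back-of-the-envelope numbers for the upload described above.
const MIB = 1024 * 1024;
const fileSizeBytes = 430 * MIB;
const chunkSizeBytes = 255 * 1024; // GridFS default chunk size: 261120 bytes

// Number of chunk documents inserted for one file.
const chunkCount = Math.ceil(fileSizeBytes / chunkSizeBytes);

// Approximate throughput, assuming 60 s before and 300 s after the regression.
const mibPerSecondBefore = 430 / 60;
const mibPerSecondAfter = 430 / 300;

console.log(chunkSizeBytes);                // 261120
console.log(chunkCount);                    // 1727
console.log(mibPerSecondBefore.toFixed(1)); // "7.2"
console.log(mibPerSecondAfter.toFixed(1));  // "1.4"
```

So each upload issues roughly 1,700 chunk inserts, each carrying a binary payload of about 261120 bytes.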

Additional context

We determined that the introduction of _scrubStatement is the culprit: https://github.com/open-telemetry/opentelemetry-js-contrib/pull/1728/files#diff-0dd65d8fe0cee39d03f2ddd06f60730b20553e33521842fdb553056bfc1c8231R929

It iterates over the mongo command (which includes the document) recursively, descending into nested objects and arrays. It does not account for operations such as a GridFS upload, where the command can contain binary data of up to (by default) 261120 items in an Array/BinData/Buffer. I will also note that the implementation would not (IMO) handle Date, ObjectId, etc. correctly either.
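
One possible direction, sketched here only to illustrate the point above, is a scrubber that treats binary and BSON-like values as opaque leaves instead of walking into them. This is not the library's _scrubStatement; the names `scrubValue` and `isOpaqueLeaf` are hypothetical, and the `_bsontype` check is an assumption about how BSON wrapper types (Binary, ObjectId, etc.) can be recognized.

```typescript
// Hypothetical sketch, not the actual _scrubStatement implementation.
type Scrubbed = string | Scrubbed[] | { [key: string]: Scrubbed };

// Values that should be masked as a whole rather than traversed.
function isOpaqueLeaf(value: object): boolean {
  return (
    ArrayBuffer.isView(value) ||     // Buffer, Uint8Array and other typed arrays
    value instanceof ArrayBuffer ||
    value instanceof Date ||
    '_bsontype' in value             // assumption: BSON wrappers expose _bsontype
  );
}

// Replace every leaf with '?', keeping only the key structure of the command.
function scrubValue(value: unknown): Scrubbed {
  if (typeof value !== 'object' || value === null) return '?';
  if (isOpaqueLeaf(value)) return '?';
  if (Array.isArray(value)) return value.map(item => scrubValue(item));
  const out: { [key: string]: Scrubbed } = {};
  for (const [key, child] of Object.entries(value)) {
    out[key] = scrubValue(child);
  }
  return out;
}

// Sample command shaped like a GridFS chunk insert (illustrative data only).
const chunkInsert = {
  insert: 'fs.chunks',
  documents: [{ files_id: 'abc', n: 0, data: new Uint8Array(261120) }],
};
console.log(JSON.stringify(scrubValue(chunkInsert)));
// prints {"insert":"?","documents":[{"files_id":"?","n":"?","data":"?"}]}
```

With this shape, the cost of scrubbing depends on the number of keys in the command rather than on the size of the binary payload.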

image
DanStory added the bug label Nov 4, 2024
DanStory changed the title from "instrumentation-mongodb: perfromance degradation due to _scrubStatement" to "instrumentation-mongodb: performance degradation due to _scrubStatement" Nov 4, 2024

DanStory commented Nov 4, 2024

For the time being, we have worked around the perf issue by supplying our own dbStatementSerializer that replicates the original pre-0.45.0 implementation:

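import { MongoDBInstrumentation } from '@opentelemetry/instrumentation-mongodb';

new MongoDBInstrumentation({
  enhancedDatabaseReporting: false,
  // Mask only the top-level keys of the command; nested values are never visited.
  dbStatementSerializer: cmd =>
    JSON.stringify(
      Object.keys(cmd).reduce(
        (obj, key) => {
          obj[key] = '?';
          return obj;
        },
        {} as { [key: string]: unknown }
      )
    ),
});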
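
To show what this workaround produces, here is the same serializer logic as a standalone function applied to a sample command. The function name `serializeTopLevelKeys` and the sample command are hypothetical and used only for illustration.

```typescript
// Standalone version of the workaround serializer: masks top-level keys only.
function serializeTopLevelKeys(cmd: Record<string, unknown>): string {
  const masked: { [key: string]: string } = {};
  for (const key of Object.keys(cmd)) {
    masked[key] = '?';
  }
  return JSON.stringify(masked);
}

// Sample command shaped like a GridFS chunk insert (illustrative data only).
const sampleCommand = {
  insert: 'fs.chunks',
  documents: [{ n: 0, data: new Uint8Array(261120) }],
  ordered: true,
};

console.log(serializeTopLevelKeys(sampleCommand));
// prints {"insert":"?","documents":"?","ordered":"?"}
```

Because only the top-level keys are visited, the work done per command does not grow with the size of the documents it carries. The trade-off is that nested field names are not reported.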


trentm commented Nov 21, 2024

I added some notes at #2281 (comment) that relate to this issue as well.

pichlermarc added the priority:p3 and pkg:instrumentation-mongodb labels Dec 11, 2024