Merc 6849 llo benchmarking on staging mainnet don #481
base: main
Conversation
• Moved Ajv schema compilation out of the hot path
• Introduced `schema-cache.ts` with a `WeakMap<object, ValidateFunction>`
• `InputParameters.validateInput` now:
  – strips top-level nulls (matches legacy “missing field” semantics)
  – runs a pre-compiled Ajv validator before per-param logic

* **schema-cache.ts**
  – Creates a single process-wide Ajv instance
  – Caches compiled validators in a WeakMap keyed by the schema object
  – Falls back to on-the-fly compile for boolean schemas (true/false)
* **input-params.ts**
  – Builds the JSON Schema once in the constructor (`definitionToJsonSchema`)
  – Fetches/compiles the validator via `getValidator`
  – Adds fast upfront Ajv validation (`validateFn`)
  – Sanitises `null` values before validation to preserve previous behaviour
  – Replaces ad-hoc runtime checks with brace-wrapped `if` blocks
  – Tightens types (`ValidateFunction`, `ErrorObject`) and removes all `any`
  – Fixes ESLint/TS warnings (curly, index-signature, unused directives)
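Roughly what the caching utility looks like, as a minimal sketch assuming Ajv v8 and the names used above (`getValidator`, boolean-schema fallback); the real `schema-cache.ts` may differ in options and details:

```ts
import Ajv, { SchemaObject, ValidateFunction } from 'ajv'

// Single process-wide Ajv instance (options here are assumptions)
const ajv = new Ajv({ coerceTypes: true })

// Compiled validators cached per schema object; a WeakMap lets an entry be
// garbage-collected once the schema object itself is no longer referenced.
const validatorCache = new WeakMap<object, ValidateFunction>()

export const getValidator = (schema: SchemaObject | boolean): ValidateFunction => {
  // Boolean schemas (true/false) cannot be WeakMap keys, so compile them on the fly
  if (typeof schema === 'boolean') {
    return ajv.compile(schema)
  }
  let validator = validatorCache.get(schema)
  if (!validator) {
    validator = ajv.compile(schema)
    validatorCache.set(schema, validator)
  }
  return validator
}
```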
…e logic
• Introduced persistent HTTP and HTTPS agents with keep-alive enabled, limiting socket churn and improving latency.
• Simplified and clarified queue processing logs and error handling logic.
• Moved keep-alive agent initialization into the Requester constructor to ensure global Axios defaults are set.

* **Requester.ts**
  – Adds persistent HTTP(S) agents with configurable max sockets (`MAX_PARALLEL_HTTP_SOCKETS`, defaulting to 128).
  – Simplifies logging to clearly indicate queue actions (enqueue, overflow, dequeue, retries).
  – Streamlines queue processing logic, preserving request coalescing behavior.
* **Performance**
  – Reduces overhead from socket creation/teardown, improving throughput for concurrent outbound HTTP requests.
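A rough sketch of the keep-alive setup described above, assuming the agents are applied as global Axios defaults; `MAX_PARALLEL_HTTP_SOCKETS` is the env var named in the commit, everything else is illustrative:

```ts
import http from 'http'
import https from 'https'
import axios from 'axios'

const maxSockets = Number(process.env['MAX_PARALLEL_HTTP_SOCKETS'] ?? 128)

// Persistent agents reuse TCP connections across requests instead of opening
// and tearing down a socket for every outbound call.
const httpAgent = new http.Agent({ keepAlive: true, maxSockets })
const httpsAgent = new https.Agent({ keepAlive: true, maxSockets })

// Setting Axios defaults means every request made through Axios picks up the
// agents without per-call configuration.
axios.defaults.httpAgent = httpAgent
axios.defaults.httpsAgent = httpsAgent
```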
… overhead
• Replaced SHA-1 + Base64 with FarmHash’s 64-bit `fingerprint64` converted to a 16-char hex string, eliminating intermediate `Buffer` allocations.
• Cuts hash computation time, reduces event-loop blocking, and lowers GC churn under high QPS.
• Maintains deterministic, fixed-length key segments and preserves the existing JSON-stringify lowercasing logic.
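A minimal sketch of the hashing change, assuming the `farmhash` npm package (whose 64-bit functions return the hash as a decimal string); the exact hex conversion shown here is an assumption:

```ts
import farmhash from 'farmhash'

// fingerprint64 returns the unsigned 64-bit hash as a decimal string; convert
// it to a fixed-length 16-char hex segment without intermediate Buffers.
export const hashCacheKey = (cacheKey: string): string => {
  const digest = farmhash.fingerprint64(cacheKey)
  return BigInt(digest).toString(16).padStart(16, '0')
}
```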
…ustify shutdown
• Refactored background execution to spawn a dedicated loop for every (endpoint × transport) pair, improving isolation and resilience.
• Timers are now managed in a Map keyed by "endpoint:transport", ensuring precise control and cleanup.
• Enhanced shutdown logic: all timers are reliably cleared when the HTTP server closes, preventing resource leaks.
• Improved error and timeout handling: background loops always survive exceptions, maintaining continuous operation.
• Metrics are now labeled per (endpoint, transport) for more granular observability.

* **background-executor.ts**
  – Replaces single-loop logic with per-transport loop spawning.
  – Adds robust timer management and graceful shutdown via a centralized stopAll routine.
  – Refactors metric collection to use consistent, cached label handles.
  – Ensures backgroundExecute errors and timeouts do not interrupt future executions.
* **Test Coverage**
  – All existing unit tests updated and passing.
  – Test expectations adjusted for new scheduling granularity.
* **Performance & Reliability**
  – Reduces risk of cross-transport interference and timer leaks.
  – More maintainable and observable background execution model.
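An illustrative sketch of the per-(endpoint × transport) scheduling described above; this is not the actual `background-executor.ts`, and metrics and timeout handling are omitted:

```ts
type LoopKey = `${string}:${string}` // "endpoint:transport"

const timers = new Map<LoopKey, NodeJS.Timeout>()
let stopped = false

export const spawnLoop = (
  endpointName: string,
  transportName: string,
  backgroundExecute: () => Promise<void>,
  intervalMs: number,
): void => {
  const key: LoopKey = `${endpointName}:${transportName}`
  const tick = async (): Promise<void> => {
    try {
      await backgroundExecute() // errors must never kill the loop
    } catch {
      // log and continue; the next iteration is still scheduled below
    }
    if (!stopped) {
      timers.set(key, setTimeout(tick, intervalMs))
    }
  }
  timers.set(key, setTimeout(tick, 0))
}

// Called when the HTTP server closes: clear every timer so nothing leaks.
export const stopAll = (): void => {
  stopped = true
  for (const timer of timers.values()) {
    clearTimeout(timer)
  }
  timers.clear()
}
```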
• Introduced `fast-serialize.ts` module for optimized JSON serialization of adapter responses, aiming to reduce CPU/GC overhead.
• Added `FAST_SERIALIZATION_ENABLED` configuration option (default: true) to toggle the feature.
• Integrated custom serialization into Fastify's reply pipeline, conditional on the new setting.
• Added `ea_response_serialization_duration_seconds` Prometheus histogram to track 'fast' vs. 'standard' serialization performance.
• Implemented comprehensive unit tests for the new fast serialization logic, covering correctness and various data structures.
• Added API-level tests to verify custom serializer integration with Fastify.
• Achieved 100% test coverage for the schema validation caching utility (`src/validation/schema-cache.ts`) with new test cases.

* **src/util/fast-serialize.ts**
  – New module providing `serializeResponse`, `serializeSuccessResponse`, `serializeErrorResponse`, and `escapeString`.
* **src/index.ts**
  – Modified `buildRestApi` to use the custom serializer via `reply.serializer()` when `FAST_SERIALIZATION_ENABLED` is true.
* **src/config/index.ts**
  – Added `FAST_SERIALIZATION_ENABLED` setting.
* **src/metrics/index.ts**
  – Added `ea_response_serialization_duration_seconds` metric.
* **test/util/fast-serialize.test.ts**
  – New file with unit tests for `serializeResponse` and its helpers.
* **test/api-serialization.test.ts**
  – New file testing Fastify integration for serialization.
* **test/validation/schema-cache.test.ts**
  – New file with tests for `getValidator` caching and validation rules.
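A hedged sketch of how the custom serializer can hook into Fastify's reply pipeline; `serializeResponse` and `FAST_SERIALIZATION_ENABLED` come from the file list above, while the route and import path are purely illustrative:

```ts
import Fastify from 'fastify'
import { serializeResponse } from './util/fast-serialize' // path assumed from the file list above

const app = Fastify()
const fastSerializationEnabled = process.env['FAST_SERIALIZATION_ENABLED'] !== 'false'

app.get('/', async (_req, reply) => {
  const response = { statusCode: 200, result: 42, data: { result: 42 } }
  if (fastSerializationEnabled) {
    // reply.serializer() replaces JSON.stringify for this reply only
    reply.serializer((payload) => serializeResponse(payload))
  }
  return response
})
```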
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: David de Kloet <[email protected]>
Potential avenues for improvement:
- The Redis cache implementation that EAs use has a local in-memory cache, a layered cache so the adapter doesn't have to read from Redis all the time. This has evolved since the first implementations, but you can see that it returns directly if the values are found in memory. Some things here:
  - Looking at the metrics, however, the adapter still makes a lot of GET requests to Redis, and I would expect far fewer if the in-memory cache were working as intended. Fixing this would be a huge win for timeliness of requests.
  - Redis is also used as a "subscription set", the layer between read and write that lets the adapter know which values to fetch. This should also have an in-memory layer (see the sketch below), because the ZADD ops that add to this set also run at hundreds per second and most of them are unnecessary: entries only fall out of the set after a while, probably minutes. This should be another huge win.
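Something along these lines, just to make the subscription-set point concrete (ioredis-style client assumed; the TTL value and names are placeholders):

```ts
import Redis from 'ioredis'

const redis = new Redis()

// Remember what we ZADDed recently so the Redis round trip can be skipped
// while the local entry is still fresh. The TTL should stay well below the
// subscription set's own expiry.
const locallyAdded = new Map<string, number>() // member -> local expiry (ms)
const LOCAL_TTL_MS = 60_000

export const addToSubscriptionSet = async (setName: string, member: string): Promise<void> => {
  const now = Date.now()
  const expiresAt = locallyAdded.get(member)
  if (expiresAt !== undefined && expiresAt > now) {
    return // added recently, skip the ZADD entirely
  }
  await redis.zadd(setName, now, member)
  locallyAdded.set(member, now + LOCAL_TTL_MS)
}
```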
// Checks if an individual transport has a backgroundExecute function, and executes it if it does
const callBackgroundExecute = (
// Spawn one loop per (endpoint × transport)
This was already setting up a separate one per background execute, so I don't think this should have a meaningful impact. Maybe the recursive vs. separate timeout could, but in theory it should not have an effect.
logger.debug(`Clearing timeout for endpoint "${endpointName}"`)
timeoutsMap[endpointName].unref()
clearTimeout(timeoutsMap[endpointName])
export function callBackgroundExecutes(adapter: Adapter, apiShutdownPromise?: Promise<void>): void {
All of this would be on the writer side and not executed very frequently; I would advise separating this change from the first attempts, to de-risk.
shasum.update(cacheKey)
return shasum.digest('base64')
}
const digest = farmhash.fingerprint64(cacheKey)
This indeed could save time, as it's one of the things that is executed quite frequently and on reads. These cache keys are at the core of the framework, serving as the identifier for requests. A couple of things to note here:
- This is an improvement to make for sure, but it may not affect adapters, since hashing is only applied when the key is too long.
- The original cache key is just the stringified request object. That is most likely where most of the effort will go; profiling should be able to tell what's going on. (Rough shape of the logic in the sketch below.)
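For context, roughly the shape of the logic I mean (the threshold name and value are placeholders; the fallback shown is the current SHA-1 + Base64 path this PR replaces):

```ts
import { createHash } from 'crypto'

const MAX_CACHE_KEY_LENGTH = 256 // placeholder threshold

export const calculateCacheKey = (data: unknown): string => {
  const key = JSON.stringify(data) // the "original" cache key is just the stringified request
  if (key.length <= MAX_CACHE_KEY_LENGTH) {
    return key // short keys are used as-is, so hashing never runs for them
  }
  // only keys that are too long get hashed (currently SHA-1 -> Base64)
  return createHash('sha1').update(key).digest('base64')
}
```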
If we're doing Streams-specific features: at the cost of larger payload sizes, this calculation could be done once at the job level and the cache key sent directly as a parameter in the payload, bypassing all of this calculation entirely and speeding up reads, but putting the onus of keeping these keys correct on job production.
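Purely hypothetical illustration of what I mean (the field and function names are made up):

```ts
interface StreamsRequest {
  data: Record<string, unknown>
  cacheKey?: string // precomputed by the job producer and shipped in the payload
}

// If the job already sent a key, trust it and skip stringify/hash entirely;
// otherwise fall back to the current calculation.
export const resolveCacheKey = (
  req: StreamsRequest,
  compute: (data: Record<string, unknown>) => string,
): string => req.cacheKey ?? compute(req.data)
```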
* @param response - The response object to serialize
* @returns JSON string representation of the response
*/
export function serializeResponse<T extends ResponseGenerics>(
I'd be really curious to A/B test this feature: JSON.stringify, even though slow, is backed by a C implementation IIRC, but for small payloads the string concatenation is probably faster? It seems somewhat brittle to me though; it's good to try, but the approach should be stricter if it turns out to give significant benefits.
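Something like this quick micro-benchmark would already tell us a lot (payload shape and iteration count are arbitrary; swap in `serializeResponse` for the B side):

```ts
const payload = {
  statusCode: 200,
  result: 123.45,
  data: { result: 123.45 },
  timestamps: { providerDataReceivedUnixMs: Date.now() },
}

const bench = (label: string, fn: () => string, iterations = 100_000): void => {
  const start = process.hrtime.bigint()
  for (let i = 0; i < iterations; i++) {
    fn()
  }
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6
  console.log(`${label}: ${elapsedMs.toFixed(1)} ms for ${iterations} iterations`)
}

bench('JSON.stringify', () => JSON.stringify(payload))
// bench('fast-serialize', () => serializeResponse(payload))
```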
* It additionally serves to coalesce requests by utilizing a more complex queue structure:
* - ignores duplicate items via a provided key
* - doesn't use the request itself because it's common for those to have things like timestamps/nonces
* Manages outbound HTTP requests with queueing, rate limiting, and connection reuse.
This one I would honestly remove entirely. It adds complexity but is only useful for outbound HTTP requests, and we're not really making that many. Most LL adapters use a streaming protocol anyway, so this should not affect performance at all.
/* -------------------------------------------------------------------------- */
/*                                 TYPE HELPERS                                */
/* -------------------------------------------------------------------------- */
I would DEFINITELY not remove all these comments haha, I added them with a lot of love and I'm sure our AI overlords appreciate them, even if they hallucinate reasons to remove them.
/* New: compile schema once */
this.schema = definitionToJsonSchema(this.definition)
this.validateFn = getValidator(this.schema)
We tried ajv fast too, but at the time the parameters were very complex and it proved impossible to do the migrations properly, so it was quickly rejected. Still, it's a big time sink on read request execution, so any speedup helps a lot and is worth a try. I would first consider the approach where validation is skipped entirely for Mercury jobs though, as we avoid all of this altogether. Paired with the suggestion to precalculate the cache keys, requests could be resolved in just 2 lines and be blazing fast.
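Roughly the kind of fast path I'm imagining (names are hypothetical; assumes the precomputed cache key from the earlier comment and no input validation for these endpoints):

```ts
interface FastReadRequest {
  cacheKey: string // precomputed upstream, so no stringify/hash/validation here
}

export const handleFastRead = async (
  req: FastReadRequest,
  cache: { get: (key: string) => Promise<unknown> },
) => {
  const cached = await cache.get(req.cacheKey)
  return cached ?? { statusCode: 504, errorMessage: 'No data available yet' }
}
```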
Another thing to consider is parallelizing multiple read instances and a write instance in a single Dockerfile; it should be doable with some unusual constructs and would not require any changes to the interfaces.