Update get_callback_id, register_callback #97
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This commit reflects 2 suggestions that are very closely related to dotnet/aspnetcore#40832. If the signatures within the callback_manager class are currently fixed, then we should consider the following changes:
I ran a very simplified mockup of the methods around get_callback_id and register_callback. The tests were run across ARM32 and x64, Windows and Linux, and using multiple language standards. The table also includes testing 3 different ways the callback map can be updated. This was run on my office hardware with compilations that represent real-world use cases.
Here's a table of the results. Note the
make-pair
column represents using the insert method. The last 6 columns are elapsed time in seconds.+
make-pair
+
operator
+
emplace
+
make-pair
+
operator
+
emplace
Using std::to_string in get_callback_id
If you look at the table above with the perspective of "how does utilizing std::to_string affect performance with respect to the original stringstream approach", then the following table is the % difference between the suggested std::to_string technique and the original stringstream implementation (larger is better).
Replacing the stringstream conversion with std::to_string is faster in all cases, no matter what technique is chosen to update the callback map. The average gain is right around the 47.7% mark.
Updating the map in register_callback
In the register callback method, the question of how to add information to the map comes down to 3 possible options: insert with std::make_pair, index operator [], or emplace. This is a little harder to see but if you look at the relative ranking of the total execution time, the following table emerges (lower is better).
+
make-pair
+
operator
+
emplace
+
make-pair
+
operator
+
emplace
In short, emplace tends to perform better in nearly all the tested platforms, whether or not the original stringstream or the suggested to_string approach is used.
I'm a little surprised that the original insert methodology (with make-pair) is somehow better on RPi (see asterisk) as we tend to use that platform quite frequently. Seems using emplace is worse in that case. However, without doing a bunch of preprocessor defines to swap out techniques based on platform, I think the greater good can be served by switching to emplace. If someone wants to help explore the RPi perf difference, that effort may help guide the implementation of a more tailored solution that fits more configurations. It could be an old compiler issue, ARM vs x64, ARM32-specific, etc.
Finally, even though emplace generally works better, the perf gain is small compared to the to_string change. This change, assuming to_string is accepted and the ARM32 results are omitted, yields a modest an average of 6.26% improvement over the original insert technique.
Summary:
Two small changes can help reduce time spent in creating and registering the callback id. The two changes are submitted for review as a unit but they could be taken individually if desired.