-
Notifications
You must be signed in to change notification settings - Fork 161
We need to define consistent semantics of passing collections to tracer methods #4
Comments
TL;DR; my 2p is (benchmark, )make a call, and document it. Documenting how to mitigate untrusted input (ex via decorators) is a choice we can make, regardless of how unpopular that would be for safety/security minds. I'd encourage benchmarks to show the impact, since if we are successful, we will have someone ask about a vulnerability sooner or later. We may not be able to "afford" this rigor quite yet, and I also have no idea what tools are available in python. -- non-python asides which may or may not help Immutable collections are often compared so as to not be copied. You see this a lot in google code. Librarians working at google ensure common libraries can be used safely and performantly, often with Caliper (and more recently JMH) to prove they perform. At twitter, even equals had to be written in a way that wouldn't leak timing information. At square/android, hefty disclaimers were added when untrusted code was allowed access to internal data structures. I've personally had to spend days on something similar due to a vulnerability report on a seemingly reasonable library design decision. |
Re the actual subject of this issue, I am fine with either option as long as we document the choice. My assumption as a library user is that the param would be passed by reference (i.e., ownership hands off to the library), FWIW. Re the thing I commented about in that PR: I'm not aiming for parity with golang, I'm aiming for supporting the minimum reasonable number of ways to do the same thing. I.e., if we support I agree that this is less of a concern in python due to optional params and method overloading, though. |
@adriancole on the benchmarks, this is something I am asked often internally as well, and don't have a good story to tell. Because raw overhead of a tracer that does all the work but never reports collected spans is orders of magnitude less than if it reports every span. What's the best way to measure overhead then?
|
good q. I get that from an end-to-end pov, the traces sent/timeunit is for many the key metric and a benchmark. It looks like you've that above. Wrt to things like api design, it would be a microbench that can measure micros or nanos w/o side effects. So here it would clearly show the difference in choices like this. The first bench (like you have above) is more important than the latter, imho, as if a design decision for performance doesn't move the needle, it sortof proves itself a micro-optimization (which would be testable in a microbenchmark) |
We can either emphasize safety by requiring tracer to always take copies of passed collections if it wants to store them, or emphasize performance by telling the caller that it must always give up ownership of collections.
My vote is to emphasize performance. The safety can be achieved by decorators.
The text was updated successfully, but these errors were encountered: