-
Notifications
You must be signed in to change notification settings - Fork 671
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inefficient and undertyped to_json public data structures #3364
Comments
Introduce `to_dict` to the objects included in the existing JSON serialization process for `ReadableSpan`, `MetricsData`, `LogRecord`, and `Resource` objects. This includes adding `to_dict` to objects that are included within the serialized data structures of these objects. In places where `repr()` serialization was used, it has been replaced by a JSON-compatible serialization instead. Inconsistencies between null and empty string values were preserved, but in cases where attributes are optional, an empty dictionary is provided as well to be more consistent with cases where attributes are not optional and an empty dictionary represents no attributes were specified on the containing object. These changes also included: 1. Dictionary typing was included for all the `to_dict` methods for clarity in subsequent usage. 2. `DataT` and `DataPointT` were did not include the exponential histogram types in point.py, and so those were added with new `to_json` and `to_dict` methods as well for consistency. It appears that the exponential types were added later and including them in the types might have been overlooked. Please let me know if that is a misunderstanding on my part. 3. OrderedDict was removed in a number of places associated with the existing `to_json` functionality given its redundancy for Python 3.7+ compatibility. I was assuming this was legacy code for previous compatibility, but please let me know if that's not the case as well. 4. `to_dict` was added to objects like `SpanContext`, `Link`, and `Event` that were previously being serialized by static methods within the `ReadableSpan` class and accessing private/protected members. This simplified the serialization in the `ReadableSpan` class and those methods were removed. However, once again, let me know if there was a larger purpose to those I could not find. Finally, I used `to_dict` as the method names here to be consistent with other related usages. For example, `dataclasses.asdict()`. But, mostly because that was by far the most popular usage within the larger community: 328k files found on GitHub that define `to_dict` functions, which include some of the most popular Python libraries to date: https://github.com/search?q=%22def+to_dict%28%22+language%3APython&type=code&p=1&l=Python versus 3.3k files found on GitHub that define `to_dictionary` functions: https://github.com/search?q=%22def+to_dictionary%28%22+language%3APython&type=code&l=Python However, if there is a preference for this library to use `to_dictionary` instead let me know and I will adjust. Fixes open-telemetry#3364
Introduce `to_dict` to the objects included in the existing JSON serialization process for `ReadableSpan`, `MetricsData`, `LogRecord`, and `Resource` objects. This includes adding `to_dict` to objects that are included within the serialized data structures of these objects. In places where `repr()` serialization was used, it has been replaced by a JSON-compatible serialization instead. Inconsistencies between null and empty string values were preserved, but in cases where attributes are optional, an empty dictionary is provided as well to be more consistent with cases where attributes are not optional and an empty dictionary represents no attributes were specified on the containing object. These changes also included: 1. Dictionary typing was included for all the `to_dict` methods for clarity in subsequent usage. 2. `DataT` and `DataPointT` were did not include the exponential histogram types in point.py, and so those were added with new `to_json` and `to_dict` methods as well for consistency. It appears that the exponential types were added later and including them in the types might have been overlooked. Please let me know if that is a misunderstanding on my part. 3. OrderedDict was removed in a number of places associated with the existing `to_json` functionality given its redundancy for Python 3.7+ compatibility. I was assuming this was legacy code for previous compatibility, but please let me know if that's not the case as well. 4. `to_dict` was added to objects like `SpanContext`, `Link`, and `Event` that were previously being serialized by static methods within the `ReadableSpan` class and accessing private/protected members. This simplified the serialization in the `ReadableSpan` class and those methods were removed. However, once again, let me know if there was a larger purpose to those I could not find. Finally, I used `to_dict` as the method names here to be consistent with other related usages. For example, `dataclasses.asdict()`. But, mostly because that was by far the most popular usage within the larger community: 328k files found on GitHub that define `to_dict` functions, which include some of the most popular Python libraries to date: https://github.com/search?q=%22def+to_dict%28%22+language%3APython&type=code&p=1&l=Python versus 3.3k files found on GitHub that define `to_dictionary` functions: https://github.com/search?q=%22def+to_dictionary%28%22+language%3APython&type=code&l=Python However, if there is a preference for this library to use `to_dictionary` instead let me know and I will adjust. Fixes open-telemetry#3364
Introduce `to_dict` to the objects included in the existing JSON serialization process for `ReadableSpan`, `MetricsData`, `LogRecord`, and `Resource` objects. This includes adding `to_dict` to objects that are included within the serialized data structures of these objects. In places where `repr()` serialization was used, it has been replaced by a JSON-compatible serialization instead. Inconsistencies between null and empty string values were preserved, but in cases where attributes are optional, an empty dictionary is provided as well to be more consistent with cases where attributes are not optional and an empty dictionary represents no attributes were specified on the containing object. These changes also included: 1. Dictionary typing was included for all the `to_dict` methods for clarity in subsequent usage. 2. `DataT` and `DataPointT` were did not include the exponential histogram types in point.py, and so those were added with new `to_json` and `to_dict` methods as well for consistency. It appears that the exponential types were added later and including them in the types might have been overlooked. Please let me know if that is a misunderstanding on my part. 3. OrderedDict was removed in a number of places associated with the existing `to_json` functionality given its redundancy for Python 3.7+ compatibility. I was assuming this was legacy code for previous compatibility, but please let me know if that's not the case as well. 4. `to_dict` was added to objects like `SpanContext`, `Link`, and `Event` that were previously being serialized by static methods within the `ReadableSpan` class and accessing private/protected members. This simplified the serialization in the `ReadableSpan` class and those methods were removed. However, once again, let me know if there was a larger purpose to those I could not find. Finally, I used `to_dict` as the method names here to be consistent with other related usages. For example, `dataclasses.asdict()`. But, mostly because that was by far the most popular usage within the larger community: 328k files found on GitHub that define `to_dict` functions, which include some of the most popular Python libraries to date: https://github.com/search?q=%22def+to_dict%28%22+language%3APython&type=code&p=1&l=Python versus 3.3k files found on GitHub that define `to_dictionary` functions: https://github.com/search?q=%22def+to_dictionary%28%22+language%3APython&type=code&l=Python However, if there is a preference for this library to use `to_dictionary` instead let me know and I will adjust. Fixes open-telemetry#3364
Thanks for opening this issue. Points 1 and 2 make sense to me, I agree we should improve those.
The console exporter not intended to be a public data structure, or to be parseable even. We just use JSON because it's convenient, but other languages use different representations. There are a lot more details in open-telemetry/opentelemetry-specification#3565, I'd recommend chiming in there. As a workaround for anyone reading this, if you'd like to log out json lines of OTLP protobufs, you can use write your own small exporter using https://github.com/open-telemetry/opentelemetry-python/tree/main/exporter/opentelemetry-exporter-otlp-proto-common as a util. This should be more performant and I think the only reason we don't do this for ConsoleExporters is to avoid pulling the protobuf dependency in. |
Introduce `to_dict` to the objects included in the existing JSON serialization process for `ReadableSpan`, `MetricsData`, `LogRecord`, and `Resource` objects. This includes adding `to_dict` to objects that are included within the serialized data structures of these objects. In places where `repr()` serialization was used, it has been replaced by a JSON-compatible serialization instead. Inconsistencies between null and empty string values were preserved, but in cases where attributes are optional, an empty dictionary is provided as well to be more consistent with cases where attributes are not optional and an empty dictionary represents no attributes were specified on the containing object. These changes also included: 1. Dictionary typing was included for all the `to_dict` methods for clarity in subsequent usage. 2. `DataT` and `DataPointT` were did not include the exponential histogram types in point.py, and so those were added with new `to_json` and `to_dict` methods as well for consistency. It appears that the exponential types were added later and including them in the types might have been overlooked. Please let me know if that is a misunderstanding on my part. 3. OrderedDict was removed in a number of places associated with the existing `to_json` functionality given its redundancy for Python 3.7+ compatibility. I was assuming this was legacy code for previous compatibility, but please let me know if that's not the case as well. 4. `to_dict` was added to objects like `SpanContext`, `Link`, and `Event` that were previously being serialized by static methods within the `ReadableSpan` class and accessing private/protected members. This simplified the serialization in the `ReadableSpan` class and those methods were removed. However, once again, let me know if there was a larger purpose to those I could not find. Finally, I used `to_dict` as the method names here to be consistent with other related usages. For example, `dataclasses.asdict()`. But, mostly because that was by far the most popular usage within the larger community: 328k files found on GitHub that define `to_dict` functions, which include some of the most popular Python libraries to date: https://github.com/search?q=%22def+to_dict%28%22+language%3APython&type=code&p=1&l=Python versus 3.3k files found on GitHub that define `to_dictionary` functions: https://github.com/search?q=%22def+to_dictionary%28%22+language%3APython&type=code&l=Python However, if there is a preference for this library to use `to_dictionary` instead let me know and I will adjust. Fixes open-telemetry#3364
Issue opened in response to this discussion:
#3346 (comment)
where nested json serde behaviors are less than ideal for converting to a public data structure as used in the ConsoleExporter.
Issues with the current implementation are:
repr()
are used in places where they could/would produce invalid JSON objects, which breaks the contract of ato_json
public method.to_json
output is used directly by ConsoleExporter among other locations, which makes it a public data structure. However, no type definitions or other contracts exist to provide consistency with the public, structured outputs.The text was updated successfully, but these errors were encountered: