Make UnivariateDriftCalculator (and other objects) JSON serializable #394

KGoldsmith11 · 2024-06-04T15:56:50Z

Currently, the UnivariateDriftCalculator object supports serialization via pickle. However, this format is not compatible with Apache Spark, which I intend to use for processing on the inference side.

For governance reasons, I need to fit the drift calculator object on a different machine from the one where I will perform inference and have access to the analysis chunks, and the machine performing inference uses Spark. Therefore I need to fit the object, serialize it, move it to the inference machine, load it in PySpark, and then calculate the data drift on the inference data. This does not work with pickle, because Spark has no pickle loading methods, but it does have JSON loading methods.

JSON would therefore be a good alternative to pickle for serializing the fitted object.
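To make the request concrete, here is a minimal sketch of the fit, serialize, and restore round trip using only the Python standard library. `FittedDriftState` and `fit` are hypothetical names invented for illustration, not part of the library's API, and the assumption that a fitted calculator can be reduced to per-column reference statistics is a simplification of whatever state the real object holds.

```python
import json
from dataclasses import dataclass, asdict
from typing import Dict, List


@dataclass
class FittedDriftState:
    """Hypothetical container for the state a fitted calculator would persist."""

    column_names: List[str]
    # Per-column reference statistics (assumed here to be mean and std only).
    reference_stats: Dict[str, Dict[str, float]]

    def to_json(self) -> str:
        # Only plain dicts, lists, strings, and floats are stored,
        # so the payload can be read by any JSON loader.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, payload: str) -> "FittedDriftState":
        return cls(**json.loads(payload))


def fit(reference: Dict[str, List[float]]) -> FittedDriftState:
    """Compute simple reference statistics per column (illustrative only)."""
    stats = {}
    for name, values in reference.items():
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        stats[name] = {"mean": mean, "std": variance ** 0.5}
    return FittedDriftState(column_names=list(reference), reference_stats=stats)


# On the fitting machine: fit on reference data and serialize to JSON.
state = fit({"age": [30.0, 40.0, 50.0]})
payload = state.to_json()

# On the inference machine: restore the fitted state from the JSON payload.
restored = FittedDriftState.from_json(payload)
```

Because the payload contains only JSON-native types, it can be moved between machines as a plain text file and does not depend on the Python version or class definitions that pickle requires to match.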


KGoldsmith11 added the enhancement label Jun 4, 2024

KGoldsmith11 assigned nnansters Jun 4, 2024
