
serializable does not remove data implicitly stored in partial functions, leading to scaling with data size and potential privacy breaches #750

Open
ablaom opened this issue Mar 25, 2022 · 6 comments


ablaom commented Mar 25, 2022

Edit: the new issue title more accurately describes the problem, which was not immediately diagnosed in the original comment below.


This issue is not related to the master or dev branches but to the breaking release branch for-a-0-point-20-release.

After merging #733 (targeting for-a-0-point-20-release), which was passing CI, and bringing for-a-0-point-20-release up to date with dev (via a regular merge), I'm getting a new error in tests: the file size of serialised objects has become data-dependent. The following is adapted from the failing test:

model = Stack(
    metalearner = FooBarRegressor(lambda=1.),
    model_1 = DeterministicConstantRegressor(),
    model_2 = ConstantRegressor())
DeterministicStack(
    resampling = CV(
            nfolds = 6,
            shuffle = false,
            rng = Random._GLOBAL_RNG()),
    metalearner = FooBarRegressor(
            lambda = 1.0),
    model_1 = DeterministicConstantRegressor(),
    model_2 = ConstantRegressor())

filesizes = []
for n in [100, 500, 1000]
    filename = "serialized_temp_$n.jls"
    X, y = make_regression(n, 1)
    mach = machine(model, X, y)
    fit!(mach, verbosity=0)
    MLJBase.save(filename, mach)
    push!(filesizes, filesize(filename))
    rm(filename)
end

julia> filesizes
3-element Vector{Any}:
 28744
 45144
 65144

@olivierlabayle Are you able to reproduce? Any idea how this could have arisen?


ablaom commented Mar 25, 2022

I've created a test PR, referenced above, of for-a-0-point-20-release onto dev, which hopefully will reproduce the failure.


ablaom commented Mar 26, 2022

Perhaps the resampling update to Stack is responsible.

olivierlabayle (Collaborator) commented:

> Perhaps the resampling update to Stack is responsible.

@ablaom Indeed, I've had a quick look. The problem seems to come from a source node operation (unfortunately this is an anonymous function) which stores an attribute r, a vector of integers, probably the rows used. It is probably related to the fact that I have switched to using train_test_pairs as you suggested. Do you have any clue that could speed up my search?

olivierlabayle (Collaborator) commented:

I think this is coming from the function selectrows, which has to store the data because it is a partial function (a closure). In that sense I don't think anything is wrong with the stack; rather, this is an edge case, partial functions, that is not dealt with by the current logic in serializable. I am a bit unsure what to do about it.
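The mechanism can be sketched with nothing beyond the standard-library Serialization module (the helpers make_selector and serialized_size below are hypothetical, not MLJBase API): a closure that captures a vector of row indices serializes to a size that grows with that vector, exactly the data-dependent scaling reported above.

```julia
using Serialization

# Number of bytes a value occupies when serialized.
serialized_size(x) = (io = IOBuffer(); serialize(io, x); length(take!(io)))

# A "partial function" in the sense above: the returned closure
# implicitly captures the `rows` vector in one of its fields.
make_selector(rows::Vector{Int}) = X -> X[rows]

small = make_selector(collect(1:10))
large = make_selector(collect(1:10_000))

# Serializing the closure serializes the captured rows too,
# so the size scales with the length of `rows`.
serialized_size(small) < serialized_size(large)
```

The captured data is visible as a field of the closure's type (e.g. via fieldnames(typeof(small))), which is why a generic "strip the data" pass in serializable would need to know about closure internals to remove it.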


ablaom commented Mar 28, 2022

Ah-haaa! Yes that makes sense.

@olivierlabayle Thanks for taking the time to diagnose this problem. This is an interesting case I had never thought of: data being stored in partial functions.

I think addressing this issue would be pretty difficult in the current design. We may need to simply document the problem for now, but I'll leave this open. I'll update the issue title and opening comment.

To resolve the current failure, I suggest modifying the test with an explicit declaration of resampling in the form of train-test pairs that are iterators rather than vectors. Presumably the size of an iterator will not scale with the size of the data, but we must explicitly generate the iterators for each data length. We should cross-reference this issue in the modified test to explain the acrobatics.
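A quick sketch of why iterator-style pairs should stay small (again plain Serialization; pairs_lazy and pairs_eager are made-up helpers, not MLJ API): a UnitRange serializes as just its two endpoints regardless of n, while collected index vectors serialize element by element and so scale linearly.

```julia
using Serialization

# Number of bytes a value occupies when serialized.
serialized_size(x) = (io = IOBuffer(); serialize(io, x); length(take!(io)))

# One train-test split of rows 1:n, as lazy ranges vs materialized vectors:
pairs_lazy(n)  = [(1:div(n, 2), div(n, 2)+1:n)]                     # two endpoints each
pairs_eager(n) = [(collect(1:div(n, 2)), collect(div(n, 2)+1:n))]   # n indices total

serialized_size(pairs_lazy(100)) == serialized_size(pairs_lazy(100_000))  # flat in n
serialized_size(pairs_eager(100)) < serialized_size(pairs_eager(100_000)) # scales with n
```

So a test that passes ranges as the resampling declaration can keep the serialized machine's size independent of the data length, which is the invariant the failing test wants to check.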

@olivierlabayle What do you think?

ablaom changed the title from "Issue with serialization" to "serializable does not remove data implicitly stored in partial functions, leading to scaling with data size and potential privacy breaches" on Mar 28, 2022
olivierlabayle (Collaborator) commented:

I am coming back from holiday tomorrow and can have a look, yes! Another solution would be to just perform the test with a simple model, i.e. not a stack; that should work.
