MinMaxScaler (and more) #816
Comments
@egolep Thanks for this. Despite your definition function fit(transformer::MinMaxScaler, verbosity::Int, X::Any), you are getting ERROR: MethodError: no method matching fit(::MinMaxScaler, ::Int64, ::DataFrame). Maybe this is a dumb question, but did you
BTW, you may want to focus on just the univariate case, in view of JuliaAI/MLJModels.jl#288.
Hi @ablaom, Now the univariate case does work and I'm going to call it a day, following your suggestion. Many thanks for your reply!
I lied. Now MinMaxScaler() works too, since leaving it unfinished triggered all my OCDs. Thanks again for your replies. If I implement more models of this kind, would it be worth creating a pull request? Or is the number of these transformers kept small for a reason?
No, a PR to MLJModels is most welcome. You'll need to add a test...
@egolep Are you still interested in making a PR to MLJModels.jl? Let me know if I can help you make that happen.
It would be very nice to have more transformers than Standardizer, OneHotEncoder and BoxCox (and their univariate versions).
I even tried to implement a MinMaxScaler using Standardizer as an example, but I keep getting:
[ Info: Training Machine{MinMaxScaler,…} @194.
┌ Error: Problem fitting the machine Machine{MinMaxScaler,…} @194.
└ @ MLJBase ~/.julia/packages/MLJBase/AkJde/src/machines.jl:484
[ Info: Running type checks...
[ Info: Type checks okay.
ERROR: MethodError: no method matching fit(::MinMaxScaler, ::Int64, ::DataFrame)
Closest candidates are:
fit(::MLJBase.Stack{modelnames, inp_scitype, tg_scitype} where {modelnames, inp_scitype, tg_scitype}, ::Int64, ::Any, ::Any) at /home/egolep/.julia/packages/MLJBase/AkJde/src/composition/models/stacking.jl:277
fit(::Union{MLJIteration.DeterministicIteratedModel{M}, MLJIteration.ProbabilisticIteratedModel{M}} where M, ::Any, ::Any...) at /home/egolep/.julia/packages/MLJIteration/Twn0E/src/core.jl:51
fit(::Union{MLJTuning.DeterministicTunedModel{T, M}, MLJTuning.ProbabilisticTunedModel{T, M}}, ::Integer, ::Any...) where {T, M} at /home/egolep/.julia/packages/MLJTuning/QFcuQ/src/tuned_models.jl:592
...
Stacktrace:
[1] fit_only!(mach::Machine{MinMaxScaler, true}; rows::Vector{Int64}, verbosity::Int64, force::Bool)
@ MLJBase ~/.julia/packages/MLJBase/AkJde/src/machines.jl:482
[2] #fit!#98
@ ~/.julia/packages/MLJBase/AkJde/src/machines.jl:549 [inlined]
[3] top-level scope
@ REPL[120]:1
Here is my implementation (both a univariate and a multivariate version):
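# fit and transform are imported as well; without this, the definitions below
# create new, unrelated functions instead of adding methods to the MLJ interface
import MLJModelInterface: fit, transform, inverse_transform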
mutable struct UnivariateMinMaxScaler <: Unsupervised
end
function fit(transformer::UnivariateMinMaxScaler, verbosity::Int, v::AbstractVector{T}) where T<:Real
    min, max = minimum(v), maximum(v)
    fitresult = (min, max)
    cache = nothing
    report = NamedTuple()
    return fitresult, cache, report
end
function transform(transformer::UnivariateMinMaxScaler, fitresult, x::Real)
    min, max = fitresult
    # rescale so that min maps to 0 and max maps to 1
    return (x - min) / (max - min)
end
transform(transformer::UnivariateMinMaxScaler, fitresult, v) = [transform(transformer, fitresult, x) for x in v]
function inverse_transform(transformer::UnivariateMinMaxScaler, fitresult, y::Real)
    min, max = fitresult
    # undo the rescaling: map [0, 1] back to [min, max]
    return y * (max - min) + min
end
inverse_transform(transformer::UnivariateMinMaxScaler, fitresult, w) = [inverse_transform(transformer, fitresult, y) for y in w]
mutable struct MinMaxScaler <: Unsupervised
    features::Vector{Symbol}
end
MinMaxScaler(; features=Symbol[]) = MinMaxScaler(features)
function fit(transformer::MinMaxScaler, verbosity::Int, X::Any)
end
MLJ.fitted_params(::MinMaxScaler, fitresult) = (min_and_max_given_feature=fitresult,)
function transform(transformer::MinMaxScaler, fitresult, X)
    features_to_be_transformed = keys(fitresult)
    all_features = schema(X).names
end
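For reference, here is a minimal plain-Julia sketch of what min-max scaling and its inverse are expected to compute, independent of MLJ. The helper names `minmax_fit`, `minmax_transform` and `minmax_inverse` are hypothetical and only serve the illustration; the sketch assumes the target range is [0, 1] and that the data are not constant (otherwise `hi - lo` is zero).

```julia
# Hypothetical helpers, not part of MLJ: learn the data range of a vector.
minmax_fit(v::AbstractVector{<:Real}) = extrema(v)   # returns (minimum, maximum)

# Map a value from [lo, hi] onto [0, 1].
function minmax_transform(fitresult, x::Real)
    lo, hi = fitresult
    return (x - lo) / (hi - lo)
end

# Map a value from [0, 1] back onto [lo, hi].
function minmax_inverse(fitresult, y::Real)
    lo, hi = fitresult
    return y * (hi - lo) + lo
end

v = [2.0, 4.0, 10.0]
fitresult = minmax_fit(v)
scaled = [minmax_transform(fitresult, x) for x in v]
restored = [minmax_inverse(fitresult, y) for y in scaled]
```

A round trip through `minmax_transform` and `minmax_inverse` should give back the original values, which is a convenient property to check in a test for a PR.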
I get the same error using both the multivariate and the univariate one.
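One possible cause of a MethodError like this (an assumption, since the thread does not state the fix) is that a function defined without importing it from the interface package becomes a new, unrelated function rather than a new method of the interface's function. The sketch below reproduces that behaviour in plain Julia; the modules `Interface`, `WithoutImport` and `WithImport` and the function `train` are hypothetical stand-ins, not MLJ code.

```julia
module Interface               # hypothetical stand-in for an interface package
    function fit end           # generic function with no methods yet
    # stand-in for the internal call made when a model is trained
    train(model, X) = fit(model, 0, X)
end

module WithoutImport
    struct Scaler end
    # Creates a brand-new function WithoutImport.fit, unrelated to Interface.fit
    fit(::Scaler, verbosity::Int, X) = extrema(X)
end

module WithImport
    import ..Interface: fit    # extend the interface's function instead
    struct Scaler end
    fit(::Scaler, verbosity::Int, X) = extrema(X)
end

# Works: Interface.fit now has a method for WithImport.Scaler
ok = Interface.train(WithImport.Scaler(), [1.0, 2.0, 5.0])

# Fails: Interface.fit has no method for WithoutImport.Scaler
failed = try
    Interface.train(WithoutImport.Scaler(), [1.0, 2.0, 5.0])
    false
catch err
    err isa MethodError
end
```

If this is what happened here, importing `fit` and `transform` alongside `inverse_transform` (or qualifying the definitions, as in `MLJModelInterface.fit(...) = ...`) would be the corresponding change.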