Add `fit_transform`? #18
I think for transformers it would make sense to require exactly two methods. No need for a separate `fit`. The signature would be:

`fitted_transformer, Xout = fit_transform(transformer, Xin)`
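To make the proposed signature concrete, here is a minimal Python sketch of a transformer exposing only `fit_transform` and `transform` (the thread's snippets read as Julia; the `Standardizer` class and all names here are illustrative stand-ins, not this package's API):

```python
class Standardizer:
    """Hypothetical transformer: centers columns by their mean."""

    def fit_transform(self, X):
        # learn parameters and produce transformed output in one call,
        # returning (fitted_transformer, Xout) as in the proposed signature
        self.mean_ = [sum(col) / len(col) for col in zip(*X)]
        return self, self.transform(X)

    def transform(self, Xnew):
        # apply previously learned parameters to new data
        return [[x - m for x, m in zip(row, self.mean_)] for row in Xnew]


fitted, Xout = Standardizer().fit_transform([[1.0, 10.0], [3.0, 20.0]])
print(Xout)  # [[-1.0, -5.0], [1.0, 5.0]]
```

The appeal of this shape is that a transformer with no meaningful "fitting" step still fits the interface: it simply ignores the learned state.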
Thanks for that suggestion.

> No need for a separate `fit` implementation for transformers.

I dunno. It seems like a non-trivial complication to the API. [I know some have argued for just one "operation".]
Okay, here's a variation on your idea that doesn't require adding to the namespace. Each transformer implements one of the following two patterns.

Case 1: static (non-generalizing) transformers

```julia
fit(strategy, X) -> model  # storing `transformed_X` and any inspectable byproducts of algorithm
transform(model) -> model.transformed_X
```

with a convenience fallback

```julia
transform(strategy, X) = transform(fit(strategy, X))
```

Case 2: generalizing transformers

```julia
fit(strategy, X) -> model  # storing `transformed_X` and `learned_parameters` and any inspectable byproducts of algorithm
transform(model, Xnew) -> transformed_Xnew  # uses `model.learned_parameters`
```

with a convenience fallback

```julia
transform(strategy, X) = fit(strategy, X).transformed_X
```
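For concreteness, here is a hedged Python transliteration of the two cases above (the `Model` container, the scale-factor "strategy", and all function names are illustrative stand-ins, not this package's API):

```python
class Model:
    """Stand-in for the `model` returned by `fit`: stores fit outputs."""

    def __init__(self, transformed_X, learned_parameters=None):
        self.transformed_X = transformed_X
        self.learned_parameters = learned_parameters


def fit(strategy, X):
    # "strategy" is just a scale factor here, standing in for a real algorithm;
    # the model stores both the transformed data and the learned parameter
    return Model(transformed_X=[x * strategy for x in X],
                 learned_parameters=strategy)


# Case 1: static transformers -- transform takes only the model
def transform_static(model):
    return model.transformed_X


# Case 2: generalizing transformers -- transform reuses learned parameters on new data
def transform_generalizing(model, Xnew):
    return [x * model.learned_parameters for x in Xnew]


# convenience fallback, mirroring transform(strategy, X) = transform(fit(strategy, X))
def fit_then_transform(strategy, X):
    return transform_static(fit(strategy, X))


print(fit_then_transform(2, [1.0, 2.0]))             # [2.0, 4.0]
print(transform_generalizing(fit(2, [1.0]), [3.0]))  # [6.0]
```

In Julia the two `transform` arities would be methods of one function selected by dispatch; Python needs distinct names or an arity check, which is one reason the scheme is lighter in Julia.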
I'm not sure we'd want to keep a reference to an intermediate transformed data set in a trained transformer. That would prevent the garbage collector from freeing that memory as long as the pipeline is still around. It also feels conceptually a little muddy, but that's just a feeling that I haven't been able to put into more concrete terms yet. :)
Case 2: generalizing transformers

```julia
fit(strategy, X) -> model  # storing `learned_parameters` and any inspectable byproducts of algorithm
transform(model, Xnew) -> transformed_Xnew  # uses `model.learned_parameters`
```

with a convenience fallback

```julia
transform(strategy, X) = transform(fit(strategy, X), X)
```
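A quick Python sketch of this revised scheme, to show the memory point: the model keeps only `learned_parameters`, so the intermediate transformed data is free to be garbage-collected (all names are illustrative stand-ins, not this package's API):

```python
class Model:
    """Stand-in for the revised `model`: stores only learned parameters."""

    def __init__(self, learned_parameters):
        self.learned_parameters = learned_parameters


def fit(strategy, X):
    # learn a parameter (here: the mean), but do NOT store transformed data
    return Model(learned_parameters=sum(X) / len(X))


def transform(model, Xnew):
    # uses model.learned_parameters on new data
    return [x - model.learned_parameters for x in Xnew]


# convenience fallback: transform(strategy, X) = transform(fit(strategy, X), X)
def fit_then_transform(strategy, X):
    return transform(fit(strategy, X), X)


print(fit_then_transform(None, [1.0, 2.0, 3.0]))  # [-1.0, 0.0, 1.0]
```

The cost noted in the next comment is visible here: the fallback traverses `X` twice (once in `fit`, once in `transform`), so a fused single-pass `fit_transform` cannot be recovered from these two primitives alone.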
Hmm, that form doesn't allow for optimizations, but, as you said, maybe there aren't really that many fit-then-transform cases that get a large benefit from optimizations.
In #30 an implementation can explicitly overload
On dev, a learner can implement
See discussion at #16