Is there a way to pass a list of options for GridSearchCV to search over for individual transforms in the DataFrameMapper, like this:
# determine full param search space (need to get the params for the mapper parts in here somehow)
full_params = {
    'clf__alpha': [1e-2, 1e-3, 1e-4],
    'clf__loss': ['modified_huber', 'hinge'],
    'clf__penalty': ['l2', 'l1'],
    # now set the params for the datamapper part of the pipeline
    'mapper__features': [[
        ('Name', deepcopy(name_to_tfidf).set_params(name_vect__analyzer = ['char', 'char_wb'])),
        ('Ticket', deepcopy(ticket_to_tfidf).set_params(ticket_vect__analyzer = ['char', 'char_wb']))
    ]]
}
Ideally I'd like to cross-validate which params are best for the name_to_tfidf and ticket_to_tfidf DataFrameMapper pipelines.
But passing a list of options to set_params() like this gives me this error when I go to fit:
ValueError: ['char', 'char_wb'] is not a valid tokenization scheme/analyzer
# set up grid search
gs_clf = GridSearchCV(full_pipeline, full_params, n_jobs=-1)
And then:
# do the fit
gs_clf.fit(df,df['Survived'])
So I am able to cross-validate the clf params, but I'd also like to cross-validate some params within the transforms in the DataFrameMapper. I'm just not sure how to go about this.
Basically I was passing ['char', 'char_wb'] to this line, for example: ('Name',deepcopy(name_to_tfidf).set_params(name_vect__analyzer = ['char', 'char_wb'])),
I was hoping that GridSearchCV would then also consider those two values in the grid.
Apologies for posting this as an issue, but I feel like it could be a useful use case.
I'm just wondering whether something like what I'm trying to do is, or should be, possible.
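For context on why this fails: GridSearchCV only expands lists that appear as values in the param grid itself. A list handed to set_params() is stored as the analyzer value as-is, which is what the ValueError complains about. One possible workaround, sketched here under assumptions, is to make each entry of 'mapper__features' a complete feature list with a single analyzer already set, and to generate all combinations up front. The helper expand_feature_grid and the stand-in builder fake_tfidf below are hypothetical names for illustration, not part of sklearn-pandas or scikit-learn, and the sketch assumes 'mapper__features' is accepted as a grid key as in the snippet above.

```python
from itertools import product


def expand_feature_grid(column_options):
    """Return every candidate feature list for a 'mapper__features' grid entry.

    column_options maps a column name to (build, options), where
    build(option) returns a fresh transformer configured with that single
    option. This is a hypothetical helper, not a library function.
    """
    # One list of (column, build, option) choices per column.
    per_column = [
        [(col, build, opt) for opt in options]
        for col, (build, options) in column_options.items()
    ]
    # Each combination picks exactly one option per column.
    return [
        [(col, build(opt)) for col, build, opt in combo]
        for combo in product(*per_column)
    ]


def fake_tfidf(analyzer):
    # Stand-in for a real builder, so the sketch runs without scikit-learn.
    return {"analyzer": analyzer}


feature_candidates = expand_feature_grid({
    "Name": (fake_tfidf, ["char", "char_wb"]),
    "Ticket": (fake_tfidf, ["char", "char_wb"]),
})
print(len(feature_candidates))  # 2 options x 2 options = 4 candidate lists
```

With the real pipelines, the builder for 'Name' would presumably be something like lambda a: deepcopy(name_to_tfidf).set_params(name_vect__analyzer=a), and the result would go into the grid as 'mapper__features': feature_candidates. The number of candidates multiplies with every column and option added, so the search gets expensive quickly.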
If I set up a pipeline like: