I am using po("smote") in a pipeline to balance my dataset, and the task includes a strata variable. tsk$strata is set to use the feature "wt" for stratification, and the target variable is vs (binary: 0/1).
Here's the code snippet I am using:
pob = po("encode") %>>% po("smote")
pro_tsk = pob$train(train_tsk)[[1]]
However, I encountered the following error:
Cannot rbind data to task 'classification', missing the following mandatory columns: ..stratum_wt
This happened in PipeOp traindata's $train()
The issue seems to be that the strata variable goes missing during pipeline execution, but I have not been able to fix it.
Thank you in advance for your help! 😊
invain1218 changed the title from "Imbalance Data when I add strata" to "Error: missing the following mandatory columns: ..stratum_wt" on Nov 18, 2024
Sorry for the late response. The SMOTE pipeop creates new synthetic samples, but it does not know how to fill the "strata" column for these samples, which is why the rbind fails on the missing ..stratum_wt column. If you want to use SMOTE, you need to either remove the "strata" column altogether or convert it into a feature column using po("colroles") (you can convert it back to strata after the SMOTE pipeop). po("colroles") now has the new_role_direct argument, so you can use po("colroles", new_role_direct = list(stratum = NULL)) to remove the stratum column.
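For reference, a minimal sketch of the first option (dropping the stratum role before SMOTE). The task construction here is an assumption, since the original post does not show how train_tsk was built: it uses mtcars with vs as the target and calls add_strata("wt"), which is a guess at how the ..stratum_wt column came about. It also assumes an mlr3pipelines version that already has new_role_direct, and that the smotefamily package is installed for po("smote").

```r
library(mlr3)
library(mlr3pipelines)  # po("smote") additionally needs the smotefamily package

# ASSUMPTION: stand-in for the task from the question (mtcars, target vs, strata from wt)
dat = mtcars
dat$vs = factor(dat$vs)
train_tsk = as_task_classif(dat, target = "vs", id = "classification")
train_tsk$add_strata("wt")  # bins wt into a stratum column

# Drop the stratum role first, so SMOTE has no stratum column to fill for synthetic rows
pob = po("colroles", new_role_direct = list(stratum = NULL)) %>>%
  po("encode") %>>%
  po("smote")

pro_tsk = pob$train(train_tsk)[[1]]
table(pro_tsk$truth())  # class counts after oversampling
```

The resulting task no longer carries strata, so any later stratified resampling would need the stratum role to be assigned again on real (non-synthetic) information.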