-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Elex 4549 add model flexibility #109
base: develop
Are you sure you want to change the base?
Conversation
…CV branch for testing purposes
@@ -5,7 +5,7 @@ | |||
|
|||
INSTALL_REQUIRES = ( | |||
"click>=8.1", | |||
"elex-solver>=2.1.1", | |||
"elex-solver @ git+https://github.com/washingtonpost/elex-solver.git@updating-residuals", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed this temporarily so we can test / run the tests here in github 🎉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty good! 🎉 🎉 I have some questions here and on the elex-solver
PR, and I was also wondering if you could add documentation about the new solver options to the README? 🤔 🎉
model.fit(x, y, weights=weights, taus=0.5, **model_kwargs) | ||
else: | ||
raise BootstrapElectionModelException( | ||
f"Please use defined model type 'QR' or 'OLS', input is: {self.model_type}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is such a polite and helpful error message! 😄 🎉
model_kwargs = {"fit_intercept": True, "regularize_intercept": False, "n_feat_ignore_reg": 1} | ||
if self.model_type == "OLS": | ||
# if we run OLS we use cross validation to find the optimal lambda | ||
# TODO: why do we not do this for other models also |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lol wait, why do we not do this for other models also? 🤔
Is this a "how can I write code to make this happen" question or a theoretical question (or both)? 🤔
y_train, | ||
weights_train, | ||
aggregate_indicator_train, | ||
K=5, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, how did you pick the default of 5 here (and elsewhere)? I mean I've done (and seen) 3-fold, 5-fold, 10-fold CV but usually my choice there depends on the number of examples in the data set and how much computing power I have 🤔
|
||
# we cannot just apply ols_y_B/old_z_B to the test units because that would be missing | ||
# our contest level random effect | ||
# TODO: DO WE NEVER DO ANYTHING WITH THE OUTPUT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔
Description
Instead of being stuck with OLS as our base model for the bootstrap model, this PR gives us flexibility to use different models. We were wed to OLS because we had a quick way of computing the leave-one-out-residual, which we needed for the bootstrap, since the training residual would be biased towards zero. We now use k-fold cross validation to get an estimate for the leave-one-out residual (k-fold residual will be greater than or equal to loo residual, so this is if anything more conservative). We also add the ability to play with OLS vs. QR models. In the future, we may expand the models that are being used here.
This needs this branch of elex-solver, which allows us to calculate the k-fold residual. Tests will fail until this branch is merged/released.
Jira Ticket
https://arcpublishing.atlassian.net/browse/ELEX-4549
Test Steps
vs