Enhancement request: use xgboost as base learner #250
You would want to do at least two changes to your code:
This is an interesting experiment and I would love to see how it works out! Thanks for giving it a shot and sharing the results! |
Hi @avati Thanks for your prompt answer. I made the change to the code, in which the xgboost n_estimators= 1 while NGBoost n_estimators= 300. Unfortunately, I still get the same result. By any chance, do you have a Python code example on how to pass the xgboost model more like a Python constructor? Ivan |
Here's one way. Instead of:
do:
|
Hi @avati Thanks for the suggestion. Before pursuing more work with xgboost, I tried the following code:
learner = GradientBoostingRegressor(loss='ls', learning_rate=0.05, n_estimators=1, criterion='mse',
                                    max_depth=6, min_impurity_decrease=0, random_state=1969)
ngb = ngboost.NGBRegressor(Dist=ngboost.distns.Normal, Score=ngboost.scores.CRPScore, Base=learner)
y_preds = ngb.predict(x_validation)
It gave a reasonable result which could be improved by playing with the hyperparameters. This shows the strength of NGBoost in taking learners from the scikit-learn library. On the other hand, xgboost (although I am using its scikit-learn API) does not seem to work well with NGBoost - as you well explained. Could it be possible that the xgboost API library is missing something required by NGBoost? Do you have more suggestions? Ivan |
The same suggestion as in my previous comment: use a learner with a 'lambda' as shown, whether it is for XGB or GBR. |
Hi @avati Thanks for the suggestion. I tried the command with lambda, and get this message:
Cannot clone object '<function <lambda> at 0x000001F05A98A840>' (type <class 'function'>): it does not seem to be a scikit-learn estimator as it does not implement a 'get_params' method
I am pretty sure that I am missing something on how to implement this approach. Could you provide a more detailed code example? Ivan |
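[Editor's note] The error above comes from scikit-learn's `clone`, which NGBoost applies to whatever is passed as `Base`: `clone` requires an object exposing `get_params()`, which a bare lambda does not have, while any scikit-learn-style estimator instance does. A minimal sketch of the distinction (the `DecisionTreeRegressor` and its parameters are purely illustrative):

```python
from sklearn.base import clone
from sklearn.tree import DecisionTreeRegressor

# An estimator instance can be cloned: it implements get_params()/set_params()
learner = DecisionTreeRegressor(max_depth=6, random_state=1969)
fresh = clone(learner)
print(type(fresh).__name__)  # DecisionTreeRegressor

# A bare lambda cannot: clone() raises TypeError with the message quoted above
try:
    clone(lambda: DecisionTreeRegressor(max_depth=6))
except TypeError as err:
    print("clone failed:", err)
```

This suggests passing a constructed estimator instance as `Base`, rather than a function that builds one.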
I also want to use LightGBM as base learner and I have the same issue as @ivan-marroquin. Could you provide some advice? |
Hi @caiquanyou I think I found a way to run xgboost with ngboost (and perhaps it applies as well to lightgbm). I found this publication: and the code source used in this publication can be found at: To make it work with xgboost, it is required to set the number of estimators (along with the number of trees used in ngboost). I have xgboost 1.1.0 and ngboost 0.3.10. I used the toy example from ngboost (adapted to work with xgboost):
import numpy as np
if __name__ == '__main__':
Note that xgboost will raise the same UserWarning about sliced np.ndarray subsets that is quoted in the original issue. I don't know whether this issue may influence the quality of the result. Let me know what you find on your side. Hope this helps, Ivan |
That warning shouldn't influence the predictions, but it will increase the RAM consumption of the computation. I'd be interested in hearing more experiences with using other packages as the Base learner. |
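[Editor's note] For background on that warning: NumPy's integer-array ("fancy") indexing always materializes a new copy, while basic slicing returns a view, so repeatedly taking row subsets of a large ndarray (as a boosting loop does each iteration) multiplies memory use without changing the numbers. A small sketch of the difference:

```python
import numpy as np

X = np.arange(12.0).reshape(6, 2).copy()  # X owns its memory

view = X[0:3]                  # basic slice: a view sharing X's memory
assert view.base is X

sub = X[np.array([0, 2, 4])]   # integer-array indexing: a fresh copy
assert sub.base is None

# Mutating the copy leaves X untouched
sub[0, 0] = -1.0
assert X[0, 0] == 0.0
```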
In case it's useful, I've written a "native" xgboost version of ngboost, implemented in the xgboost scikit-learn API. |
Exciting! Looking forward to checking it out! |
This is fantastic @CDonnerer. If you're willing, I'd love to have features like these ported into the core NGBoost library. We've had previous discussions on how to make ngboost faster and easier to develop that you would be more than welcome to contribute to. |
Really cool library! Related question: does xgboost-distribution offer a GPU implementation like xgboost, or nah? I'm assuming the relative performance numbers are for runs on the CPU, right? |
@alejandroschuler Thanks! Sure, I'll have a look at those discussions, there might be options to port those features across in a generic way. @astrogilda No GPU support for xgboost-distribution yet, indeed, the performance numbers refer to CPU runs. |
@CDonnerer - just want to say that's a fantastic library you've written. I don't know how practical it would be to port the features over to NGBoost as @alejandroschuler suggested, and the coding is way over my head. If that's at all possible, as a user, that would be a great solution (rather than having forked development across two different probabilistic libraries). This would be especially helpful for the purposes of adding additional distribution support in a consistent way. |
@CDonnerer seems like there is quite some overlap with XGBoostLSS, an approach I have developed in 2019 |
@StatMixedML thanks for sharing the link of your approach! |
Hi all,
I have Python 3.6.5 with xgboost 1.1.0 and ngboost 0.3.10
So, when I train an NGBRegressor with xgboost as base learner, I get the following warning message:
c:\temp\python\python3.6.5\lib\site-packages\xgboost\core.py:445: UserWarning: Use subset (sliced data) of np.ndarray is not recommended because it will generate extra copies and increase memory consumption
  "memory consumption")
which may be the source of the poor result shown in the plot on the left of the attached image.
Is it possible to use xgboost as a base learner? Please advise.
The source code is as follows:
import numpy as np
import xgboost as xgb
import ngboost
from sklearn.tree import DecisionTreeRegressor
from sklearn.datasets import load_boston
from sklearn.metrics import median_absolute_error
from sklearn.model_selection import train_test_split
import multiprocessing
if __name__ == '__main__':
    cpu_count = 2 if (multiprocessing.cpu_count() < 4) else (multiprocessing.cpu_count() - 2)

    # Using xgboost with ngboost
    learner = xgb.XGBRegressor(max_depth=6, n_estimators=300, verbosity=1, objective='reg:squarederror',
                               booster='gbtree', tree_method='exact', n_jobs=cpu_count, learning_rate=0.05,
                               gamma=0.15, reg_alpha=0.20, reg_lambda=0.50, random_state=1969)
comparison_xgboost-ngboost_against_only_ngboost.zip