Move internal shap algorithms into separate namespace.#11985
Move internal shap algorithms into separate namespace.#11985RAMitchell wants to merge 20 commits intodmlc:masterfrom
Conversation
tests/cpp/predictor/test_shap.cc
Outdated
There was a problem hiding this comment.
It is quite annoying to get a GBTreeModel out of a booster.
There was a problem hiding this comment.
Yeah, I would like to split up the concept of "model" and everything else like "optimizers/tree builders". But it might be too much for XGBoost at its current state.
There was a problem hiding this comment.
I know this is WIP, but let's keep the objective of avoiding code duplication in mind. I tried to find ways to unify the code between CSR DMatrix and the gradient index, but it got a bit messy. But I think unifying the code is worth some extra layers of complexity.
There was a problem hiding this comment.
Agreed. I did notice that and had planned to do something about it.
There was a problem hiding this comment.
I put the data accessors for predictor in a header so I could use them in shap.
|
I am having two major difficulties with this PR. The categorical recoding logic is complicated and I don't want to carry this through to the interpretability module. The other is to simply get the tree ensemble out of learner which seems somehow impossible in the current setup. |
|
I can look into simplifying the categorical features after holiday. |
There was a problem hiding this comment.
Pull request overview
This PR refactors SHAP-related implementation details out of the CPU/GPU predictors into a new internal xgboost::interpretability namespace, aiming to reduce duplication and better isolate interpretability code paths.
Changes:
- Introduces new internal SHAP dispatch API (
interpretability::ShapValues,ShapInteractionValues,ApproxFeatureImportance) with CPU and CUDA implementations. - Refactors CPU/GPU predictors to delegate SHAP/interaction contribution computation to the new interpretability layer.
- Extracts shared CPU/GPU data access utilities into dedicated headers and updates SHAP tests to use the new entry points (with reduced runtime).
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
tests/cpp/predictor/test_shap.cc |
Updates SHAP tests to call new interpretability SHAP entry points and reduces dataset sizes/iters. |
src/predictor/interpretability/shap.h |
Adds internal SHAP dispatch header (CPU/CUDA routing). |
src/predictor/interpretability/shap.cc |
Adds CPU implementations for SHAP values, interactions, and approximate importance. |
src/predictor/interpretability/shap.cu |
Adds CUDA implementations for SHAP values/interactions using GPUTreeShap. |
src/predictor/gpu_predictor.cu |
Removes inlined GPU SHAP logic and delegates to interpretability SHAP functions; reuses extracted GPU accessors. |
src/predictor/gpu_data_accessor.cuh |
New shared GPU sparse/ellpack access helpers (used by GPU predictor and SHAP). |
src/predictor/data_accessor.h |
New shared CPU batch-to-FVec access helpers (used by CPU predictor and CPU SHAP). |
src/predictor/cpu_predictor.cc |
Removes inlined CPU SHAP logic and delegates to interpretability SHAP functions; uses extracted accessors. |
src/gbm/gbtree.h |
Formatting-only adjustments. |
src/gbm/gbtree.cc |
Formatting-only adjustments. |
include/xgboost/predictor.h |
Macro formatting-only adjustment. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
# Conflicts: # src/predictor/cpu_predictor.cc # src/predictor/gpu_predictor.cu
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
tests/cpp/predictor/test_shap.cc
Outdated
There was a problem hiding this comment.
The tree_weights parameter is a pointer type (std::vector<float> const*), but the test is passing {} (an empty brace-initializer). This will not compile correctly. Pass nullptr instead to indicate no tree weights.
| interpretability::ShapInteractionValues(dmat->Ctx(), p_dmat.get(), &shap_interactions, *gbtree, 0, | ||
| {}, false); |
There was a problem hiding this comment.
The tree_weights parameter is a pointer type (std::vector<float> const*), but the test is passing {} (an empty brace-initializer). This will not compile correctly. Pass nullptr instead to indicate no tree weights.
No description provided.