Different Scale of SHAP values for Approx vs. ExactSHAP #31

Original post by @simonschoe:

Hi there,
great work with the package first and foremost.
Quick question: does the ApproxSHAP method scale or standardize the SHAP values in any way? When I create global feature attribution rankings for a GBM using ApproxSHAP as well as TreeSHAP, the SHAP values end up on substantially different scales. For example, with ApproxSHAP the mean absolute values are in the range 0.01-0.18, while with TreeSHAP they lie between 1 and 14.
Thanks in advance!

Comments
Hi @simonschoe, it would be helpful if you could post a reproducible example for me to run on my end. In general, the approximate method used by fastshap depends on the variance of the feature columns. Some problems will require more Monte Carlo reps (say, …
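For what it's worth, here is a minimal, self-contained sketch of the kind of comparison being discussed; the data, model, and nsim values are illustrative assumptions, not anything from this thread, and it assumes a fastshap version where `exact = TRUE` is supported for "lm" objects (Linear SHAP):

```r
library(fastshap)

# Simulated data with feature columns of very different variance
set.seed(101)
X <- data.frame(x1 = rnorm(500), x2 = rnorm(500, sd = 10))
y <- 2 * X$x1 - X$x2 + rnorm(500)
fit <- lm(y ~ ., data = cbind(X, y = y))

pfun <- function(object, newdata) predict(object, newdata = newdata)

# Exact (Linear SHAP) vs. Monte Carlo approximations with few/many reps
ex_exact <- explain(fit, X = X, exact = TRUE)
ex_few   <- explain(fit, X = X, pred_wrapper = pfun, nsim = 1)
ex_many  <- explain(fit, X = X, pred_wrapper = pfun, nsim = 1000)

# Mean absolute SHAP values; the nsim = 1000 run should sit much closer
# to the exact values than the nsim = 1 run
colMeans(abs(as.data.frame(ex_exact)))
colMeans(abs(as.data.frame(ex_few)))
colMeans(abs(as.data.frame(ex_many)))
```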
Hi @bgreenwell, thanks for your reply, and sorry for the delay. Unfortunately, it is difficult for me to provide a reproducible example, since the entire workflow is predicated on proprietary data. What I can provide, however, are the two snippets that run TreeSHAP and ApproxSHAP on my machine, respectively:

```r
# Exact TreeSHAP on the fitted engine
shap_values_gbm <- fastshap::explain(
  extract_fit_engine(final_fit_gbm),
  X = X_gbm,
  pred_wrapper = function(object, newdata) predict(object, newdata),
  exact = TRUE,
  newdata = NULL,
  .parallel = TRUE
)

# Monte Carlo approximation with 1,000 replications
shap_values_gbm2 <- fastshap::explain(
  extract_fit_engine(final_fit_gbm),
  X = X_gbm,
  pred_wrapper = function(object, newdata) predict(object, newdata),
  nsim = 1000, adjust = TRUE,
  newdata = NULL,
  .parallel = TRUE
)
```

The resulting top-10 rankings look as follows (the code in between the computation of …).

[top-10 ranking plots from the original issue not reproduced here]

Hope that this may provide some context as to why/how the difference occurs?

Best, Simon
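Since the ranking plots are not reproduced here, the following hypothetical helper (top10 is not a fastshap function) illustrates one common way such a ranking is derived from a matrix of SHAP values, assuming shap_values_gbm and shap_values_gbm2 as computed above:

```r
# Hypothetical helper: rank features by mean absolute SHAP value
top10 <- function(shap) {
  head(sort(colMeans(abs(as.data.frame(shap))), decreasing = TRUE), 10)
}

top10(shap_values_gbm)   # TreeSHAP ranking
top10(shap_values_gbm2)  # ApproxSHAP ranking
```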
@simonschoe The only thing I can think of is the scale on which the Shapley values are being returned in each approach. For example, for a binary outcome in a GLM, Shapley values could be returned on either the link or the response scale. The …
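One way to rule the scale issue in or out, assuming the underlying engine is xgboost (an assumption; the thread never names the engine), is to pin the Monte Carlo wrapper to the same margin (link) scale that xgboost's TreeSHAP contributions are reported on:

```r
# Sketch only: assumes extract_fit_engine(final_fit_gbm) returns an
# xgb.Booster and that X_gbm is a numeric matrix xgboost can predict on.
# outputmargin = TRUE returns raw margin scores rather than probabilities,
# matching the scale of xgboost's TreeSHAP contributions.
pfun_margin <- function(object, newdata) {
  predict(object, newdata = newdata, outputmargin = TRUE)
}

shap_values_margin <- fastshap::explain(
  extract_fit_engine(final_fit_gbm),
  X = X_gbm,
  pred_wrapper = pfun_margin,
  nsim = 1000, adjust = TRUE
)
```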
@bgreenwell But if it were simply a scaling issue, shouldn't I at least obtain somewhat similar rank orderings and the same signs for the effects? In the example above, despite running 1,000 simulations, the two rankings are still very different from each other...
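A quick way to quantify that disagreement (a sketch, assuming shap_values_gbm and shap_values_gbm2 as computed above, with identical column order) is to correlate the two importance vectors and check sign agreement directly:

```r
# Rank agreement of the global importances (scale-invariant)
imp1 <- colMeans(abs(as.data.frame(shap_values_gbm)))   # TreeSHAP
imp2 <- colMeans(abs(as.data.frame(shap_values_gbm2)))  # ApproxSHAP
cor(imp1, imp2, method = "spearman")

# Proportion of per-observation attributions that agree in sign
mean(sign(as.matrix(shap_values_gbm)) == sign(as.matrix(shap_values_gbm2)))
```

If the Spearman correlation is high, the two methods agree on ordering and only the scale differs; if it is low, the discrepancy goes beyond a simple rescaling.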