SHAP values comparison

Introduction

This benchmark evaluates how fast different gradient boosting libraries compute SHAP values. The original paper on SHAP values can be found here and the official implementation here.

Let us briefly review the asymptotic complexity for each library.

CatBoost:

where AverageFeatureCount is the average number of features used in a single tree, taken over all trees.

XGBoost and LightGBM:

Since we are interested in large-scale datasets, the most important values here are TreesCount and DocsCount. The factor multiplying TreesCount * DocsCount in CatBoost (TreeDepth + AverageFeatureCount) is much smaller than in XGBoost and LightGBM (LeavesCount * TreeDepth^2). So the larger the dataset, the larger the performance gain with CatBoost.
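Written out, the factors quoted above give the following costs for the part of the computation that scales with the number of documents. This is a reconstruction from the prose only: the short symbols T, N, D, L and F are introduced here for TreesCount, DocsCount, TreeDepth, LeavesCount and AverageFeatureCount, and any CatBoost cost that does not depend on DocsCount is left out because the text does not state it.

$$
\text{CatBoost: } O\big(T \cdot N \cdot (D + F)\big), \qquad \text{XGBoost, LightGBM: } O\big(T \cdot N \cdot L \cdot D^{2}\big)
$$

Under these assumptions the per-document ratio between the two is $L \cdot D^{2} / (D + F)$, which grows quickly with depth because L can be as large as $2^{D}$ for a full binary tree.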

For small datasets (when LeavesCount > DocsCount, for example when computing SHAP values for a single document) we use a direct algorithm similar to the one in XGBoost. This behavior can be selected with the shap_mode option of get_feature_importance.
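The rule of thumb above could be applied by hand roughly as follows. This is a minimal sketch, not the library's internal logic: `pick_shap_mode` and `shap_values` are hypothetical helpers, and the mode names "NoPreCalc" and "UsePreCalc" are assumed to be the values accepted by `shap_mode` (check the CatBoost documentation for your version). The `model` and `pool` arguments are assumed to be a trained CatBoost model and a `catboost.Pool`.

```python
def pick_shap_mode(leaves_count: int, docs_count: int) -> str:
    """Pick a shap_mode value from the LeavesCount > DocsCount rule of thumb."""
    # Assumed mode names: "NoPreCalc" = direct algorithm without per-leaf
    # precomputation, "UsePreCalc" = precompute per leaf, then look up.
    return "NoPreCalc" if leaves_count > docs_count else "UsePreCalc"


def shap_values(model, pool, leaves_count: int, docs_count: int):
    """Request SHAP values from a trained CatBoost model with an explicit mode.

    model: trained catboost model (assumed), pool: catboost.Pool (assumed).
    """
    mode = pick_shap_mode(leaves_count, docs_count)
    return model.get_feature_importance(
        data=pool,
        type="ShapValues",
        shap_mode=mode,
    )
```

For a single document and trees of depth 10 (up to 1024 leaves), `pick_shap_mode(1024, 1)` selects the direct algorithm. Leaving `shap_mode` unset lets the library decide on its own.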

Experiment infrastructure:

  • GPU: Titan X Pascal (used only for training)
  • CPU: Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz

We trained models on GPU but all evaluations were done on CPU.

Parameters

We ran experiments with different tree depths and test set sizes for each library. The max bin parameter was set to 128; all other parameters were left at each library's defaults.
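A grid like this could be enumerated as sketched below. The depths and test sizes are the ones that appear in the results table. The per-library parameter names (`depth` and `border_count` for CatBoost, `max_depth` and `max_bin` for LightGBM and XGBoost) are an assumption about how the settings map onto each library, not the benchmark's actual configuration.

```python
from itertools import product

DEPTHS = [2, 4, 6, 8, 10]          # depths listed in the results table
TEST_SIZES = [1000, 5000, 10000]   # test sizes listed in the results table
MAX_BIN = 128                      # the only non-default setting mentioned


def library_params(depth: int) -> dict:
    """Map one depth onto per-library keyword arguments (assumed names)."""
    return {
        "catboost": {"depth": depth, "border_count": MAX_BIN},
        "lightgbm": {"max_depth": depth, "max_bin": MAX_BIN},
        "xgboost": {"max_depth": depth, "max_bin": MAX_BIN},
    }


def experiment_grid():
    """Yield (depth, test_size, params) for every combination in the grid."""
    for depth, test_size in product(DEPTHS, TEST_SIZES):
        yield depth, test_size, library_params(depth)
```

Iterating over `experiment_grid()` yields 15 configurations, one per row of the results table.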

Dataset

We used the Epsilon dataset (400K samples, 2000 features) for this benchmark.

Results

Times in the table are given in seconds and do not include data preprocessing.

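| depth | test size | catboost | lightgbm | xgboost |
|------:|----------:|---------:|---------:|--------:|
| 2 | 1000 | 0.311 | 0.090 | 0.112 |
| 2 | 5000 | 1.171 | 0.284 | 0.241 |
| 2 | 10000 | 2.048 | 0.621 | 0.509 |
| 4 | 1000 | 0.281 | 0.578 | 0.300 |
| 4 | 5000 | 1.081 | 2.094 | 0.931 |
| 4 | 10000 | 2.263 | 4.291 | 1.935 |
| 6 | 1000 | 0.464 | 4.159 | 1.468 |
| 6 | 5000 | 1.319 | 20.624 | 6.498 |
| 6 | 10000 | 2.396 | 42.788 | 12.981 |
| 8 | 1000 | 4.918 | 23.844 | 7.847 |
| 8 | 5000 | 5.807 | 118.552 | 38.992 |
| 8 | 10000 | 7.078 | 240.614 | 77.883 |
| 10 | 1000 | 93.152 | 119.527 | 30.872 |
| 10 | 5000 | 95.049 | 601.251 | 153.408 |
| 10 | 10000 | 95.680 | 1189.685 | 306.529 |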
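To read the table as relative speedups, the timings can be divided by the CatBoost column. The sketch below copies a few rows from the table; `RESULTS` and `speedup_over` are illustrative names, not part of the benchmark code.

```python
# (depth, test size) -> (catboost, lightgbm, xgboost) times in seconds,
# copied from the results table.
RESULTS = {
    (2, 10000): (2.048, 0.621, 0.509),
    (6, 10000): (2.396, 42.788, 12.981),
    (8, 10000): (7.078, 240.614, 77.883),
    (10, 1000): (93.152, 119.527, 30.872),
    (10, 10000): (95.680, 1189.685, 306.529),
}


def speedup_over(results: dict, key: tuple) -> dict:
    """Return how many times slower each library is than CatBoost.

    A value above 1 means CatBoost was faster for that configuration.
    """
    catboost, lightgbm, xgboost = results[key]
    return {"lightgbm": lightgbm / catboost, "xgboost": xgboost / catboost}


for key in sorted(RESULTS):
    ratios = speedup_over(RESULTS, key)
    print(key, {name: round(value, 2) for name, value in ratios.items()})
```

On these rows the ratios exceed 1 at depths 6 and 8, and at depth 10 with 10000 documents. They fall below 1 at depth 2, and for XGBoost at depth 10 with 1000 documents, where the CatBoost time barely changes with test size.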

We also compared data preprocessing time for each test size (average over 5 runs, in seconds).

| test size | catboost | lightgbm | xgboost |
|----------:|---------:|---------:|--------:|
| 1000 | 0.069 | 0.002 | 0.011 |
| 5000 | 0.349 | 0.001 | 0.047 |
| 10000 | 0.770 | 0.001 | 0.089 |
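Keeping preprocessing time separate from SHAP computation time, as the two tables do, can be done with a small timing harness like the one below. It is a sketch under assumptions: `preprocess` and `compute_shap` are hypothetical callables standing in for a library's data-conversion step and its SHAP call, and averaging both phases over 5 runs is a choice made here (the text only states 5 runs for preprocessing).

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def benchmark(preprocess, compute_shap, raw_data, runs: int = 5):
    """Return (mean preprocessing time, mean SHAP time) in seconds."""
    prep_times, shap_times = [], []
    for _ in range(runs):
        # Phase 1: convert raw data into the library's own data structure.
        prepared, t_prep = timed(preprocess, raw_data)
        # Phase 2: compute SHAP values on the already prepared data.
        _, t_shap = timed(compute_shap, prepared)
        prep_times.append(t_prep)
        shap_times.append(t_shap)
    return sum(prep_times) / runs, sum(shap_times) / runs
```

For CatBoost, `preprocess` might build a `Pool` from the test matrix and `compute_shap` might call `get_feature_importance` on it; the other libraries would plug in their own equivalents.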