Simplifying projection matrix rows #262
Comments
This is hard to support as of now because we currently sample oblique splits using a "density" hyperparameter that dictates how many non-zeros there are in the projection matrix. This sometimes makes certain projection rows either all 0's, or containing only a single +1/-1. In order to flip all the -1 rows to +1, we would have to add an extra computation, slowing down the overall training of trees. I think this would have to be handled downstream if other packages want a "simple" interpretation of those specific splits?
You would standardize the projection matrix only once. Basically, you'd iterate over the projection matrix row-wise, and if the "effective row length" is one (i.e. the row contains only one non-zero element), you'd set this one element to +1 (adjusting the associated split threshold accordingly).
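That one-time standardization pass could be sketched roughly as follows. This is a sketch only, assuming the projection matrix is a 2-D NumPy array with a parallel array of split thresholds; `standardize_projection_matrix` and its argument names are hypothetical, not sktree API:

```python
import numpy as np

def standardize_projection_matrix(proj, thresholds):
    """Flip axis-aligned projection rows that use -1 to the +1 form.

    proj:       (n_splits, n_features) projection matrix
    thresholds: (n_splits,) split thresholds

    Returns standardized copies plus a boolean mask of rows where the
    sign flip happened. Since `-x <= t` is the complement of `x <= -t`
    (up to the boundary value), callers that rewrite a split this way
    also need to swap the split's left/right children.
    """
    proj = proj.copy()
    thresholds = thresholds.copy()
    swapped = np.zeros(len(proj), dtype=bool)
    nnz = np.count_nonzero(proj, axis=1)
    for i in np.flatnonzero(nnz == 1):      # "effective row length" == 1
        j = np.flatnonzero(proj[i])[0]
        if proj[i, j] == -1.0:
            proj[i, j] = 1.0                # standardize to the +1 form
            thresholds[i] = -thresholds[i]  # threshold was negated too
            swapped[i] = True
    return proj, thresholds, swapped
```

Genuinely oblique rows (more than one non-zero) and all-zero rows pass through untouched, so only the degenerate axis-aligned cases pay for the flip.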
I tried to "invert" these negative splits during PMML conversion. Something like: if the input is [...]
But would it be possible to add a training parameter that lets the data scientist indicate that she is willing to accept a slight performance penalty during model training, in order to get much simpler oblique trees for later prediction and interpretation?
Right now, when you train a simplistic oblique decision tree classifier on the iris dataset, you get two types of splits per feature, e.g. both `feature <= threshold` and `-1 * feature <= -1 * threshold`. This makes the interpretation of oblique forests twice as hard as it could be.
@jovo any thoughts on how to best handle this?
Is your feature request related to a problem? Please describe.
While developing a PMML converter for oblique trees (see #255), I noticed that the projection matrix (as retrievable via the `ObliqueTree.proj_vecs` attribute) contains two types of "axis aligned split" definitions (i.e. projection matrix rows where only a single row element is set to a non-zero value). These two types are:

1. The single element is set to `1.0`. For example, `[0, 0, 1, 0]`.
2. The single element is set to `-1.0`. For example, `[0, -1, 0, 0]`.
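Both row types can be detected mechanically. A minimal sketch, using a hard-coded stand-in array in place of a fitted tree's `proj_vecs`:

```python
import numpy as np

# Stand-in for ObliqueTree.proj_vecs; real values come from a fitted tree.
proj_vecs = np.array([
    [0.0,  0.0, 1.0, 0.0],   # positive axis-aligned split (type 1)
    [0.0, -1.0, 0.0, 0.0],   # negated axis-aligned split (type 2)
    [0.5,  0.0, 0.0, 0.5],   # genuinely oblique split
])

for i, row in enumerate(proj_vecs):
    nz = np.flatnonzero(row)             # indices of non-zero elements
    if len(nz) == 1 and row[nz[0]] == 1.0:
        print(f"row {i}: positive axis-aligned split on feature {nz[0]}")
    elif len(nz) == 1 and row[nz[0]] == -1.0:
        print(f"row {i}: negated axis-aligned split on feature {nz[0]}")
    else:
        print(f"row {i}: oblique split")
```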
Describe the solution you'd like
I would propose that all axis aligned splits should be standardized to the positive/default axis aligned split representation.
Negating a split condition does not add any information to it. But it makes interpreting the resulting oblique tree more complicated, because the associated split threshold value also appears negated: `feature <= threshold` becomes `-1 * feature <= -1 * threshold`.
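The two forms can be compared numerically; a small sketch with made-up values, assuming the common convention that a sample goes left when `projection · x <= threshold`:

```python
import numpy as np

# Hypothetical split on feature 1 with threshold 2.5, in both forms.
row_pos, thr_pos = np.array([0.0, 1.0, 0.0, 0.0]), 2.5    # feature <= threshold
row_neg, thr_neg = np.array([0.0, -1.0, 0.0, 0.0]), -2.5  # -1*feature <= -1*threshold

X = np.array([[5.0, 1.0, 0.0, 3.0],
              [5.0, 2.5, 0.0, 3.0],
              [5.0, 4.0, 0.0, 3.0]])

goes_left_pos = X @ row_pos <= thr_pos   #  x1 <=  2.5
goes_left_neg = X @ row_neg <= thr_neg   # -x1 <= -2.5, i.e. x1 >= 2.5
```

Note that negating both sides flips the inequality, so the two forms send samples to opposite children (agreeing only on the boundary value); a converter that rewrites the negated form into the positive one therefore also has to account for the swapped left/right subtrees.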
In other words, the algorithm should not multiply standalone feature values with `-1` during training. It should keep them as-is.

Describe alternatives you've considered
The current behaviour (SkTree 0.7.2) is okay, but the resulting oblique trees are unnecessarily complicated.