Support for Scikit-Survival models that is compatible with Sklearn? #174
Comments
I'm going to explore your earlier XGBoost example a bit in order to gain a better understanding of the state of the art in survival analysis. The fundamental problem here is that "survival" appears to be a different endpoint than "regression". The PMML specification does not provide a dedicated "survival" mining function type: https://dmg.org/pmml/v4-4-1/GeneralStructure.html#xsdType_MINING-FUNCTION The obvious fix would be to define a new mining function type ourselves. I guess it's safe to say today that it's not reasonable to count on the Data Mining Group's help here, because they're largely non-operational (I'm still waiting to receive initial feedback on some feature requests that I posted to them 1+ year ago).
The JPMML-SkLearn library already provides a PMML converter for the RandomForest class. RandomSurvivalForest and RandomForest should use identical tree ensemble data structures. Therefore, it should be possible to build a PMML converter for RandomSurvivalForest by simply applying some post-processing to RandomForest predictions.
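One way to picture such a post-processing step is sketched below. It assumes that every tree returns, for a given sample, a cumulative hazard curve evaluated on a shared time grid, and that the ensemble prediction is the element-wise mean of those curves. The function name and the numbers are hypothetical, and this is not taken from JPMML-SkLearn or scikit-survival code.

```python
def average_step_functions(per_tree_values):
    """Element-wise mean of per-tree curves evaluated on a shared time grid."""
    n_trees = len(per_tree_values)
    if n_trees == 0:
        raise ValueError("at least one tree is required")
    n_times = len(per_tree_values[0])
    if any(len(values) != n_times for values in per_tree_values):
        raise ValueError("all trees must use the same time grid")
    # zip(*...) groups the values that belong to the same time point
    return [sum(column) / n_trees for column in zip(*per_tree_values)]


# Hypothetical cumulative hazard values from the leaves reached in three trees
tree_chfs = [
    [0.0, 0.5, 1.0],
    [0.0, 1.0, 2.0],
    [0.0, 0.0, 3.0],
]
ensemble_chf = average_step_functions(tree_chfs)
print(ensemble_chf)
```

Under this assumption the tree structures themselves stay untouched; only the leaf payload and the aggregation differ from plain regression.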
The PMML specification currently defines a "survival" endpoint for linear models (jump to the "Cox Regression Model Explanation and Examples" section). This approach should be generalizable to other model types (e.g. decision tree ensembles).
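For orientation, here is a minimal sketch of the textbook proportional hazards formula that a Cox-style "survival" endpoint is built on: S(t | x) = exp(-H0(t) * exp(x · beta)). The function and parameter names are hypothetical, and the exact procedure in the PMML specification may differ in details (for example in how reference values are handled), so treat this as an illustration only.

```python
import math


def cox_survival(baseline_cum_hazard, coefficients, features):
    """Survival probability at one time point under proportional hazards.

    baseline_cum_hazard: H0(t), the baseline cumulative hazard at time t
    coefficients:        regression coefficients (beta)
    features:            feature values of one sample (x)
    """
    linear_predictor = sum(b * x for b, x in zip(coefficients, features))
    return math.exp(-baseline_cum_hazard * math.exp(linear_predictor))


# Hypothetical numbers: the linear predictor is 0.5*2.0 - 0.25*4.0 = 0.0,
# so the sample follows the baseline survival curve exactly.
s = cox_survival(0.2, [0.5, -0.25], [2.0, 4.0])
print(s)
```

Generalizing to tree ensembles would mean replacing the linear predictor with whatever quantity the ensemble produces, while keeping the final transformation to a survival probability.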
I found that an older version of the R package "pmml" can export a Random Survival Forest constructed by an old version of the "randomForestSRC" package. I'm not sure why later versions of the pmml R package removed this function. I can help with Python and R code related to survival analysis, but I'm not familiar with the PMML format...
install.packages("remotes") data(veteran)
This converter was using some proprietary super-hackish way of encoding the "survival" transformation. Basically, it was a tool for enriching the standard
I'm just saying that it might be worthwhile to take some time and design a proper and future-proof extension to the latest PMML standard.
When speaking about
Is it possible to export models trained using Scikit-Survival (sksurv)?
This is the repo for sksurv: https://github.com/sebp/scikit-survival
sksurv contains a RandomSurvivalForest algorithm which extends RandomForest to right-censored survival data.
In standard RandomForest, the regression target y is a number, but in survival data, the labels are in the form of [time, event_indicator]. If event_indicator == 1, then time is the same as y (the event is observed); however, when event_indicator == 0, we only know that y > time (the event is not observed up to the observed time).
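To make the label semantics concrete, here is a small sketch in plain Python. The helper name target_bounds and the sample values are hypothetical; the sketch only restates the rule above as bounds on the true target y.

```python
import math


def target_bounds(time, event_indicator):
    """Return (lower, upper) bounds on the true event time y."""
    if event_indicator == 1:
        # Event observed: y is known exactly.
        return (time, time)
    # Right-censored: we only know y > time (the lower bound is exclusive).
    return (time, math.inf)


# Hypothetical labels in the form [time, event_indicator]
labels = [(5.0, 1), (8.0, 0), (12.5, 1)]
bounds = [target_bounds(t, e) for t, e in labels]
print(bounds)
```

A plain regressor would treat the censored time 8.0 as the true target, which is why the event indicator has to travel with the time value.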
Any help would be appreciated!