Add "dummy" estimator classes #34

zhangjunqiang · 2018-01-03T09:13:23Z

Hello,
When I call the converter like this:
val oneHotPMML = ConverterUtil.toPMML(onehotSource.schema, oneHotModel)
I get the following error:

Exception in thread "main" java.lang.IllegalArgumentException: Expected a pipeline with one or more models, got a pipeline with zero models
	at com.netease.mail.yanxuan.rms.utils.ConverterUtil.toPMML(ConverterUtil.java:118)
	at com.netease.mail.yanxuan.rms.scala.nn.feature.FeatureModelExport$.main(FeatureModelExport.scala:29)
	at com.netease.mail.yanxuan.rms.scala.nn.feature.FeatureModelExport.main(FeatureModelExport.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

After debugging, I found the cause:
my model does not contain any ModelConverter.
Is it necessary to have a ModelConverter in my PipelineModel?


vruusmann · 2018-01-03T09:26:17Z

Is it necessary to have a ModelConverter in my PipelineModel?

Yes, this requirement is clearly communicated by the exception message.

If you want to export pipelines that consist mostly of feature transformations, then you should consider introducing a dummy (i.e. no-op) model into the pipeline. For example, in Scikit-Learn you can use the estimator types DummyRegressor and DummyClassifier for that purpose.

The model object is needed to define the "schema" of the pipeline, that is, what the input features and the output features are. Without the model object, the converter can only generate empty PMML documents.

zhangjunqiang · 2018-01-03T09:31:51Z

Thank you for your reply. Is there any dummy model in Spark MLlib? I use Spark ML for training.

vruusmann · 2018-01-03T09:43:38Z

Is there any dummy model in Spark MLlib?

Depending on your Apache Spark ML version, there may or may not be appropriate technical workarounds available.

For example, a potential solution:

1. Create a model-less Pipeline and fit it.
2. Take the fitted PipelineModel and "manually" append an appropriate org.apache.spark.ml.PredictionModel object instance to it. Please note that you would be dealing with a PredictionModel subclass here (representing a model that has been fitted), not with a Predictor subclass (representing a model that is yet to be fitted).
3. Design the "schema" of the above model to match the inputs and outputs of your feature transformation workflow.
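
The steps above could be sketched roughly as follows. This is only an illustration under several assumptions, not a confirmed JPMML-SparkML recipe: the toy `color` column, the stage choices and the object name `DummyModelSketch` are hypothetical, and a LinearRegression fitted on a constant label stands in for the "dummy" model (Spark is expected to return zero coefficients for a constant label). Rather than calling the PipelineModel constructor, which is not public, the sketch relies on the fact that fitting a Pipeline made up only of already fitted transformers does not refit them. Whether the converter accepts the result and keeps the zero-weight features is not confirmed by this thread.

```scala
import org.apache.spark.ml.{Pipeline, PipelineModel, PipelineStage, Transformer}
import org.apache.spark.ml.feature.{OneHotEncoder, StringIndexer}
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.lit

// Hypothetical sketch; all names and the toy data are assumptions.
object DummyModelSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("dummy-model-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val source = Seq("red", "green", "blue", "green").toDF("color")

    // Step 1: create a model-less pipeline and fit it.
    val indexer = new StringIndexer().setInputCol("color").setOutputCol("colorIndex")
    val encoder = new OneHotEncoder().setInputCol("colorIndex").setOutputCol("features")
    val featureModel: PipelineModel =
      new Pipeline().setStages(Array[PipelineStage](indexer, encoder)).fit(source)

    // Step 2: obtain a fitted PredictionModel. A regression on a constant
    // label is used here as a stand-in for a dummy (no-op) model.
    val withLabel = featureModel.transform(source).withColumn("label", lit(0.0))
    val dummyModel = new LinearRegression()
      .setFeaturesCol("features") // Step 3: input matches the transformer output
      .setLabelCol("label")
      .fit(withLabel)

    // Append the fitted model to the fitted stages. Every stage is already
    // a Transformer, so this fit() only assembles a new PipelineModel.
    val stages: Array[Transformer] = featureModel.stages :+ dummyModel
    val exportable: PipelineModel = new Pipeline().setStages(stages).fit(source)

    // exportable can now be passed to the converter call from the original post.
    println(s"Number of stages: ${exportable.stages.length}")

    spark.stop()
  }
}
```

The resulting PipelineModel ends with a model stage, which is what the exception message asks for.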

vruusmann · 2018-01-03T09:44:49Z

Someone could search the Apache Spark JIRA to see whether a feature request for dummy estimator classes already exists.

I wouldn't want to create and maintain these classes myself. But if absolutely necessary, I will do it.

vruusmann · 2018-01-03T17:39:17Z

Reopening, because I might want to provide some sort of easier workaround in the JPMML-SparkML library.

zhangjunqiang · 2018-01-05T09:31:54Z

@vruusmann That is so nice of you!

vruusmann changed the title ~~Expected a pipeline with one or more models, got a pipeline with zero models~~ Add "dummy" estimator classes Jan 3, 2018

zhangjunqiang closed this as completed Jan 3, 2018

vruusmann reopened this Jan 3, 2018
