Add "dummy" estimator classes #34

Open
zhangjunqiang opened this issue Jan 3, 2018 · 6 comments

Comments

@zhangjunqiang
zhangjunqiang commented Jan 3, 2018

Hello,

When I call the converter like this:

val oneHotPMML = ConverterUtil.toPMML(onehotSource.schema, oneHotModel)

I get the following error:

Exception in thread "main" java.lang.IllegalArgumentException: Expected a pipeline with one or more models, got a pipeline with zero models
	at com.netease.mail.yanxuan.rms.utils.ConverterUtil.toPMML(ConverterUtil.java:118)
	at com.netease.mail.yanxuan.rms.scala.nn.feature.FeatureModelExport$.main(FeatureModelExport.scala:29)
	at com.netease.mail.yanxuan.rms.scala.nn.feature.FeatureModelExport.main(FeatureModelExport.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

After debugging, I found the reason: there is no ModelConverter for any stage in my model.
Is it necessary to have a ModelConverter in my PipelineModel?

@vruusmann
Member

> Is it necessary to have a ModelConverter in my PipelineModel?

Yes, this requirement is clearly communicated by the exception message.

If you want to export pipelines that consist mostly of feature transformations, then you should consider introducing a dummy (i.e. no-op) model into the pipeline. For example, in Scikit-Learn you can use the estimator types DummyRegressor and DummyClassifier for that purpose.

The model object is needed to define the "schema" of the pipeline, that is, what the input features and the output features are. Without the model object the converter can only generate empty PMML documents.
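
As a rough pre-check before calling the converter, one can test whether a fitted pipeline contains any prediction model stage at all. The sketch below is an illustration only: the object and method names are hypothetical, and it approximates the requirement by looking for `org.apache.spark.ml.PredictionModel` subclasses. The converter may accept other model types as well (clustering models, for instance, do not extend `PredictionModel`), so a `false` result is a hint rather than a verdict.

```scala
import org.apache.spark.ml.{PipelineModel, PredictionModel, Transformer}

// Hypothetical helper, not part of Spark ML or JPMML-SparkML.
object ModelStageCheck {

  // Returns true if any stage (including stages of nested PipelineModels)
  // is a fitted prediction model.
  def hasPredictionModel(stages: Array[Transformer]): Boolean =
    stages.exists {
      case nested: PipelineModel    => hasPredictionModel(nested.stages)
      case _: PredictionModel[_, _] => true
      case _                        => false
    }
}
```

For the pipeline in this issue, `ModelStageCheck.hasPredictionModel(oneHotModel.stages)` would be expected to return `false`, which matches the "zero models" message.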

@vruusmann vruusmann changed the title from "Expected a pipeline with one or more models, got a pipeline with zero models" to "Add "dummy" estimator classes" on Jan 3, 2018
@zhangjunqiang
Author

Thank you for your reply. Is there any dummy model in Spark MLlib? I use Spark ML for training.

@vruusmann
Member

> Is there any dummy model in Spark MLlib?

Depending on your Apache Spark ML version, there may or may not be appropriate technical workarounds available.

For example, a potential solution:

  1. Create a model-less Pipeline and fit it.
  2. Take the fitted PipelineModel and "manually" append an appropriate org.apache.spark.ml.PredictionModel object instance to it. Please note that you would be dealing with a PredictionModel subclass here (representing a model that has been fitted), not with a Predictor subclass (representing a model that is yet to be fitted).
  3. Design the "schema" of the above model to match the inputs and outputs of your feature transformation workflow.
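
The three steps above could be sketched roughly as follows. This is one possible approach under stated assumptions, not the maintainer's implementation: instead of constructing a model object by hand, it fits a `LinearRegression` on a constant label (so the model predicts a constant) and then rebuilds a `PipelineModel` from the already fitted stages, relying on `Pipeline.fit` passing transformers through unchanged. The column names, the toy data, and the `DummyModelSketch` object are hypothetical, and whether the converter accepts such a model is not confirmed in this thread.

```scala
import org.apache.spark.ml.{Pipeline, PipelineModel, PipelineStage}
import org.apache.spark.ml.feature.{StringIndexer, VectorAssembler}
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.lit

// Hypothetical sketch; names and data are made up for illustration.
object DummyModelSketch {

  // Appends a constant-label regression model to an already fitted
  // feature pipeline. `featuresCol` must be a vector column produced
  // by `featureModel`.
  def appendDummyModel(featureModel: PipelineModel,
                       df: DataFrame,
                       featuresCol: String): PipelineModel = {
    // Step 2: fit a trivial model on the transformed data with a constant label.
    val withLabel = featureModel.transform(df).withColumn("dummyLabel", lit(0.0))
    val dummyModel = new LinearRegression()
      .setFeaturesCol(featuresCol)   // Step 3: model input = workflow output
      .setLabelCol("dummyLabel")
      .setPredictionCol("dummyPrediction")
      .fit(withLabel)

    // All stages are fitted transformers now, so Pipeline.fit only
    // validates the schema and wraps them into a new PipelineModel.
    val stages: Array[PipelineStage] =
      featureModel.stages.map(s => s: PipelineStage) :+ dummyModel
    new Pipeline().setStages(stages).fit(df)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("dummy-model-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val df = Seq("red", "green", "blue", "green").toDF("color")

    // Step 1: create a model-less Pipeline and fit it.
    val indexer = new StringIndexer().setInputCol("color").setOutputCol("colorIndex")
    val assembler = new VectorAssembler()
      .setInputCols(Array("colorIndex"))
      .setOutputCol("features")
    val featureModel = new Pipeline().setStages(Array(indexer, assembler)).fit(df)

    val combined = appendDummyModel(featureModel, df, "features")
    println(s"Stages in combined model: ${combined.stages.length}")

    spark.stop()
  }
}
```

The resulting `combined` model can then be passed to the converter in the same way as in the original report, e.g. `ConverterUtil.toPMML(df.schema, combined)`.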

@vruusmann
Member

Someone could search the Apache Spark JIRA to see whether a feature request for dummy estimator classes already exists.

I wouldn't want to create and maintain these classes myself. But if absolutely necessary, I will do it.

@vruusmann
Member

Reopening, because I might want to provide some sort of easier workaround in the JPMML-SparkML library.

@vruusmann vruusmann reopened this on Jan 3, 2018
@zhangjunqiang
Author

@vruusmann That is very kind of you!
