I get this error during clustering, which I think was introduced by #987 in 2.18.0-SNAPSHOT+0~20231207231802.1143~1.gbp8802a4:
Exception in thread "main" org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.IllegalStateException: Error encoding value: ValueInGlobalWindow{value=KV{5051752|38120|-710|1996|7|28, au.org.ala.clustering.HashKeyOccurrence@318a655e}, pane=PaneInfo.NO_FIRING}
at org.apache.beam.runners.spark.SparkPipelineResult.beamExceptionFrom(SparkPipelineResult.java:73)
at org.apache.beam.runners.spark.SparkPipelineResult.waitUntilFinish(SparkPipelineResult.java:104)
at org.apache.beam.runners.spark.SparkPipelineResult.waitUntilFinish(SparkPipelineResult.java:92)
at au.org.ala.pipelines.beam.ClusteringPipeline.run(ClusteringPipeline.java:373)
at au.org.ala.pipelines.beam.ClusteringPipeline.main(ClusteringPipeline.java:81)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.IllegalStateException: Error encoding value: ValueInGlobalWindow{value=KV{5051752|38120|-710|1996|7|28, au.org.ala.clustering.HashKeyOccurrence@318a655e}, pane=PaneInfo.NO_FIRING}
at org.apache.beam.runners.spark.coders.CoderHelpers.toByteArray(CoderHelpers.java:60)
at org.apache.beam.runners.spark.translation.GroupNonMergingWindowsFunctions.lambda$groupByKeyAndWindow$c9b6f5c4$1(GroupNonMergingWindowsFunctions.java:87)
at org.apache.beam.runners.spark.translation.GroupNonMergingWindowsFunctions.lambda$bringWindowToKey$0(GroupNonMergingWindowsFunctions.java:130)
at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterators$6.transform(Iterators.java:785)
at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.TransformedIterator.next(TransformedIterator.java:47)
at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:43)
at scala.collection.Iterator$$anon$12.next(Iterator.scala:445)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:201)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.beam.sdk.coders.CoderException: cannot encode a null String
at org.apache.beam.sdk.coders.StringUtf8Coder.encode(StringUtf8Coder.java:74)
at org.apache.beam.sdk.coders.StringUtf8Coder.encode(StringUtf8Coder.java:68)
at org.apache.beam.sdk.coders.StringUtf8Coder.encode(StringUtf8Coder.java:37)
at org.apache.beam.sdk.coders.IterableLikeCoder.encode(IterableLikeCoder.java:114)
at org.apache.beam.sdk.coders.IterableLikeCoder.encode(IterableLikeCoder.java:60)
at org.apache.beam.sdk.coders.RowCoderGenerator$EncodeInstruction.encodeDelegate(RowCoderGenerator.java:337)
at org.apache.beam.sdk.coders.Coder$ByteBuddy$5tgsQW7H.encode(Unknown Source)
at org.apache.beam.sdk.coders.Coder$ByteBuddy$5tgsQW7H.encode(Unknown Source)
at org.apache.beam.sdk.schemas.SchemaCoder.encode(SchemaCoder.java:124)
at org.apache.beam.sdk.coders.Coder.encode(Coder.java:136)
at org.apache.beam.sdk.coders.KvCoder.encode(KvCoder.java:73)
at org.apache.beam.sdk.coders.KvCoder.encode(KvCoder.java:37)
at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.encode(WindowedValue.java:591)
at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.encode(WindowedValue.java:582)
at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.encode(WindowedValue.java:542)
at org.apache.beam.runners.spark.coders.CoderHelpers.toByteArray(CoderHelpers.java:58)
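The innermost cause looks like the real problem: SchemaCoder/RowCoderGenerator hands a null String element (via IterableLikeCoder, so presumably inside a collection field of HashKeyOccurrence) to StringUtf8Coder, which rejects nulls. A minimal standalone sketch of that coder behaviour (the class name is mine, not from the pipeline):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.beam.sdk.coders.NullableCoder;
import org.apache.beam.sdk.coders.StringUtf8Coder;

public class NullStringCoderRepro {
  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();

    // StringUtf8Coder rejects nulls outright; this reproduces the innermost
    // "cannot encode a null String" CoderException from the stack trace.
    try {
      StringUtf8Coder.of().encode(null, out);
    } catch (IOException e) {
      System.out.println("Same failure as the pipeline: " + e.getMessage());
    }

    // Wrapping the coder in NullableCoder makes the null encodable.
    NullableCoder.of(StringUtf8Coder.of()).encode(null, out);
    System.out.println("NullableCoder handled the null value");
  }
}
```

So either #987 started producing null values in one of the String collections that end up in the HashKeyOccurrence schema, or the coder for that field needs to tolerate nulls.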
Command:
sudo -u spark la-pipelines clustering all --cluster
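If the fix is on the schema/coder side rather than filtering out the null values, I'd guess the element type of the offending collection field needs to be declared nullable so that RowCoder wraps it in a NullableCoder. A sketch with invented field names (nothing below is taken from HashKeyOccurrence):

```java
import org.apache.beam.sdk.schemas.Schema;
import org.apache.beam.sdk.schemas.Schema.FieldType;

public class NullableElementSchemaSketch {
  public static void main(String[] args) {
    // The IterableLikeCoder frame in the trace suggests the null sits inside a
    // collection-typed field, so it is the element type that has to be nullable.
    Schema strict = Schema.builder()
        // non-nullable elements: a null element fails in StringUtf8Coder
        .addArrayField("names", FieldType.STRING)
        .build();

    Schema tolerant = Schema.builder()
        // nullable elements are encoded via NullableCoder and survive
        .addArrayField("names", FieldType.STRING.withNullable(true))
        .build();

    System.out.println(strict.getField("names").getType());
    System.out.println(tolerant.getField("names").getType());
  }
}
```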
cc @adam-collins.