Hello,
We recently tried using almond as a Scala kernel within our JupyterLab environment, but we are encountering errors when trying to use recent versions of Spark.
Spark version tested: 3.5.0
Scala version: 2.13.8 (the Scala version used in Spark 3.5.0)
Java version: 17.0.2
Almond versions tried: 0.13.14 and 0.14.0-RC13
The errors arise when code has to be shipped to the executors, which triggers serialization and deserialization of closures. Other operations (count, show, etc.) work fine.
Here's a minimal example that causes the error:
import org.apache.spark.rdd.RDD
val rdd: RDD[Int] = spark.sparkContext.parallelize(List(1, 2, 3, 4, 5))
val multipliedRDD = rdd.map(_ * 2)
println(multipliedRDD.collect().mkString(", "))
and the error is:
java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance of org.apache.spark.rdd.MapPartitionsRDD
Note that running the exact same code in a spark-shell on the same JupyterLab instance works fine, so the problem seems to come from almond.
Our best guess is that the classpath used by almond pulls in mismatched versions of some libraries, but we have no proof that this is actually the issue (a small diagnostic sketch for checking it is included below).
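For anyone trying to reproduce or diagnose this, here is a rough diagnostic sketch, using only standard Spark/JVM calls (nothing almond-specific), that can be run from a notebook cell to compare what the kernel JVM has on its classpath with the jars Spark registers for shipping to the executors:
// Classpath entries of the kernel JVM (which acts as the Spark driver here).
System.getProperty("java.class.path")
  .split(java.io.File.pathSeparator)
  .foreach(println)
// Jars that Spark has registered to ship to the executors.
spark.sparkContext.listJars().foreach(println)
// Spark and Scala versions the session actually runs with.
println(spark.version)
println(scala.util.Properties.versionString)
If the Scala minor version or any Spark artifact differs between these lists, that would support the mismatched-classpath theory.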
Second note: we tried both using our own Spark installation baked into the JupyterLab image and installing Spark directly with Ivy from a Scala notebook (a rough sketch of that setup follows); both produce the same error.
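For reference, the Ivy-based variant follows what we understand to be the usual almond-spark recipe; the coordinates and versions below are the ones we tried and may need adjusting to match the kernel version:
// Pull Spark and the almond-spark bridge from Ivy inside the notebook.
import $ivy.`org.apache.spark::spark-sql:3.5.0`
import $ivy.`sh.almond::almond-spark:0.13.14`
import org.apache.spark.sql._
// NotebookSparkSession (from almond-spark) is supposed to register the notebook's
// class server with Spark so that classes compiled in cells reach the executors.
val spark = NotebookSparkSession.builder()
  .master("local[*]") // or whatever master the environment uses
  .getOrCreate()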
Does anyone have any idea what could be causing this issue?