Transient fields of a case class become null in a Spark RDD

Question

I have a case class that wraps an instance of the Java class LinkedSparseMatrix (package no.uib.cipr.matrix.sparse):

    case class A(mat: LinkedSparseMatrix)

When I try to convert the List[LinkedSparseMatrix] to a Spark RDD, it throws TaskNotSerializableException, so I declared the field as transient. But then all the mat fields become null, which I think is because null is the default value a transient object field gets after deserialization.

So I tried defining the field as a lazy val instead, changing my class to:

    class A(m: LinkedSparseMatrix) extends Serializable {
        @transient lazy val mat = m
        // some other code
    }

But I still get java.io.NotSerializableException: no.uib.cipr.matrix.sparse.LinkedSparseMatrix, and I don't understand why.

Is there a solution for this? Thanks in advance.
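
The last exception can probably be reproduced without Spark. The sketch below is an illustration only: `Matrix` is a hypothetical stand-in for LinkedSparseMatrix (any class that does not implement java.io.Serializable), and `TransientLazyDemo` and `serializes` are names made up for this example. The likely cause is that the constructor parameter `m` is referenced from the lazy val's initializer, so the compiler has to keep it as a hidden field of `A`. That field is not marked transient, so Java serialization still tries to write the matrix as long as the lazy val has not been evaluated.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for LinkedSparseMatrix: a class that is NOT Serializable.
class Matrix(val size: Int)

// Same shape as the class in the question.
class A(m: Matrix) extends Serializable {
  @transient lazy val mat: Matrix = m
}

object TransientLazyDemo {
  // Returns true if plain Java serialization of `obj` succeeds.
  def serializes(obj: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try { out.writeObject(obj); true }
    catch { case _: NotSerializableException => false }
    finally out.close()
  }

  def main(args: Array[String]): Unit = {
    val a = new A(new Matrix(3))
    // List the JVM fields of A; the constructor parameter is expected to
    // show up here as a field alongside the lazy val's own storage.
    println(classOf[A].getDeclaredFields.map(_.getName).mkString(", "))
    // Reports whether `a` can be written with Java serialization.
    println(serializes(a))
  }
}
```

If this matches what happens in the real class, marking only `mat` as transient cannot help, because the reference that blocks serialization lives in the captured parameter rather than in `mat`.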

Answer 1

Score: 1

You can enable Kryo serialization instead of the default Java serialization. Kryo can serialize objects without implementing java.io.Serializable.
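
A minimal sketch of what enabling Kryo might look like, assuming Spark and the MTJ library that provides LinkedSparseMatrix are on the classpath. The app name, the `local[*]` master, the object name `KryoSketch`, and the matrix sizes are placeholders chosen for illustration. Whether Kryo's default field-based serializer copes with the internals of LinkedSparseMatrix has not been verified here, so treat this as a starting point rather than a confirmed fix.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import no.uib.cipr.matrix.sparse.LinkedSparseMatrix

// Case class from the question.
case class A(mat: LinkedSparseMatrix)

object KryoSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("kryo-sketch")   // placeholder app name
      .setMaster("local[*]")       // assumption: local run for illustration
      // Switch data serialization from Java serialization to Kryo.
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // Optional: registering classes lets Kryo avoid writing full class names.
      .registerKryoClasses(Array(classOf[A], classOf[LinkedSparseMatrix]))

    val sc = new SparkContext(conf)
    try {
      val data = List(A(new LinkedSparseMatrix(2, 2)), A(new LinkedSparseMatrix(3, 3)))
      // The elements of the RDD are serialized with the configured serializer.
      val rdd = sc.parallelize(data)
      println(rdd.map(_.mat.numRows()).collect().mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

One caveat: Spark serializes task closures with Java serialization regardless of this setting, so a matrix referenced from inside a closure (rather than carried as RDD data) can still trigger a NotSerializableException.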

Posted by huangapple on 2020-04-08 20:23:07
Source: https://go.coder-hub.com/61100582.html