Transient fields of a case class become null in a Spark RDD
Question
I have a case class that accepts an instance of the Java class LinkedSparseMatrix (package no.uib.cipr.matrix.sparse), as follows:
case class A(mat: LinkedSparseMatrix)
When I try to convert the List[LinkedSparseMatrix] to a Spark RDD, it throws TaskNotSerializableException. So I declared the field as transient. But then all the mat fields become null, which I think is because transient object fields are restored to their default value (null) on deserialization.
So I tried defining the variable as lazy, changing my class to:
class A (m: LinkedSparseMatrix) extends Serializable {
@transient lazy val mat = m
// some other code
}
But now I'm still getting java.io.NotSerializableException: no.uib.cipr.matrix.sparse.LinkedSparseMatrix, and I don't understand why.
Is there any solution for this? Thanks in advance.
Answer 1
Score: 1
You can enable Kryo serialization instead of the default Java serialization. Kryo can serialize objects without requiring them to implement java.io.Serializable.
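As a rough illustration of the suggestion above, enabling Kryo might look like the sketch below. The app name, the local master, and the object name KryoSetupSketch are placeholders, not taken from the question, and this has not been tested against LinkedSparseMatrix specifically, so treat it as a starting point. Note also that Kryo applies to the data in the RDD; closures passed to transformations are still serialized with Java serialization, so they should not capture a LinkedSparseMatrix directly.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import no.uib.cipr.matrix.sparse.LinkedSparseMatrix

// Plain case class again: with Kryo there is no need for @transient.
case class A(mat: LinkedSparseMatrix)

object KryoSetupSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("kryo-sketch")   // placeholder name
      .setMaster("local[*]")       // placeholder master for a local run
      // Switch the data serializer from Java serialization to Kryo.
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // Registration is optional but avoids writing full class names per record.
      .registerKryoClasses(Array(classOf[A], classOf[LinkedSparseMatrix]))

    val sc = new SparkContext(conf)
    try {
      // Assumption: in the real job this list comes from existing code.
      val items: List[A] = List.empty[A]
      val rdd = sc.parallelize(items)
      println(rdd.count())
    } finally {
      sc.stop()
    }
  }
}
```

If the same settings are preferred outside the code, spark.serializer can also be passed with --conf on spark-submit.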