Upgrade to Spark 3.4.0 from 3.3.2 gives Exception in thread "main" java.nio.file.NoSuchFileException, although the jar is present at the location

Question
I have a Spark job, on version 3.3.2, that is deployed using Kubernetes.
Recently some vulnerabilities were reported in Spark 3.3.2, so I changed my Dockerfile to download 3.4.0 instead of 3.3.2, and my application jar is now built against Spark 3.4.0.
However, while deploying I get this error:

Exception in thread "main" java.nio.file.NoSuchFileException: <path>/spark-assembly-1.0.jar

where spark-assembly-1.0.jar is the jar that contains my Spark job. I have this in the app's deployment.yaml:

mainApplicationFile: "local:///<path>/spark-assembly-1.0.jar"

and I have not changed anything related to it. I do see that some code regarding jar locations has changed in Spark 3.4.0's core source.
Has the behaviour really changed? Is anyone else facing the same issue? Should the path be specified differently?
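For reference, mainApplicationFile is a field of the Kubernetes Spark operator's SparkApplication manifest; a minimal sketch follows, in which everything other than the mainApplicationFile line is an illustrative placeholder:

apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: spark-assembly              # illustrative name
spec:
  type: Scala
  mode: cluster
  image: <my-spark-image>           # placeholder for the image built by the Dockerfile
  mainClass: <main-class>           # placeholder
  # local:// tells Spark the jar is already inside the image at this path
  mainApplicationFile: "local:///<path>/spark-assembly-1.0.jar"
  sparkVersion: "3.4.0"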
Answer 1
Score: 3
I hit this same issue. I believe the behaviour change was introduced in:

> SPARK-43540 - Add working directory into classpath on the driver in K8S cluster mode

Our Dockerfile was overriding the working directory of the base Spark image:

FROM apache/spark:3.4.1@sha256:a1dd2487a97fb5e35c5a5b409e830b501a92919029c62f9a559b13c4f5c50f63 as image
# overriding the base image's WORKDIR is what trips up Spark 3.4+
WORKDIR /spark-jars
COPY --from=build /...../target/scala-2.12/my-spark.jar /spark-jars/
Changing it to the following, which creates the jar directory without touching the base image's working directory, solved the problem:

FROM apache/spark:3.4.1@sha256:a1dd2487a97fb5e35c5a5b409e830b501a92919029c62f9a559b13c4f5c50f63 as image
# create the jar directory as root, then switch back to the spark user;
# the base image's WORKDIR is left untouched
USER root
RUN mkdir /spark-jars
USER spark
COPY --from=build /...../target/scala-2.12/my-spark.jar /spark-jars/
Hope this helps!