Adjust classpath / change Spring version in Azure Databricks

Question
I'm trying to use the Apache Spark/Ignite integration in Azure Databricks. I installed the org.apache.ignite:ignite-spark-2.4:2.9.0 Maven library using the Databricks UI, and I get an error while accessing my Ignite caches:
: java.lang.NoSuchMethodError: org.springframework.util.ReflectionUtils.clearCache()V
at org.springframework.context.support.AbstractApplicationContext.resetCommonCaches(AbstractApplicationContext.java:907)
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:567)
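For context, the failing access is the usual ignite-spark read path, roughly like this (a sketch; the config path and table name below are placeholders):
# Sketch: reading an Ignite cache as a Spark DataFrame through the
# ignite-spark data source; `spark` is the notebook's SparkSession.
df = (spark.read
      .format("ignite")
      .option("config", "/dbfs/FileStore/ignite-config.xml")  # placeholder config path
      .option("table", "person")                              # placeholder table name
      .load())
df.show()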
Here AbstractApplicationContext was compiled against a ReflectionUtils from a different Spring version: the 4.3.x context calls ReflectionUtils.clearCache(), a method the older spring-core on the classpath does not have.
I can see that spring-core-4.3.26.RELEASE.jar was installed under /dbfs/FileStore/jars/maven/org/springframework during the org.apache.ignite:ignite-spark-2.4:2.9.0 installation, and there are no jars of any other Spring version under /dbfs/FileStore/jars.
But it seems Databricks internally uses spring-core__4.1.4:
%sh
ls /databricks/jars | grep spring
prints:
spark--maven-trees--spark_2.4--com.clearspring.analytics--stream--com.clearspring.analytics__stream__2.7.0.jar
spark--maven-trees--spark_2.4--org.springframework--spring-core--org.springframework__spring-core__4.1.4.RELEASE.jar
spark--maven-trees--spark_2.4--org.springframework--spring-test--org.springframework__spring-test__4.1.4.RELEASE.jar
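To double-check which spring-core the driver actually loads, one option (a sketch, using py4j through the notebook's spark session) is to ask the JVM where ReflectionUtils comes from:
# Sketch: print the jar that the driver JVM loaded ReflectionUtils from.
cls = spark._jvm.java.lang.Class.forName("org.springframework.util.ReflectionUtils")
print(cls.getProtectionDomain().getCodeSource().getLocation().toString())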
I'm not a Java programmer, so I have no experience resolving this kind of conflict.
Is it possible to adjust the Databricks classpath somehow, or to solve this problem some other way?
Adjusting the classpath may well be easy, but I don't know how. The Databricks documentation only mentions that the classpath can be changed in an init script. I can create an init script (I have done that before), but what exactly should I put in it?
I've tried different Databricks runtime versions and am using 6.6 at the moment. I don't think Apache Ignite has an integration with Spark 3.
Answer 1
Score: 2
Following https://kb.databricks.com/libraries/replace-default-jar-new-jar.html, I created an init script like this:
dbutils.fs.mkdirs("dbfs:/databricks/scripts/")
dbutils.fs.put("dbfs:/databricks/scripts/install_spring.sh",
"""
# Remove the outdated jars bundled with the Databricks runtime.
rm -rf /databricks/jars/spark--maven-trees--spark_2.4--com.h2database--h2--com.h2database__h2__1.3.174.jar
rm -rf /databricks/jars/spark--maven-trees--spark_2.4--org.springframework--spring-core--org.springframework__spring-core__4.1.4.RELEASE.jar
rm -rf /databricks/jars/spark--maven-trees--spark_2.4--org.springframework--spring-test--org.springframework__spring-test__4.1.4.RELEASE.jar
# Copy the newer versions (installed to DBFS by the Ignite Maven library) in their place.
cp /dbfs/FileStore/jars/maven/com/h2database/h2-1.4.197.jar /databricks/jars/
cp /dbfs/FileStore/jars/maven/org/springframework/spring-core-4.3.26.RELEASE.jar /databricks/jars/
cp /dbfs/FileStore/jars/maven/org/springframework/spring-test-4.3.26.RELEASE.jar /databricks/jars/
""", True)
After that I registered this init script on the cluster, and the Ignite integration worked for me (org.apache.ignite:ignite-spark-2.4:2.9.0, Ignite 2.9.0, Azure Databricks 6.6).
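For reference, the script can be registered in the cluster UI (Advanced Options > Init Scripts) or through the Clusters API. A minimal sketch of the API route, assuming a personal access token and placeholder workspace/cluster values:
import requests

# Sketch: attach the init script via the Clusters API 2.0. All angle-bracket
# values are placeholders. Note that clusters/edit replaces the whole cluster
# spec, so the request must repeat your existing settings as well.
HOST = "https://<workspace>.azuredatabricks.net"
TOKEN = "<personal-access-token>"

resp = requests.post(
    HOST + "/api/2.0/clusters/edit",
    headers={"Authorization": "Bearer " + TOKEN},
    json={
        "cluster_id": "<cluster-id>",
        "spark_version": "6.6.x-scala2.11",  # Databricks Runtime 6.6
        "node_type_id": "<node-type>",
        "num_workers": 2,
        "init_scripts": [
            {"dbfs": {"destination": "dbfs:/databricks/scripts/install_spring.sh"}}
        ],
    },
)
resp.raise_for_status()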
There are about 500 jar files preinstalled under /databricks/jars, so it's possible I've broken some dependencies, but I have not noticed any side effects for my task.
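After restarting the cluster, a quick sanity check (run from a Python cell on the driver) is to list the Spring jars now present under /databricks/jars:
import os

# The 4.1.4 spring-core/spring-test jars should be gone and the copied
# 4.3.26 jars should be present (the clearspring stream jar also matches).
print(sorted(f for f in os.listdir("/databricks/jars") if "spring" in f))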