Which JDK to use with Spark?
Question
I am new to Spark, and I keep running into various "module java.base does not export XXX" errors. I keep adding more --add-opens options to the JVM. There are a lot of SO posts about these issues.
This post has a pretty long list.
I am now at these options:
--add-opens=java.base/java.nio=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang.invoke=ALL-UNNAMED
--add-opens=java.base/java.lang.reflect=ALL-UNNAMED
--add-opens=java.base/java.io=ALL-UNNAMED
--add-opens=java.base/java.net=ALL-UNNAMED
--add-opens=java.base/java.util=ALL-UNNAMED
--add-opens=java.base/java.util.concurrent=ALL-UNNAMED
--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED
--add-opens=java.base/sun.nio.ch=ALL-UNNAMED
--add-opens=java.base/sun.nio.cs=ALL-UNNAMED
--add-opens=java.base/sun.security.action=ALL-UNNAMED
--add-opens=java.base/sun.util.calendar=ALL-UNNAMED
--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED
With all of these in place, I don't seem to have any issues. But this is unsettling: as far as I know, these options are not documented. You just have to keep adding options until you stop getting errors.
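To illustrate what these flags actually unlock, here is a minimal, hypothetical sketch (not Spark's own code) of the kind of reflective access to JDK internals that Spark and its dependencies perform, and that fails on JDK 17 unless the package is opened:

import java.lang.reflect.Field;
import java.nio.Buffer;
import java.nio.ByteBuffer;

public class AddOpensDemo {
    public static void main(String[] args) throws Exception {
        ByteBuffer buf = ByteBuffer.allocateDirect(16);
        // Reflectively read a non-public JDK field, much as Spark's
        // off-heap memory management does through its dependencies.
        Field address = Buffer.class.getDeclaredField("address");
        // Throws InaccessibleObjectException on JDK 17 unless the JVM was
        // started with --add-opens=java.base/java.nio=ALL-UNNAMED.
        address.setAccessible(true);
        System.out.println("direct buffer address = " + address.getLong(buf));
    }
}

Running it once with and once without the flag reproduces the class of error in question.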
So my question is: which JDK is recommended for Spark? The release notes for 3.4.0 rather sound like Java 8 is on its way to being deprecated. I would like to use Java 17 because of its new language features, and because I expect that some day my project dependencies will no longer be available for Java 8.
Perhaps a better way to think of this: where on Spark's roadmap, if anywhere, will there no longer be a requirement to add all of these undocumented options? Is there a timeline for the end of JDK 8 support?
PS: This is a real pain in the IntelliJ IDEA IDE, because these options have to be pasted into every Run Configuration. Kind of a side question, but can these options be put in a global place in IDEA so that all Run Configurations pick them up?
PPS: I am not using Hadoop; does excluding that support from Spark somehow make a simpler set of options available?
UPDATE: A colleague told me to put these options in a file and reference it with @filepath in the JVM options, which makes things somewhat easier.
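For illustration, a minimal sketch of that argfile approach (the file name, classpath placeholder, and main class below are made up, not from the update). The java launcher has expanded @-files since JDK 9, so the same @-reference should work anywhere VM options are handed to the launcher, including an IDEA Run Configuration's VM options field:

Put the --add-opens lines above into spark-opens.txt, one per line, then launch with:

java @spark-opens.txt -cp <classpath> com.example.MainApp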
Answer 1

Score: 1
Spark 3.4.0 runs on Java 8/11/17, Scala 2.12/2.13, Python 3.7+, and R 3.5+.
Support for Java 8 versions prior to 8u362 is deprecated as of Spark 3.4.0.
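If the flags do still need to be passed explicitly on Java 17, e.g. from a test runner or an IDE rather than Spark's own scripts, Spark's documented spark.driver.extraJavaOptions and spark.executor.extraJavaOptions settings can carry them. A sketch, where the application jar, main class, and the single representative flag are placeholders:

spark-submit \
  --conf "spark.driver.extraJavaOptions=--add-opens=java.base/sun.nio.ch=ALL-UNNAMED" \
  --conf "spark.executor.extraJavaOptions=--add-opens=java.base/sun.nio.ch=ALL-UNNAMED" \
  --class com.example.MainApp app.jar

As I understand it, recent 3.x releases launched through spark-submit or spark-shell inject their required module options themselves, which is why these errors tend to surface mainly in IDE or bare-JVM runs.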