Can I use Spark prebuilt without hadoop on Windows?

Question

I'm using Spark 3.1.3 prebuilt without Hadoop on a production Unix-based server. Spark is running in standalone mode. I'm using the local filesystem rather than a distributed filesystem such as Hadoop's HDFS.

I'd ideally like to replicate my production environment locally, but unfortunately I'm restricted to using Windows.

Typically, I am able to run Spark on Windows by using Spark 3.1.3 prebuilt for Hadoop Y and using the winutils tool provided here: https://github.com/steveloughran/winutils

It's my understanding that winutils simulates Hadoop rather than a Unix filesystem.

Can I use the exact same Spark binaries in production and on my Windows development machine? Or am I restricted to using Spark prebuilt for Hadoop locally?

Can you explain why either solution works?

I tried running my Spark scripts locally using the version prebuilt without Hadoop, but I'm unable to start them. (I will provide some logs and edit this when I'm back on my Windows machine.)

Answer 1

Score: 1

"Without" refers only to the scripts and libraries in the downloaded tarball. A more accurate term would be "bring your own Hadoop". You still need HADOOP_CONF_DIR and HADOOP_HOME set, as well as the HDFS client JAR libraries, even to use a local filesystem.

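As a rough illustration of the point above, here is a minimal pre-flight check a Python launcher script could run before starting Spark. The function name `missing_hadoop_settings` is hypothetical. `SPARK_DIST_CLASSPATH` is not mentioned in the answer; it is included on the assumption that you follow Spark's documented approach for the "Hadoop free" build, where that variable points Spark at the Hadoop JARs (commonly filled from the output of `hadoop classpath`).

```python
import os

# Variables the answer says a "without Hadoop" Spark build still needs.
# SPARK_DIST_CLASSPATH is an addition (assumption, not from the answer):
# Spark's "Hadoop free" build uses it to find the Hadoop client JARs.
REQUIRED_VARS = ("HADOOP_HOME", "HADOOP_CONF_DIR", "SPARK_DIST_CLASSPATH")


def missing_hadoop_settings(env=None):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_hadoop_settings()
    if missing:
        print("Not set:", ", ".join(missing))
    else:
        print("All Hadoop-related variables are set.")
```

Running this on both the production server and the Windows machine is a quick way to see which of the settings differ between the two environments.
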
Yes, you can use Spark on Windows, provided you include the winutils version that matches your Hadoop version. Alternatively, you can use WSL2 and install Spark inside a full Unix environment.
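
For the winutils route, a small sketch of how one might verify the binary is where Hadoop expects it. The answer does not state the location; the `%HADOOP_HOME%\bin\winutils.exe` layout used here is the usual convention and should be treated as an assumption, and the helper names `winutils_path` and `has_winutils` are hypothetical.

```python
import ntpath
import os


def winutils_path(hadoop_home):
    """Build the conventional Windows path <HADOOP_HOME>\\bin\\winutils.exe."""
    # ntpath applies Windows path rules regardless of the host OS.
    return ntpath.join(hadoop_home, "bin", "winutils.exe")


def has_winutils(env=None):
    """Return True if HADOOP_HOME is set and winutils.exe exists under it."""
    env = os.environ if env is None else env
    hadoop_home = env.get("HADOOP_HOME")
    if not hadoop_home:
        return False
    return os.path.isfile(winutils_path(hadoop_home))


if __name__ == "__main__":
    print("winutils found:", has_winutils())
```

Under WSL2 this check is unnecessary, since Spark then runs against a Linux filesystem and does not need winutils.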

huangapple
  • Posted on 2023-02-10 06:26:36
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75405097.html