How is the executor memory determined while running PySpark in local mode?
Question
If I submit the Spark program as:
-> spark-submit --driver-memory 500M --executor-memory 300M spark_dataframe_example.py
then the executor memory allocated is "110 MiB" even though I have set it to "300M". See image below:
And if I submit the Spark program as:
-> spark-submit --driver-memory 1G --executor-memory 200M --num-executors 2 spark_dataframe_example.py
then the executor memory allocated is "413.9 MiB" even though I have set it to "200M". See image below:
So could someone confirm how this executor memory is allocated?
Answer 1
Score: 2
As was said in the comments, the --executor-memory flag is ignored in local mode. To confirm this, try running your spark-submit command with a ridiculously high --executor-memory flag (bigger than your machine's memory): it won't complain, because the flag is ignored.
Running Spark in local mode is a bit of an exception, since your driver and executor run inside a single JVM. So the value that counts here is your --driver-memory flag.
Now, where do those 110MB and 413.9MB come from? In version 3.3.2 - the most recent version at the time of this post - they are the result of this calculation:
val systemMemory = conf.get(TEST_MEMORY)
val reservedMemory = conf.getLong(TEST_RESERVED_MEMORY.key,
if (conf.contains(IS_TESTING)) 0 else RESERVED_SYSTEM_MEMORY_BYTES)
// skipping some irrelevant lines
...
val usableMemory = systemMemory - reservedMemory
val memoryFraction = conf.get(config.MEMORY_FRACTION)
(usableMemory * memoryFraction).toLong
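The Scala snippet above boils down to simple arithmetic, so it can be sketched in plain Python. The constant names below mirror the Scala code; the 300MB reserved value and the 0.6 fraction are Spark's documented defaults, and `system_memory` stands in for the JVM's Xmx value:

```python
# Plain-Python sketch of the calculation in the Scala snippet above.
# RESERVED_SYSTEM_MEMORY_BYTES (300MB) and spark.memory.fraction (0.6)
# are Spark's defaults; system_memory stands in for the JVM's Xmx.

RESERVED_SYSTEM_MEMORY_BYTES = 300 * 1024 * 1024  # 314572800
DEFAULT_MEMORY_FRACTION = 0.6                     # spark.memory.fraction

def max_unified_memory(system_memory: int,
                       reserved: int = RESERVED_SYSTEM_MEMORY_BYTES,
                       fraction: float = DEFAULT_MEMORY_FRACTION) -> int:
    """The value the UI reports as executor memory for a local-mode driver."""
    usable_memory = system_memory - reserved
    return int(usable_memory * fraction)

# A driver JVM with Xmx of exactly 500MiB:
print(max_unified_memory(500 * 1024 * 1024))  # 125829120 bytes = 120MiB
```

Since the real Xmx is slightly below the requested --driver-memory, the UI figure lands slightly below what this function returns for the requested size.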
There are 3 values whose origins we need to trace:
- systemMemory comes from TEST_MEMORY, which ultimately is the Xmx value of your running JVM process. This will always be a bit smaller than your --driver-memory, but close to it. You don't set this yourself; Spark does it for you based on your memory requirements.
- reservedMemory will be defined by RESERVED_SYSTEM_MEMORY_BYTES (since we're not in a testing scenario), and this value is 314572800 (300MB).
- memoryFraction is 0.6 in the default scenario, which seems to be the case for you.
The final calculation is thus: (systemMemory - reservedMemory) * memoryFraction
Now we can do our calculations!
Your first case
--driver-memory was 500M, so let's calculate:
- (524288000 - 314572800) * 0.6 = 125829120 = 120MB
- since we know the Xmx value of your JVM process is close to, but smaller than, --driver-memory, this is very close to 110MB!
Your second case
--driver-memory was 1G, so let's calculate:
- (1048576000 - 314572800) * 0.6 = 440401920 = 420MB
- since we know the Xmx value of your JVM process is close to, but smaller than, --driver-memory, this is very close to 413.9MB!
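As a quick sanity check, both worked cases can be reproduced in plain Python. The constants are Spark's defaults; the 1000MiB figure for the 1G case follows the answer's own working, and the final lines invert the formula to estimate the actual Xmx behind the observed 413.9 MiB:

```python
# Verify the arithmetic of both cases from the answer.
RESERVED = 314572800   # RESERVED_SYSTEM_MEMORY_BYTES, 300MB
FRACTION = 0.6         # default spark.memory.fraction
MIB = 1024 * 1024

# Case 1: --driver-memory 500M, Xmx taken as 524288000 bytes (500 MiB)
case1 = int((500 * MIB - RESERVED) * FRACTION)
print(case1, case1 / MIB)   # 125829120 bytes = 120.0 MiB (UI showed 110 MiB)

# Case 2: --driver-memory 1G; the answer works with 1048576000 bytes (1000 MiB)
case2 = int((1000 * MIB - RESERVED) * FRACTION)
print(case2, case2 / MIB)   # 440401920 bytes = 420.0 MiB (UI showed 413.9 MiB)

# Working backwards: the observed 413.9 MiB implies an actual Xmx of roughly
implied_xmx = 413.9 * MIB / FRACTION + RESERVED
print(implied_xmx / MIB)    # ~989.8 MiB, a bit below the requested 1G
```

The backwards calculation makes the "close to, but smaller than, --driver-memory" point concrete: an Xmx of roughly 990 MiB yields exactly the 413.9 MiB the UI reported.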