Specifying mapreduce.map.java.opts without overriding memory settings?
Question
I am using a Hadoop cluster running MapR 5.2 that has problems with Unicode character encodings. I discovered that adding the following lines to mapred-site.xml solved this issue:
<property>
<name>mapreduce.map.java.opts</name>
<value>-Dfile.encoding=utf-8</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Dfile.encoding=utf-8</value>
</property>
Unfortunately, this causes many jobs (that work fine without these properties) to throw errors like this:
Container [pid=63155,containerID=container_e40_1544666751235_12271_01_000004] is running beyond physical memory limits. Current usage: 8.0 GB of 8 GB physical memory used; 31.7 GB of 16.8 GB virtual memory used. Killing container.
I've tried increasing the value of mapreduce.map.memory.mb to the maximum allowed according to this error message:
Job job_1544666751235_12267 failed with state KILLED due to: MAP capability required is more than the supported max container capability in the cluster. Killing the Job. mapResourceRequest: <memory:16000, vCores:1, disks:0.5> maxContainerCapability:<memory:8192, vCores:20, disks:4.0>
But containers are still killed. Like I said, these jobs worked fine before setting the mapreduce.*.java.opts properties, so I assume they are overriding something. Is there a way to set -Dfile.encoding without overriding other Java parameters?
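A quick way to confirm what is being overridden is to print the cluster's effective defaults before adding any override. (As an aside, the 16.8 GB virtual limit in the first error is just the 8 GB physical limit multiplied by YARN's default yarn.nodemanager.vmem-pmem-ratio of 2.1; it is the physical limit that is being hit.) A minimal diagnostic sketch, assuming the Hadoop client libraries and the cluster's mapred-site.xml are on the classpath; the class name PrintMapredOpts is made up for illustration:

import org.apache.hadoop.mapred.JobConf;

public class PrintMapredOpts {
    public static void main(String[] args) {
        // JobConf pulls in mapred-default.xml and mapred-site.xml as
        // default resources, so these are the effective defaults a job
        // would inherit before any per-job overrides are applied
        JobConf conf = new JobConf();
        System.out.println("mapreduce.map.java.opts    = " + conf.get("mapreduce.map.java.opts"));
        System.out.println("mapreduce.reduce.java.opts = " + conf.get("mapreduce.reduce.java.opts"));
        System.out.println("mapreduce.map.memory.mb    = " + conf.get("mapreduce.map.memory.mb"));
    }
}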
Answer 1
Score: 1
Was there a value set earlier for mapreduce.*.java.opts? Java memory settings such as -Xmx usually go there, so keeping only -Dfile.encoding=utf-8 may have removed those settings, and that would affect the other jobs. You have two options here:
- Append your encoding setting to the previously existing value. Note that with this approach the encoding setting applies to every job that uses that mapred-site.xml:
<property>
<name>mapreduce.map.java.opts</name>
<value>your_earlier_existed_java_opts_value_goes_here -Dfile.encoding=utf-8</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>your_earlier_existed_java_opts_value_goes_here -Dfile.encoding=utf-8</value>
</property>
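For example, if the previous value was -Xmx6554m (a hypothetical figure; the map heap is commonly sized to roughly 80% of mapreduce.map.memory.mb), the merged entry would read <value>-Xmx6554m -Dfile.encoding=utf-8</value>, keeping the heap cap while adding the encoding flag.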
- Set the value only for your job at submission time, provided your code uses org.apache.hadoop.util.GenericOptionsParser. That way the encoding setting applies only to your job:
yarn jar <your_jar> <class> -Dmapreduce.map.java.opts="your_earlier_existed_java_opts_value_goes_here -Dfile.encoding=utf-8"
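For the -Dmapreduce.map.java.opts=... argument above to be picked up, the driver must pass its arguments through GenericOptionsParser; the usual way is to run it via ToolRunner. A minimal sketch of such a driver, where the class name EncodingSafeJob and the job name are illustrative and the mapper/reducer setup is elided:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class EncodingSafeJob extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // getConf() already contains any -D options that
        // GenericOptionsParser stripped from the command line
        Job job = Job.getInstance(getConf(), "encoding-safe-job");
        job.setJarByClass(EncodingSafeJob.class);
        // ... set mapper, reducer, and input/output paths here ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner feeds args through GenericOptionsParser before run()
        System.exit(ToolRunner.run(new Configuration(), new EncodingSafeJob(), args));
    }
}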