How to import referenced files (XML) in an AWS Glue script?

Question

I am trying to run a Glue job in FAIR scheduling mode. For this, I created an XML file named fairschedular.xml.

Then I uploaded fairschedular.xml to an S3 bucket and added that location to the reference path of the Glue job. The file is as follows:

<?xml version="1.0"?>
<allocations>
  <pool name="1">
    <schedulingMode>FIFO</schedulingMode>
    <weight>1</weight>
    <minShare>2</minShare>
  </pool>
  <pool name="2">
    <schedulingMode>FIFO</schedulingMode>
    <weight>1</weight>
    <minShare>2</minShare>
  </pool>
</allocations>
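
As a quick local check before uploading, an allocation file like the one above can be parsed to confirm which pools it defines. This is only a sketch using Python's standard library. The function name `list_pools` is hypothetical, and it checks the XML shape only, not how Spark or Glue actually loads the file.

```python
import xml.etree.ElementTree as ET

# Same content as the allocation file shown above
ALLOCATIONS_XML = """<?xml version="1.0"?>
<allocations>
  <pool name="1">
    <schedulingMode>FIFO</schedulingMode>
    <weight>1</weight>
    <minShare>2</minShare>
  </pool>
  <pool name="2">
    <schedulingMode>FIFO</schedulingMode>
    <weight>1</weight>
    <minShare>2</minShare>
  </pool>
</allocations>
"""


def list_pools(xml_text):
    """Return {pool name: {setting: value}} for every <pool> element."""
    root = ET.fromstring(xml_text)
    pools = {}
    for pool in root.findall("pool"):
        # Each child element of <pool> is one setting, e.g. <weight>1</weight>
        pools[pool.get("name")] = {
            child.tag: (child.text or "").strip() for child in pool
        }
    return pools


if __name__ == "__main__":
    for name, settings in list_pools(ALLOCATIONS_XML).items():
        print(name, settings)
```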

Then I used it in the script as follows:

from awsglue.context import GlueContext
from pyspark import SparkConf, SparkContext


class JobBase(object):

    # Name of the fair scheduler allocation file passed to Spark
    fair_scheduler_config_file = "fairscheduler.xml"
    rowAsDict = {}
    Oracle_Username = None
    Oracle_Password = None
    Oracle_jdbc_url = None

    def __start_spark_glue_context(self):
        # Enable FAIR scheduling and point Spark at the pool allocation file
        conf = (
            SparkConf()
            .setAppName("python_thread")
            .set("spark.scheduler.mode", "FAIR")
            .set("spark.scheduler.allocation.file", self.fair_scheduler_config_file)
        )
        self.sc = SparkContext(conf=conf)
        self.glueContext = GlueContext(self.sc)
        self.spark = self.glueContext.spark_session

But when the code runs, I don't see the fair scheduler pools in the Spark UI history server. I do see FAIR scheduling mode.

Answer 1

Score: 0

The issue is resolved. I can see in the AWS logs that the pools are being created.
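
For context, pools defined in an allocation file are normally used by setting the `spark.scheduler.pool` local property in the thread that submits the jobs. Jobs submitted without it go to the default pool. The sketch below assumes that pattern applies here, since the post does not show how the jobs were submitted. `run_in_pool` and the `work` callable are hypothetical names, `sc` stands for the SparkContext created in the script, and the pool names "1" and "2" are the ones from the XML file. How local properties behave across Python threads can depend on the Spark version, so treat this as illustrative only.

```python
import threading


def run_in_pool(sc, pool_name, work):
    """Start a thread that runs work(sc) with its jobs assigned to pool_name.

    sc is expected to be a pyspark.SparkContext. work is a hypothetical
    callable that triggers one or more Spark actions.
    """
    def target():
        # Local properties apply to jobs submitted from this thread
        sc.setLocalProperty("spark.scheduler.pool", pool_name)
        work(sc)

    thread = threading.Thread(target=target, name="pool-" + pool_name)
    thread.start()
    return thread
```

With a live context this might be called as `t1 = run_in_pool(sc, "1", lambda sc: sc.parallelize(range(100)).count())`, with a second call for pool "2", followed by `t1.join()` and `t2.join()`.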
