How to import referenced files (XML) in an AWS Glue script
Question
I am trying to run a Glue job in FAIR scheduling mode. For this I created an XML file named fairschedular.xml.
I then uploaded fairschedular.xml to an S3 bucket and added that location to the referenced files path of the Glue job. The file is as follows:
<?xml version="1.0"?>
<allocations>
    <pool name="1">
        <schedulingMode>FIFO</schedulingMode>
        <weight>1</weight>
        <minShare>2</minShare>
    </pool>
    <pool name="2">
        <schedulingMode>FIFO</schedulingMode>
        <weight>1</weight>
        <minShare>2</minShare>
    </pool>
</allocations>
Then I used it in the script as follows:
from awsglue.context import GlueContext
from pyspark import SparkConf, SparkContext

class JobBase(object):
    fair_scheduler_config_file = "fairscheduler.xml"
    rowAsDict = {}
    Oracle_Username = None
    Oracle_Password = None
    Oracle_jdbc_url = None

    def __start_spark_glue_context(self):
        # Enable FAIR scheduling and point Spark at the pool definitions.
        conf = (SparkConf().setAppName("python_thread")
                .set("spark.scheduler.mode", "FAIR")
                .set("spark.scheduler.allocation.file", self.fair_scheduler_config_file))
        self.sc = SparkContext(conf=conf)
        self.glueContext = GlueContext(self.sc)
        self.spark = self.glueContext.spark_session
But when the code runs, I don't see the fair scheduler pools in the Spark UI history server, although I do see that FAIR scheduling is enabled.
Answer 1
Score: 0
The issue is resolved. I can see in the AWS logs that the pools are being created.
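For anyone who wants to check the allocation file from inside the job rather than from the logs, here is a minimal sketch using only the Python standard library. It assumes the referenced file is available under its bare name in the script's working directory, which is how the script in the question refers to it. The helper name list_scheduler_pools is hypothetical and not part of Glue or Spark. Note that the question spells the file both as fairschedular.xml and as fairscheduler.xml, so a name mismatch is one thing this check would surface.

```python
import os
import xml.etree.ElementTree as ET


def list_scheduler_pools(path):
    """Return {pool_name: {setting: value}} parsed from a fair scheduler XML file.

    Hypothetical helper for illustration; it only reads the file and does not
    touch Spark.
    """
    if not os.path.isfile(path):
        # A missing file is reported with its absolute path to make a
        # wrong name or wrong working directory easy to spot.
        raise FileNotFoundError(f"allocation file not found: {os.path.abspath(path)}")
    root = ET.parse(path).getroot()
    if root.tag != "allocations":
        raise ValueError(f"expected <allocations> root, found <{root.tag}>")
    pools = {}
    for pool in root.findall("pool"):
        # Collect each child element (schedulingMode, weight, minShare) as text.
        pools[pool.get("name")] = {child.tag: (child.text or "").strip() for child in pool}
    return pools
```

Calling list_scheduler_pools("fairscheduler.xml") before building the SparkConf and printing the result would show whether the file is readable and which pool names it defines.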