AWS --extra-py-files throws ModuleNotFoundError: No module named 'pg8000'

Question

I am trying to use pg8000 in my Glue script. These are the parameters set on the Glue job:

--extra-py-files	s3://mybucket/pg8000libs.zip  //NOTE: my zip contains __init__.py

The relevant part of the code:

import sys
import os
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
import boto3
from pyspark.sql import Row
from datetime import datetime, date

zip_path = os.path.join('/tmp', 'pg8000libs.zip')
sys.path.insert(0, zip_path)


def dump_python_path():
    print("python path:", sys.path)

    for path in sys.path:
        if os.path.isdir(path):
            print(f"dir: {path}")
            print("\t" + str(os.listdir(path)))
        print(path)

print(os.listdir('/tmp'))
dump_python_path()
# Import the library
import pg8000

Dump in CloudWatch:

python path: ['/tmp/pg8000libs.zip', '/opt/amazon/bin', '/tmp/pg8000libs.zip', '/opt/amazon/spark/jars/spark-core_2.12-3.1.1-amzn-0.jar', '/opt/amazon/spark/python/lib/pyspark.zip', '/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip', '/opt/amazon/lib/python3.6/site-packages', '/usr/lib64/python37.zip', '/usr/lib64/python3.7', '/usr/lib64/python3.7/lib-dynload', '/home/spark/.local/lib/python3.7/site-packages', '/usr/lib64/python3.7/site-packages', '/usr/lib/python3.7/site-packages']
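
Since the dump shows /tmp/pg8000libs.zip on sys.path, one thing that may be worth ruling out is the archive layout. For `import pg8000` to resolve from a zip on sys.path, Python needs `pg8000/__init__.py` (or `pg8000.py`) at the root of the archive; a zip whose root holds `__init__.py` directly would not satisfy that. The following is a minimal sketch of such a check, not a confirmed diagnosis of this job. The function names and the demo archives are hypothetical and exist only for illustration.

```python
import os
import tempfile
import zipfile


def zip_has_top_level_package(zip_path, package):
    """Return True if the archive root holds <package>/__init__.py or <package>.py."""
    with zipfile.ZipFile(zip_path) as zf:
        names = set(zf.namelist())
    return f"{package}/__init__.py" in names or f"{package}.py" in names


def make_demo_zip(path, entries):
    """Hypothetical helper: write empty files into a zip to mimic a layout."""
    with zipfile.ZipFile(path, "w") as zf:
        for entry in entries:
            zf.writestr(entry, "")


# Two illustrative layouts (assumptions, not the real pg8000libs.zip):
demo_dir = tempfile.mkdtemp()
good_zip = os.path.join(demo_dir, "good.zip")  # package directory at the root
flat_zip = os.path.join(demo_dir, "flat.zip")  # package files dumped at the root
make_demo_zip(good_zip, ["pg8000/__init__.py", "pg8000/core.py"])
make_demo_zip(flat_zip, ["__init__.py", "core.py"])

print(zip_has_top_level_package(good_zip, "pg8000"))  # True
print(zip_has_top_level_package(flat_zip, "pg8000"))  # False
```

Running the same function against a local copy of the real archive would show whether its layout matches the first or the second case.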

Answer 1

Score: 1

After exhausting all the standard approaches, I found a workaround using sys.path. Adding the script's own directory to the Python import search path let the Glue job locate and import the additional .py file. Here is the code I used:

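import sys
import os

# Directory that contains the running script
current_dir = os.path.dirname(os.path.abspath(__file__))
# Let Python search that directory when resolving imports
sys.path.append(current_dir)

from utils import *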

Important note:

Modify the import search path with care, since it can introduce module name conflicts or unintended imports. Organizing the files properly is the more robust and maintainable fix.
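
One way to watch for the conflicts mentioned above is to ask Python where a module name would resolve from before importing it. This is a rough sketch using the standard library's `importlib.util.find_spec`; the function name `module_origin` and the module name `glue_shadow_demo_utils` are made up for the demonstration.

```python
import importlib
import importlib.util
import os
import sys
import tempfile


def module_origin(name):
    """Return the file a top-level `import name` would load, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Demonstration with a throwaway module in a temporary directory.
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "glue_shadow_demo_utils.py")
with open(demo_file, "w") as f:
    f.write("VALUE = 1\n")

before = module_origin("glue_shadow_demo_utils")  # not importable yet

sys.path.insert(0, demo_dir)   # same kind of change as in the workaround
importlib.invalidate_caches()  # make sure the new file is noticed
after = module_origin("glue_shadow_demo_utils")

print(before)
print(after)
```

If the reported origin is not the file you expected, another entry earlier on sys.path is providing a module with the same name.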
