How do you add Python packages in AWS Glue 3.0 Jupyter Notebook jobs?
Question
I am trying to migrate a project to AWS Glue, and to do so I need to install a few new packages. Given the project's structure and the need to see the outputs, I want to use a Jupyter Notebook job rather than a Python Shell job. I need to install openpyxl on the job. Apparently this is not done by running !pip install openpyxl and then import openpyxl, which fails to find the module, but by adding a key-value pair as a new parameter under the advanced properties section of the Job details. When I add "--additional-python-modules":"openpyxl==3.1.2" in the Python Shell version, it lets me do so, but when I try the same thing under the Jupyter Notebook job, there is no option to add a new parameter.
How do I add new parameters to a Jupyter Notebook job in Glue? Is there something I am missing here?
Answer 1
Score: 0
It turns out I was going about this the wrong way. First, I needed to run the magic below in a notebook cell:
%additional_python_modules openpyxl==3.1.2
Then I needed to stop the session:
%stop_session
After that, running any command, such as the one below, starts a new session:
print('Start session')
Then, when I import the Python package in the new session, it works:
import openpyxl
print("Installed Package")
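To check whether the restarted session actually picked up the package, one option is a small helper like the sketch below. It is not part of the original answer: the name module_status is hypothetical, it uses only the standard library, and it assumes the module exposes a __version__ attribute (the helper reports None for the version when it does not).

```python
import importlib
import importlib.util


def module_status(module_name):
    """Return (available, version) for a top-level module without raising if it is missing.

    module_status is a hypothetical helper for illustration only.
    """
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(module_name) is None:
        return False, None
    module = importlib.import_module(module_name)
    # Not every module defines __version__, so fall back to None.
    return True, getattr(module, "__version__", None)


# In the notebook, run this in a cell after the session has restarted.
print(module_status("openpyxl"))
```

If the first element is False, the module was probably not installed in the current session, and it may be worth re-running the %additional_python_modules magic followed by %stop_session.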