英文:
automatic reloading of jupyter notebook after crash
问题
以下是翻译的内容:
有没有一种方法可以在 Jupyter Notebook 崩溃时自动重新加载它?
我正在运行一个笔记本,用于训练深度学习模型(笔记本可以在内核重新启动后重新加载模型的最后状态,包括优化器和调度器的状态),因此在崩溃后重新加载笔记本可以使我恢复到最后一个状态,而不会丢失大量的计算。
我想知道是否有一种简单的方法可以使用 Jupyter Notebook API 来实现这一点,或者使用 Jupyter Notebook 的信号(也许在日志上)。
另外,我正在 Google Cloud 平台上运行这个笔记本(在计算引擎上),如果您知道使用 GCP 的故障排除服务和日志代理来实现这一点的有效方法,对我和其他遇到同样问题的人可能会很有兴趣。
再次感谢您的时间。
我尝试在 Stack Overflow 上寻找解决方案,但我没有找到类似的问题。
英文:
is there a way to reload automatically a jupyter notebook, each time it crashes ?
I am actually running a notebook, that trains a Deep learning model (the notebook can reload the last state of model, with state of optimizer and scheduler, after each restart of the kernel ), so that reloading the notebook after a crash enables to get back the last state without a substantial loss of computations.
I was wondering if there was a simple way to do that using the jupyter notebook API, or a signal from the jupyter notebook for example (maybe on logs).
Also, I am running the notebook on google cloud platform (on compute engine), if you know any efficient way to do it, using the GCP troubleshooting services, and the logging agent, it might be interested for me and for others with the same issue.
Thank you again for you time.
I tried to look up for a solution on stack overflow, but I didn't find any similar question.
答案1
得分: 0
从您的评论中:
"重新加载笔记本在崩溃后能够在不严重损失计算的情况下恢复到上次状态。"
什么叫做崩溃?它是否生成可以从/var/log
或其他位置(例如journalctl -u jupyter.service
)解析的日志?如果是这样,您可以手动创建一个shell脚本。
对于用户管理的笔记本,您可以使用"post-startup-script"或"startup-script"的概念。
"post-startup-script" 是一个Bash脚本的路径,在笔记本实例完全启动后会自动运行。路径必须是URL或云存储路径。例如:"gs://path-to-file/file-name"。
这个脚本可以是一个监视您提到的崩溃的循环。
英文:
From your comment:
"reloading the notebook after a crash enables to get back the last state without a substantial loss of computations."
What do you call a crash?, does it generate logs that can be parsed from /var/log
or other location (e.g journalctl -u jupyter.service
) ? If so you can manually create a shell script.
With User Managed Notebooks you have the concept of post-startup-script
or startup-script
post-startup-script
, is path to a Bash script that automatically runs after a notebook instance fully boots up. The path must be a URL or Cloud Storage path. Example: "gs://path-to-file/file-name"
This script can be a loop that monitors the crash you mention
答案2
得分: 0
我认为在执行代码之前,你需要添加以下代码:
import os
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论