Where are files saved when using Google Colab with a Custom GCE VM
Question
I am using Google Colab in combination with a Custom GCE VM based on the instructions here. I now need a way to retrieve files from the VM without using the Colab interface due to a bug described in this issue and this issue. I've reviewed the answers from this similar question about file storage on hosted instances, but I don't think it helps me in this case.
I've attempted to SSH into the machine to find the files, but I can't locate the /content directory that I expected to see under the root directory. After digging through the file system, I found that the /mnt/stateful_partition/var/lib/docker directory is using about the amount of disk space I would expect for my data, and it contains a promising-looking file object called colab-vmdisk. I'm not sure how to proceed, but given the file path I expect there's a Docker-based solution that I don't know about.
Answer 1
Score: 1
@hidude562's answer is on point; it worked for me. I had been trying to find a method with good download speeds (as fast as when I was using Google Drive with Colab on the hosted runtime).
Colab seems to manage the entire thing within a Docker container, as you rightly mentioned, @s_go. That also explains how they keep the popular libraries up to date from the start, including the gdown library. I found it best to use gdown to download large files from Google Drive into Colab, since Google doesn't let you mount your personal Google Drive in Colab when using a custom GCE VM runtime, due to some authorisation blockers. This method downloads files into Colab at the full speed Google is capable of (I've seen up to ~500 Mbps).
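As a rough sketch of the gdown step: this assumes the gdown command-line tool is available inside the Colab container and that the Drive file is shared so that anyone with the link can view it. The helper name fetch_from_drive and the FILE_ID argument are hypothetical, not something from the answer above.

```shell
# Sketch only: download one Google Drive file with the gdown CLI.
# fetch_from_drive is a hypothetical helper name; FILE_ID is the ID
# taken from the file's share link (an assumption for illustration).
fetch_from_drive() {
    file_id="$1"
    output="$2"
    if [ -z "$file_id" ] || [ -z "$output" ]; then
        echo "usage: fetch_from_drive FILE_ID OUTPUT_PATH" >&2
        return 2
    fi
    # -O sets the output path for the downloaded file.
    gdown "https://drive.google.com/uc?id=${file_id}" -O "$output"
}
```

In a notebook cell the same gdown command could be run directly with a leading !, with the real file ID substituted.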
In addition, after extracting the file from the Docker container, I used FileZilla over SFTP to download it to my local machine. It was as fast as expected: direct download over SSH was only around ~100 kbps for some reason, whereas with FileZilla on the same VM I got download speeds of up to ~13 Mbps (my Wi-Fi download bandwidth is about ~25 Mbps).
I hope this comment validates @hidude562's answer for other readers!
Thank you for your question, @s_go, and for your answer, @hidude562! :)
Answer 2
Score: 0
When used with a GCE VM, Google Colab runs in its own Docker container, as you found. To access the files in the Colab session, run docker ps and copy the container ID from the bottom row. To copy files over, run docker cp (your container id):/path/to/google/colab/folder/ /path/to/gce/
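The two steps can be sketched as a pair of shell functions to run on the GCE VM over SSH. This is a minimal sketch under assumptions: the function names are hypothetical, the Colab container is taken to be the one on the last row of docker ps as described above, and docker may need to be run with sudo depending on how the VM is set up.

```shell
# Sketch only: copy a folder out of the Colab container onto the GCE VM.
# colab_container_id and copy_from_colab are hypothetical helper names.

# Print the ID of the container listed on the bottom row of `docker ps`.
# -q prints container IDs only, one per line.
colab_container_id() {
    docker ps -q | tail -n 1
}

# Usage: copy_from_colab PATH_IN_CONTAINER DEST_DIR_ON_VM
copy_from_colab() {
    src="$1"
    dest="$2"
    cid="$(colab_container_id)"
    if [ -z "$cid" ]; then
        echo "no running container found" >&2
        return 1
    fi
    # Create the destination first, then copy from container to host.
    mkdir -p "$dest" && docker cp "$cid:$src" "$dest"
}
```

For example, copy_from_colab /content "$HOME/colab_files" would copy the notebook's /content folder, assuming that is where the session's files live, into a folder in the home directory on the VM.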