
Nextflow script with both 'local' and 'awsbatch' executor

Question


I have a Nextflow pipeline executed in AWS Batch. Recently, I tried to add a process that uploads files from my local machine to an S3 bucket, so I don't have to upload them manually before each run. I wrote a Python script that handles the upload and wrapped it in a Nextflow process. Since I am uploading from the local machine, I want the upload process to run with executor 'local'.

This requires the Fusion file system to be enabled in order to have the work directory in S3. But when I enable the Fusion file system, I lose access to my local filesystem. As I understand it, when the Fusion file system is enabled, the task runs in a Wave container without access to the host filesystem. Does anyone have experience running Nextflow with Fusion enabled, and how can I access the host filesystem? Thanks!
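For reference, a minimal sketch of the kind of upload process I mean (the script name, bucket and channel wiring are simplified placeholders, not my actual code):

    // Hypothetical upload process: runs the Python uploader on the local machine
    process upload_to_s3 {

        executor 'local'   // run on the launching machine, not on AWS Batch

        input:
        path infile        // local file to upload

        output:
        val true           // simple completion flag for downstream processes

        script:
        """
        python upload_to_s3.py --bucket my-bucket --file ${infile}
        """
    }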

Answer 1

Score: 1

I don't think you need to manage a hybrid workload here. Pipeline inputs can be stored either locally or in an S3 bucket. If your files are stored locally and you specify a working directory in S3, Nextflow will already try to upload them into the staging area for you. For example, if you specify your working directory in S3 using -work-dir 's3://mybucket/work', Nextflow will try to stage the input files under s3://mybucket/work/stage-<session-uuid>. Once the files are in the staging area, Nextflow can then begin to submit jobs that require them.
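For example, a launch command along these lines (the script name and bucket are placeholders) is enough for Nextflow to stage local input files into the S3 working directory before submitting jobs:

    nextflow run main.nf -work-dir 's3://mybucket/work'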

Note that a Fusion file system is not strictly required to have your working directory in S3. Nextflow includes support for S3. Either include your AWS access and secret keys in your pipeline configuration or use an IAM role to allow your EC2 instances full access to S3 storage.
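As a rough sketch of that configuration (all values below are placeholders; with an IAM role on your EC2 instances you would simply omit the accessKey/secretKey settings):

    // nextflow.config
    workDir = 's3://mybucket/work'          // working directory in S3

    aws {
        accessKey = '<YOUR_ACCESS_KEY>'     // or rely on an instance IAM role instead
        secretKey = '<YOUR_SECRET_KEY>'
        region    = 'eu-west-1'             // placeholder region
    }

    process {
        executor = 'awsbatch'               // main workload still runs on AWS Batch
        queue    = 'my-batch-queue'         // placeholder Batch queue name
    }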

