2023-02-16 05:48:56

Path not being detected by Nextflow

Question

I'm new to nf-core/Nextflow, and the documentation does not seem to reflect what is actually implemented. I have defined the basic pipeline below:

```groovy
nextflow.enable.dsl=2

process RUNBLAST {
    input:
    val thr
    path query
    path db
    path output

    output:
    path output

    script:
    """
    blastn -query ${query} -db ${db} -out ${output} -num_threads ${thr}
    """
}

workflow {

    //println "I want to BLAST $params.query to $params.dbDir/$params.dbName using $params.threads CPUs and output it to $params.outdir"

    RUNBLAST(params.threads, params.query, params.dbDir, params.output)

}
```

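Then I run the pipeline with:

```
nextflow run main.nf --query test2.fa --dbDir blast/blastDB
```

and I get the following error:

```
N E X T F L O W  ~  version 22.10.6
Launching `main.nf` [dreamy_hugle] DSL2 - revision: c388cf8f31
Error executing process > 'RUNBLAST'
Error executing process > 'RUNBLAST'

Caused by:
  Not a valid path value: 'test2.fa'

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
```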

I know `test2.fa` exists in the current directory:

```
(nfcore) MN:nf-core-basicblast jraygozagaray$ ls
CHANGELOG.md        conf            other.nf
CITATIONS.md        docs            pyproject.toml
CODE_OF_CONDUCT.md  lib             subworkflows
LICENSE             main.nf         test.fa
README.md           modules         test2.fa
assets              modules.json    work
bin                 nextflow.config workflows
blast               nextflow_schema.json
```

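I also tried `file` instead of `path`, but that is deprecated and raises other kinds of errors.

Knowing how to fix this would help me get started with building pipelines.

Shouldn't Nextflow copy the file to the execution path?

Thanks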




# Answer 1
**Score**: 1

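You get this error because `params.query` is not actually a `path` value; it is most likely a plain String or GString. The fix is to pass a `file` object instead, for example:

```groovy
workflow {

    // file() converts the string parameter into a file object
    query = file(params.query)

    BLAST( query, ... )
}
```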
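
Applied to the pipeline from the question, the same fix might look like the sketch below. It is not a definitive implementation: `params.dbName`, the default values, and the output file name are assumptions made for illustration, and `--dbDir` is assumed to point at a directory holding the BLAST index files. The output name is passed as a `val` because it names a file the task will create, not an existing file to stage.

```groovy
nextflow.enable.dsl = 2

// Illustrative defaults (assumptions); override with --query, --dbDir, etc.
params.threads = 1
params.query   = 'test2.fa'
params.dbDir   = 'blast/blastDB'   // assumed: directory of BLAST index files
params.dbName  = 'mydb'            // hypothetical database prefix inside dbDir
params.output  = 'blast_hits.txt'  // hypothetical output file name

process RUNBLAST {
    input:
    val thr
    path query
    path db
    val db_name
    val out_name   // a name to create, not an existing file, so `val` not `path`

    output:
    path "${out_name}"

    script:
    """
    blastn -query ${query} -db ${db}/${db_name} -out ${out_name} -num_threads ${thr}
    """
}

workflow {
    // file() turns the plain strings into file objects that `path` inputs accept;
    // checkIfExists: true stops the run early if a path is missing.
    query = file(params.query, checkIfExists: true)
    db    = file(params.dbDir, checkIfExists: true)

    RUNBLAST(params.threads, query, db, params.dbName, params.output)
}
```

With this layout the original command still applies, with the database prefix added: `nextflow run main.nf --query test2.fa --dbDir blast/blastDB --dbName <prefix>`.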

Note that a value channel is implicitly created when a process is invoked with a simple value, such as the `file` object above. If you need to BLAST multiple query files, you will need a queue channel instead, which can be created with the `fromPath` factory method, for example:

```groovy
params.query = "${baseDir}/data/*.fa"
params.db = "${baseDir}/blastdb/nt"
params.outdir = './results'

// Split the database path into its name prefix and its parent directory
db_name = file(params.db).name
db_path = file(params.db).parent

process BLAST {

    publishDir(
        path: "${params.outdir}/blast",
        mode: 'copy',
    )

    input:
    tuple val(query_id), path(query)
    path db

    output:
    tuple val(query_id), path("${query_id}.out")

    """
    blastn \\
        -num_threads ${task.cpus} \\
        -query "${query}" \\
        -db "${db}/${db_name}" \\
        -out "${query_id}.out"
    """
}

workflow {

    // One (id, file) tuple per query file matched by the glob
    Channel
        .fromPath( params.query )
        .map { file -> tuple(file.baseName, file) }
        .set { query_ch }

    BLAST( query_ch, db_path )
}
```
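
To check what a queue channel like this hands to the process, it can help to print its contents before wiring it in. The sketch below is only an illustration: the glob is a hypothetical location, and `projectDir` is used as the current name for `baseDir`. Each emitted tuple corresponds to one task of the process, while a value such as a single `file` object is reused for every task.

```groovy
nextflow.enable.dsl = 2

// Hypothetical glob; point it at your own FASTA files.
params.query = "${projectDir}/data/*.fa"

workflow {
    // Queue channel: one (id, file) tuple is emitted per matching file,
    // so a process reading it runs once per query file.
    Channel
        .fromPath( params.query )
        .map { fasta -> tuple(fasta.baseName, fasta) }
        .view { item -> "query_id=${item[0]} file=${item[1].name}" }
}
```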

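Note that the usual way to set the number of threads/CPUs is the `cpus` directive, which can be configured with a process selector in your `nextflow.config`, for example:

```groovy
process {

    withName: BLAST {
        cpus = 4
    }
}
```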
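
On the script side, the process reads the configured value through `task.cpus`. The following is a minimal sketch with a hypothetical process name, `SHOW_CPUS`, showing a default set by the `cpus` directive in the script. A matching `withName` selector in `nextflow.config` takes precedence over the directive written in the process.

```groovy
nextflow.enable.dsl = 2

process SHOW_CPUS {
    cpus 2   // default, used when nextflow.config sets nothing for this process

    output:
    stdout

    script:
    """
    echo "running with ${task.cpus} threads"
    """
}

workflow {
    SHOW_CPUS()
    SHOW_CPUS.out.view()
}
```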

