Path not being detected by Nextflow


Question

I'm new to nf-core/Nextflow, and the documentation does not seem to reflect what is actually implemented. I've defined the basic pipeline below:

```groovy
nextflow.enable.dsl=2

process RUNBLAST{
    input:
    val thr
    path query
    path db
    path output

    output:
    path output

    script:
    """
    blastn -query ${query} -db ${db} -out ${output} -num_threads ${thr}
    """
}

workflow{

    //println "I want to BLAST $params.query to $params.dbDir/$params.dbName using $params.threads CPUs and output it to $params.outdir"

    RUNBLAST(params.threads,params.query,params.dbDir, params.output)

}
```

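Then I run the pipeline with:

```
nextflow run main.nf --query test2.fa --dbDir blast/blastDB
```

and get the following error:

```
N E X T F L O W  ~  version 22.10.6
Launching `main.nf` [dreamy_hugle] DSL2 - revision: c388cf8f31
Error executing process > 'RUNBLAST'
Error executing process > 'RUNBLAST'

Caused by:
  Not a valid path value: 'test2.fa'

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
```

I know test2.fa exists in the current directory: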

```
(nfcore) MN:nf-core-basicblast jraygozagaray$ ls
CHANGELOG.md        conf            other.nf
CITATIONS.md        docs            pyproject.toml
CODE_OF_CONDUCT.md  lib             subworkflows
LICENSE             main.nf         test.fa
README.md           modules         test2.fa
assets              modules.json    work
bin                 nextflow.config workflows
blast               nextflow_schema.json
```

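I also tried `file` instead of `path`, but that is deprecated and raises other kinds of errors.

Knowing how to fix this would help me get started with building pipelines.

Shouldn't Nextflow copy the file to the execution path?

Thanks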



# Answer 1
**Score**: 1

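You get the above error because `params.query` is not actually a `path` value. It's probably just a plain String or GString. The fix is to supply a `file` object instead, for example: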

```groovy
workflow {

    query = file(params.query)

    BLAST( query, ... )
}
```
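
As a minimal sketch of the conversion above: `file()` turns the string parameter into a path object, which is what a `path` input expects. The `checkIfExists` option and the `println` are additions for illustration and are not part of the answer. A relative string such as `test2.fa` should resolve against the directory the pipeline is launched from.

```groovy
workflow {
    // Convert the string parameter into a path object.
    // checkIfExists: true (an addition here) makes the run fail early if the file is missing.
    query = file(params.query, checkIfExists: true)

    // Illustration only: print the resolved path before handing it to a process.
    println "Query file resolved to: ${query}"
}
```

Checking for existence at this point tends to give a clearer message than a failure inside the task itself.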

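Note that a value channel is implicitly created by a process when it is invoked with a simple value, like the above `file` object. If you need to BLAST multiple query files, you'll need a queue channel instead, which can be created with the `fromPath` factory method, for example: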

```groovy
params.query = "${baseDir}/data/*.fa"
params.db = "${baseDir}/blastdb/nt"
params.outdir = './results'

// Split the database location into its directory and its base name
db_name = file(params.db).name
db_path = file(params.db).parent

process BLAST {

    publishDir(
        path: "${params.outdir}/blast",
        mode: 'copy',
    )

    input:
    tuple val(query_id), path(query)
    path db

    output:
    tuple val(query_id), path("${query_id}.out")

    """
    blastn \\
        -num_threads ${task.cpus} \\
        -query "${query}" \\
        -db "${db}/${db_name}" \\
        -out "${query_id}.out"
    """
}

workflow {

    // Emit one (id, file) tuple per FASTA file matching the glob
    Channel
        .fromPath( params.query )
        .map { file -> tuple(file.baseName, file) }
        .set { query_ch }

    BLAST( query_ch, db_path )
}
```
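
To check what the queue channel will feed into the process before running BLAST at all, the tuples can be printed with the `view` operator. This is a small sketch rather than part of the answer: the closure variable name `fasta` and the `checkIfExists` option are choices made here for illustration.

```groovy
workflow {
    Channel
        // checkIfExists: true (an addition here) reports an error if nothing matches the pattern
        .fromPath( params.query, checkIfExists: true )
        // Same mapping as in the workflow above: (file name without extension, file)
        .map { fasta -> tuple(fasta.baseName, fasta) }
        // Print one line per tuple that would be sent to the process
        .view { id, fasta -> "id=${id} file=${fasta.name}" }
}
```

Each printed line corresponds to one task that the process would run.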

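Note that the usual way to specify the number of threads/CPUs is with the `cpus` directive, which can be configured using a process selector in your `nextflow.config`, for example: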

```groovy
process {

    withName: BLAST {
        cpus = 4
    }
}
```
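
If the thread count should stay adjustable from the command line, as with `params.threads` in the question, one option is to feed that parameter into the `cpus` directive. This is a sketch under assumptions: the default of 4 is not taken from the question.

```groovy
// nextflow.config
params.threads = 4   // assumed default; override with --threads on the command line

process {
    withName: BLAST {
        // The process script reads this value as ${task.cpus}
        cpus = params.threads
    }
}
```

Running `nextflow run main.nf --threads 8` would then be expected to request 8 CPUs for each BLAST task.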

huangapple
  • Posted on 2023-02-16 05:48:56
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75465741.html