2023-08-05 08:25:38

How to specify optional inputs for nextflow processes?

Question

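I'm new to Nextflow and have been trying to create a small pipeline for some Python scripts I have. However, I have run into an issue with optional inputs to processes that I can't find a workaround for. I'm also curious what the best practices are for optional inputs and parameters.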

```groovy
#!/usr/bin/env nextflow

params.out = ""
params.kml_1 = null
params.kml_2 = null
params.loc = ""
params.new_data_1 = false
params.new_data_2 = false

process getPolygons {
    input:
    tuple val(db_table), path(path_to_kml), val(new_data)
    val loc
    path path_to_outdir

    def new_data_arg = new_data ? "--new_data" : ""
    def kml_arg = (path_to_kml != null) ? "--kml $path_to_kml" : ""

    script:
    """
    python3 ${baseDir}/bin/polygon_data.py --loc $loc --db_table $db_table $kml_arg $new_data_arg --outdir $path_to_outdir
    """
}

workflow {
    outdir_ch = Channel.fromPath(params.out)
    location_ch = Channel.of(params.loc)

    tables = [
        tuple("Table1", params.kml_1 ? params.new_data_1 : null, params.new_data_1),
        tuple("Table2", params.kml_2 ? params.new_data_2: null, params.new_data_2)
    ]
    tables_ch = Channel.from(tables)

    getPolygons(tables_ch, location_ch, outdir_ch)
}
```

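The code worked before I added the optional inputs. At that point `tables` was a plain list of table names rather than a list of tuples carrying the optional `getPolygons` inputs `path_to_kml` and `new_data`:

`tables = ["Table1", "Table2"]`

I keep running into the error

`ERROR ~ No such variable: new_data`

or

`ERROR ~ No such variable: path_to_kml`

depending on the order in which the variables `new_data_arg` and `kml_arg` are created.

Trying the tuple method is the latest thing I have done to address the problem the program has with the optional parameters `new_data` and `path_to_kml`. I previously had them as separate inputs to `getPolygons`. Could the issue be with creating the variables `new_data_arg` and `kml_arg` and using them in the script, instead of using `new_data` and `path_to_kml` directly? If so, I'm not sure what the workaround is, because I need some logic applied to `new_data` and `path_to_kml` before passing this information to `polygon_data.py`.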




# Answer 1
**Score**: 0

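I found a solution that uses tuples. First, the `ERROR ~ No such variable` errors occurred because the variables `new_data_arg` and `kml_arg` were declared outside the `script:` block of the process (rookie mistake). Input variables such as `new_data` and `path_to_kml` are only bound inside `script:`, so the `def` statements have to come after it.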

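Next, I realized that the process would not iterate over the tuples, so I used the `each` input qualifier and passed every tuple in as a single variable, `tuple_info`. I also used `""` instead of `null` for `path_to_kml`, since it is a path and `null` could cause problems. This is the final working version of the process: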

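```groovy
process getPolygons {
    input:
    each tuple_info      // one (db_table, path_to_kml, new_data) tuple per task
    val loc
    path path_to_outdir

    script:
    // Input variables are only bound inside script:, so derive the arguments here.
    def (db_table, path_to_kml, new_data) = tuple_info
    def new_data_arg = new_data ? "--new_data" : ""
    // Groovy truthiness treats both "" and null as "no KML supplied".
    def kml_arg = path_to_kml ? "--kml $path_to_kml" : ""

    """
    python3 ${baseDir}/bin/polygon_data.py --loc $loc --db_table $db_table $kml_arg $new_data_arg --outdir $path_to_outdir
    """
}
```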

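I also realized that I could have simplified the `tables` list: there is no reason to build extra logic around `params.kml_1` and `params.kml_2` when the parameter defaults already handle this.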

```groovy
tables = [
    tuple("Table1", params.kml_1, params.new_data_1),
    tuple("Table2", params.kml_2, params.new_data_2)
]
```
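The empty-string approach only works if the Python script treats `--kml` and `--new_data` as optional. The source of `polygon_data.py` is not shown, so the following is only a minimal sketch of what its command-line interface might look like, assuming it uses `argparse`. The flag names come from the command above; `build_parser`, the defaults, and the sample values are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI for polygon_data.py, inferred only from the flags used above."""
    parser = argparse.ArgumentParser(prog="polygon_data.py")
    parser.add_argument("--loc", required=True)
    parser.add_argument("--db_table", required=True)
    parser.add_argument("--outdir", required=True)
    # Optional: when kml_arg expands to "", the flag is absent and kml stays None.
    parser.add_argument("--kml", default=None)
    # Boolean switch: True only when new_data_arg expands to "--new_data".
    parser.add_argument("--new_data", action="store_true")
    return parser


# Command line as rendered when both optional arguments are empty strings.
minimal = build_parser().parse_args(
    ["--loc", "site_a", "--db_table", "Table1", "--outdir", "results"]
)

# Command line as rendered when both optional arguments are supplied.
full = build_parser().parse_args(
    ["--loc", "site_a", "--db_table", "Table2", "--kml", "area.kml",
     "--new_data", "--outdir", "results"]
)
```

When `$kml_arg` or `$new_data_arg` is empty, the rendered command contains extra spaces where the flag would have been. The shell collapses those, so the script never sees an empty argument.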

