Snakemake: choose which rule to run

Question

I am trying to edit a Snakemake pipeline that has a lot of rules, like:

rule A:
rule B:
rule C:
    ...

The input of rule C is the output of rule B. Here is a simple example:

rule B:
    output: 
       'path/to/output_B.txt'
    shell: 
       """
           echo "This is rule B output" > {output}
       """

rule C:
    input: 
       'path/to/output_B.txt'
    output: 
       'path/to/output_C.txt'
    shell: 
       """
           cat {input} > {output}
       """

What I expect:

Now I want to add a rule D and a parameter in the config file called use_rule_D. When use_rule_D is set to true, I want Snakemake to call rule D instead of rule B to generate the input of rule C.

My solution:

I added rule D with output path/to/output_D.txt and used a function to choose the input of rule C. It works:

rule B:
    output: 
       'path/to/output_B.txt'
    shell: 
       """
           echo "This is rule B output" > {output}
       """

def get_input(config): 
    if config['use_rule_D']: 
        return 'path/to/output_D.txt'
    return 'path/to/output_B.txt'

rule C:
    input: 
       input_data=get_input(config)
    output: 
       'path/to/output_C.txt'
    shell: 
       """
           cat {input.input_data} > {output}
       """

rule D:
    output: 
       'path/to/output_D.txt'
    shell: 
       """
           echo "This is rule D output" > {output}
       """

But many rules in my real Snakemake script need the output of rule B, which means I would have to modify every relevant input. I want a simple way to make Snakemake generate path/to/output_B.txt using different rules depending on a parameter.

Answer 1

Score: 2

> Now I want to add a rule D and a parameter in the config file called use_rule_D. When use_rule_D is set to true, I want Snakemake to call rule D instead of rule B to generate the input of rule C.

One way to achieve this is to add an explicit conditional in the Snakefile:

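if config.get("use_rule_D") is True:
    # use rule D: same output path, rule D's command (copied from the question's example)
    rule B:
        output:
           'path/to/output_complex.txt'
        shell:
           """
               echo "This is rule D output" > {output}
           """
else:
    # use rule B: same output path, rule B's command (copied from the question's example)
    rule B:
        output:
           'path/to/output_complex.txt'
        shell:
           """
               echo "This is rule B output" > {output}
           """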

Note that in both cases the output is the same (output_complex.txt), but the execution (shell/run) is expected to be different. Having the same name for the output allows you to avoid defining conditionals for downstream rules.
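
The branching can also be checked outside Snakemake. Below is a minimal plain-Python sketch, not part of the original answer: `producer_command`, `OUTPUT`, and the stand-in `config` dict are hypothetical names used only for illustration, and the echo messages are borrowed from the question's example. It shows that the output path stays fixed while only the command changes, and that `is True` is a strict check.

```python
# Hypothetical stand-in for the `config` dict that Snakemake provides in a Snakefile.
config = {"use_rule_D": True}

# One shared output path, as in the conditional above.
OUTPUT = "path/to/output_complex.txt"


def producer_command(config, output=OUTPUT):
    """Return the shell command that should create `output` (illustrative helper)."""
    # `is True` matches only the boolean True; a missing key, the string
    # "true", or the integer 1 all fall through to the rule B branch.
    if config.get("use_rule_D") is True:
        message = "This is rule D output"
    else:
        message = "This is rule B output"
    return f'echo "{message}" > {output}'


if __name__ == "__main__":
    print(producer_command(config))
    print(producer_command({}))
```

Because the check is strict, a config value that arrives as a string rather than a boolean would select the rule B branch. If that is not what you want, a plain truthiness test (`if config.get("use_rule_D"):`) is the looser alternative.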

huangapple
  • Published on July 24, 2023, 18:23:58
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76753543.html