Snakemake: choosing which rule to run

Question

I am trying to edit a Snakemake pipeline that has many rules, like:

    rule A:
    rule B:
    rule C:
    ...

The input of rule C is the output of rule B. Here is a simple example:

    rule B:
        output:
            'path/to/output_B.txt'
        shell:
            """
            echo "This is rule B output" > {output}
            """

    rule C:
        input:
            'path/to/output_B.txt'
        output:
            'path/to/output_C.txt'
        shell:
            """
            cat {input} > {output}
            """

What I expect:

Now I want to add a rule D and a config file parameter called use_rule_D. When use_rule_D is set to true, I want Snakemake to call rule D instead of rule B to generate the input of rule C.

My solution:

I added rule D with output path/to/output_D.txt and used a function to choose the input of rule C. It works:

    rule B:
        output:
            'path/to/output_B.txt'
        shell:
            """
            echo "This is rule B output" > {output}
            """

    def get_input(config):
        # choose the upstream file according to the config flag
        if config['use_rule_D']:
            return 'path/to/output_D.txt'
        return 'path/to/output_B.txt'

    rule C:
        input:
            input_data=get_input(config)
        output:
            'path/to/output_C.txt'
        shell:
            """
            cat {input.input_data} > {output}
            """

    rule D:
        output:
            'path/to/output_D.txt'
        shell:
            """
            echo "This is rule D output" > {output}
            """

However, many rules in my real Snakemake script need the output of rule B, which means I would have to modify every relevant input. I want a simple way to make Snakemake generate path/to/output_B.txt using different rules depending on the parameters.

Answer 1

Score: 2

> I want to add a rule D and a config file parameter called use_rule_D. When use_rule_D is set to true, I want Snakemake to call rule D instead of rule B to generate the input of rule C.

One way to achieve this is to add an explicit conditional in the Snakefile:

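    if config.get("use_rule_D") is True:
        # use rule D: same output name, rule D's command from the question
        rule B:
            output:
                'path/to/output_complex.txt'
            shell:
                'echo "This is rule D output" > {output}'
    else:
        # use rule B: same output name, rule B's command from the question
        rule B:
            output:
                'path/to/output_complex.txt'
            shell:
                'echo "This is rule B output" > {output}'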

Note that in both cases the output is the same (output_complex.txt), but the execution (shell/run) is expected to be different. Having the same name for the output allows you to avoid defining conditionals for downstream rules.
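To make the behaviour of that condition concrete, here is a minimal plain-Python sketch (not Snakemake code). It assumes `config` behaves like an ordinary dict; the helper name `pick_command` and the constant `SHARED_OUTPUT` are hypothetical and only mirror the echo commands from the question. Because the check uses `is True`, only a real boolean enables the rule D variant, while a missing key or the string "true" falls back to rule B.

```python
# Plain-Python sketch of the branch selection; all names are illustrative only.
SHARED_OUTPUT = "path/to/output_complex.txt"  # same output path in both branches


def pick_command(config):
    """Return the shell command that would produce SHARED_OUTPUT."""
    if config.get("use_rule_D") is True:
        # strict identity check: only the boolean True selects rule D
        text = "This is rule D output"
    else:
        # a missing key, False, or a non-boolean value keeps rule B
        text = "This is rule B output"
    return f'echo "{text}" > {SHARED_OUTPUT}'


if __name__ == "__main__":
    for cfg in ({}, {"use_rule_D": False}, {"use_rule_D": True}, {"use_rule_D": "true"}):
        print(cfg, "->", pick_command(cfg))
```

If the flag might arrive as a string rather than a boolean, a looser truthiness check or an explicit conversion may be more forgiving; that is a design choice rather than something the answer prescribes.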

huangapple
  • Posted on July 24, 2023, 18:23:58
  • Link to this article: https://go.coder-hub.com/76753543.html