How to run one part of a Snakemake pipeline without rerunning the entire pipeline?

Question

I am trying to test that one section of the pipeline produces the correct output.

I am running Snakemake by submitting a bash script to a cluster, and have tried adding:
--rule
--allowed-rules

inside of a for loop so that the entire command is: run_pipeline.sh <rule1> <rule2> etc...

When I use --rule, it says that the rules cannot contain wildcards.
When I use --allowed-rules, it says nothing to be done, files already present.

The --allowed-rules message is not accurate because the correct output files are not generated.

Do I really have to rerun the entire pipeline and delete all previous input files or is there a better way?

Answer 1

Score: 2

Instead of using internal options like --allowed-rules (for which the documentation states: "Note that this is intended primarily for internal use and may lead to unexpected results otherwise."), use one of the following:

> --until, -U <target>
>
> Runs the pipeline until it reaches the specified rules or files. Only runs jobs that are dependencies of the specified rule or files, does not run sibling DAGs.
>
> --omit-from, -O <target>
>
> Prevent the execution or creation of the given rules or files as well as any rules or files that are downstream of these targets in the DAG. Also runs jobs in sibling DAGs that are independent of the rules or files specified here.
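
As a rough sketch of how a wrapper such as run_pipeline.sh could pass rule names to these two options: the rule names `align_reads` and `call_variants` and the `RUNNER` variable are made up for illustration, and `--cores 1` is only a placeholder for whatever resources your cluster submission uses. By default the functions print the command instead of running it.

```shell
#!/bin/sh
# RUNNER=echo (the default here) prints each command instead of executing it.
# Export RUNNER as an empty string to run Snakemake for real.
RUNNER="${RUNNER-echo}"

# Run the workflow until it reaches the named rules or files;
# only their dependencies are scheduled.
run_until() {
    $RUNNER snakemake --cores 1 --until "$@"
}

# Run the workflow but leave out the named rules or files
# and everything downstream of them.
run_omit_from() {
    $RUNNER snakemake --cores 1 --omit-from "$@"
}

# Hypothetical rule names; replace them with rules from your Snakefile.
run_until align_reads
run_omit_from call_variants
```

Both options accept several targets at once, so the rule names can be passed straight through with `"$@"` rather than looping over them.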

If you are unsure what the targets are, it can help to visualise your DAG.
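
One possible way to draw the graph, assuming Graphviz (`dot`) is installed next to Snakemake. The function name `render_rulegraph` and the default file name `rulegraph.svg` are illustrative choices, not part of Snakemake.

```shell
#!/bin/sh
# Write the rule-level graph of the workflow in the current directory to an SVG file.
render_rulegraph() {
    out="${1:-rulegraph.svg}"
    # Stop early with a message if either tool is missing.
    for tool in snakemake dot; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            return 1
        fi
    done
    # --rulegraph prints one node per rule in Graphviz dot format;
    # --dag would print one node per job instead.
    snakemake --rulegraph | dot -Tsvg > "$out"
}

# Usage (run it in the directory that contains the Snakefile):
# render_rulegraph rulegraph.svg
```

The node labels in the resulting graph are the rule names that --until and --omit-from accept as targets.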

Your pipeline should not rerun all rules, but only those whose files are affected by your changes to the pipeline, or those you force with --force execution.
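
To check what would be rerun before committing cluster time, a forced rerun can be combined with a dry run. This is a sketch only: it uses `--forcerun`, one of Snakemake's force options alongside `--force` and `--forceall`, and the rule name `align_reads` and the `RUNNER` variable are hypothetical.

```shell
#!/bin/sh
# RUNNER=echo (the default here) prints the command instead of executing it.
# Export RUNNER as an empty string to run Snakemake for real.
RUNNER="${RUNNER-echo}"

# Preview a forced rerun of the named rules.
# --dry-run only lists the jobs that would run; drop it to execute them.
preview_forced_rerun() {
    $RUNNER snakemake --cores 1 --dry-run --forcerun "$@"
}

# Hypothetical rule name; replace it with the rule you are testing.
preview_forced_rerun align_reads
```

If the dry run lists only the rule under test and the jobs downstream of it, the earlier outputs are being reused and nothing needs to be deleted by hand.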

huangapple
  • Posted on May 10, 2023, 12:35:16
  • Please keep this link when reposting: https://go.coder-hub.com/76214925.html