How to run one part of a Snakemake pipeline without rerunning the entire pipeline?
Question
I am trying to test that one section of the pipeline produces the correct output.
I am running Snakemake by submitting a bash script to a cluster, and have tried adding:
--rule
--allowed-rules
inside of a for loop so that the entire command is: run_pipeline.sh <rule1> <rule2> etc...
When I use --rule, it says that the rules cannot contain wildcards.
When I use --allowed-rules, it says nothing to be done, files already present.
The --allowed-rules message is not accurate because the correct output files are not generated.
Do I really have to rerun the entire pipeline and delete all previous input files or is there a better way?
Answer 1
Score: 2
Instead of using internal options like --allowed-rules (for which the documentation states: "Note that this is intended primarily for internal use and may lead to unexpected results otherwise."), use one of the following:
> --until, -U <target>
>
> Runs the pipeline until it reaches the specified rules or files. Only runs jobs that are dependencies of the specified rule or files, does not run sibling DAGs.
>
> --omit-from, -O <target>
>
> Prevent the execution or creation of the given rules or files as well as any rules or files that are downstream of these targets in the DAG. Also runs jobs in sibling DAGs that are independent of the rules or files specified here.
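
As a rough illustration of the two options quoted above, here is a minimal sketch of what a wrapper such as run_pipeline.sh could do with the rule names it receives. The function names and the rule names in the usage comment are hypothetical, and `--cores 1` is only a placeholder; a cluster setup will need its own executor or profile options. Both functions pass `-n` (dry run) so the scheduled jobs can be inspected before anything is executed.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for a wrapper script; names are illustrative only.

# Dry-run the pipeline up to (and including) the given rules or files.
dry_run_until() {
    if [ "$#" -eq 0 ]; then
        echo "usage: dry_run_until <rule-or-file> [...]" >&2
        return 1
    fi
    # -n lists the jobs without running them; remove it for a real run.
    snakemake -n --cores 1 --until "$@"
}

# Dry-run everything except the given rules or files and their downstream jobs.
dry_run_omit_from() {
    if [ "$#" -eq 0 ]; then
        echo "usage: dry_run_omit_from <rule-or-file> [...]" >&2
        return 1
    fi
    snakemake -n --cores 1 --omit-from "$@"
}

# Example calls (rule names are placeholders for rules in your Snakefile):
#   dry_run_until align_reads
#   dry_run_omit_from call_variants
```

If the dry run still reports that nothing needs to be done, Snakemake considers the outputs of those rules up to date, so the stale files would have to be removed or the rules forced.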
If you are unsure what the targets are, it can help to visualise your DAG.
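
One way to visualise the DAG, sketched here on the assumption that Graphviz's `dot` is installed and that the command is run from the directory containing the Snakefile. The function name and the default output filename are hypothetical. `--rulegraph` draws one node per rule; `--dag` can be substituted to get one node per job.

```shell
#!/usr/bin/env bash
# Hypothetical helper: render the rule-level graph to an SVG file.
render_rulegraph() {
    # Output path defaults to rulegraph.svg if no argument is given.
    local out="${1:-rulegraph.svg}"
    # snakemake prints the graph in Graphviz dot format on stdout.
    snakemake --rulegraph | dot -Tsvg > "$out"
}

# Example call:
#   render_rulegraph my_pipeline.svg
```

The rule names shown in the rendered graph are the ones that can be passed as targets to --until or --omit-from.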
Your pipeline should not rerun all rules, only those whose files are affected by your changes to the pipeline, or by rules whose execution is forced with --force.