Snakemake workflow runs locally but not as github action?

Question

I have been banging my head against this for a few days now. I am stumped. I have tried so many different things, it's hard to keep everything straight. And running snakemake actually works locally, with pretty much everything I've tried, but this one rule's test always fails as a github action.

The crux of it is, I want to run a tool called bigwigmerge to sum the signals of bigwig files. However, that tool doesn't work when there's only 1 input file, so I decided that for that case, I would use another tool called bigwigtobedgraph (because the output of bigwigmerge is a bedgraph file).

My initial attempt was this:

rule bigwigs_to_summed_bedgraph:
    input:
        expand(
            "results/bigwig/{dataset}.bw",
            dataset=DATASETS,
        ),
    output:
        pipe("results/bigwig/all_summed.bedGraph"),
    params:
        nargs=len(DATASETS),
    log:
        std="results/bigwig/logs/bigwigs_to_summed_bedgraph.stdout",
        err="results/bigwig/logs/bigwigs_to_summed_bedgraph.stderr",
    conda:
        "../envs/bigwig_tools.yml"
    shell:
        # bigwigmerge requires a minimum of 2 files, but the original R script
        # supports 1, so to support 1 file here, we use bigwigtobedgraph
        """
        if [ {params.nargs} -eq 1 ]; then \
            bigwigtobedgraph {input:q} {output:q} 1> {log.std:q} 2> {log.err:q}; \
        else \
            bigwigmerge {input:q} {output:q} 1> {log.std:q} 2> {log.err:q}; \
        fi
        """

That works locally when I run:

snakemake --directory .tests/test_6_summits --use-conda --cores 2 --printshellcmds all_atac_summits --forceall --show-failed-logs

But as a github action:

  temporary-summits-delete-after-59-merged:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Codebase
        uses: actions/checkout@v3
      - name: Test summits steps only
        uses: snakemake/snakemake-github-action@v1
        with:
          directory: '.tests/test_6_summits'
          snakefile: 'workflow/Snakefile'
          args: >-
            --use-conda -c 2 --show-failed-logs -p --verbose all_atac_summits

with the all_atac_summits rule:

rule all_atac_summits:
    input:
        "results/all_atac_summits.bed",
        "results/all_atac_summits.tsv",

It always issues the error:

Error in group 8c80e530-b72c-4cd2-ba80-00b10c34aa9e:
    jobs:
        rule sort_summed_bedgraph:
            jobid: 5
            output: results/bigwig/all_sorted.bedGraph
            log: results/bigwig/logs/sort_summed_bedgraph.stderr (check log file(s) for error details)
        rule bigwigs_to_summed_bedgraph:
            jobid: 6
            output: results/bigwig/all_summed.bedGraph (pipe)
            log: results/bigwig/logs/bigwigs_to_summed_bedgraph.stdout, results/bigwig/logs/bigwigs_to_summed_bedgraph.stderr (check log file(s) for error details)
Logfile results/bigwig/logs/sort_summed_bedgraph.stderr: empty file

Removing output files of failed job sort_summed_bedgraph since they might be corrupted:
results/bigwig/all_sorted.bedGraph
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message

Things I have tried:

  • change pipe() to temp() -> works locally but same gh action error
  • change if [ {params.nargs} -eq 1 ]; then \ to if [ "{params.nargs}" == "1" ]; then \ -> works locally but same gh action error
  • change if [ {params.nargs} -eq 1 ]; then \ to if [[ {params.nargs} -eq 1 ]]; then \ -> works locally but got a different gh action error. I think it was syntax, but the log is gone, so I can't copy/paste it here
  • change the shell to:
    bigwigmerge {input:q} {output:q} 1> {log.std:q} 2> {log.err:q} || \
    bigwigtobedgraph {input:q} {output:q} 1> {log.std:q} 2> {log.err:q}
    

    -> works locally but same gh action error

  • As a sanity check, change the shell to (I only test the 1 file case RN):
    bigwigtobedgraph {input:q} {output:q} 1> {log.std:q} 2> {log.err:q}
    

    -> works locally AND works as gh action (i.e. I am sane)

I considered using run instead of shell, but I learned that you can't use conda with run because it runs the python code in the snakemake environment for some reason.

I've been struggling to break this up into separate rules, to the point where I'm not convinced it can be done. I simply don't understand what the problem is and the error in the gh action log is completely uninformative. If I had such a paucity of debug info locally, I would use --printshellcmds so I could copy and paste the command to try it on its own - then I would usually see some shell-level error, like command not found or something. But I can't do that on github and I can't reproduce the error on my machine. Why would everything work as expected on my machine, but not as a github action?

UPDATE: So my boss's workstation is an Ubuntu box, and he can produce the same error that github throws. My machine is macOS. So it must have to do with differences in the bash shell that snakemake uses? He has GNU bash, version 5.1.16(1)-release (x86_64-pc-linux-gnu). I have GNU bash, version 5.0.17(1)-release (x86_64-apple-darwin19.4.0).

Answer 1

Score: 3

So it turns out that there are multiple things going on here... The factors involved are:

  • The macOS file system is case insensitive
  • snakemake's --show-failed-logs only prints 1 of multiple logs
  • In a group automatically created by pipe(), the log that's printed from the group is not the one with the original error, though I'm sure the subsequent rule would fail as well.

The error was from the rule that created the pipe, but the log that was printed was from the one that consumed the pipe. I know that because my boss looked at the log files when he ran the workflow locally on his Ubuntu box and he discovered the "command not found" error for bigwigtobedgraph.

The problem in my specific case was that the commands are actually bigWigToBedGraph and bigWigMerge, but I had bigwigtobedgraph and bigwigmerge. Since the macOS file system is case insensitive, it worked without error on my machine. In the GitHub Action, I had no clue what the issue was because it doesn't show me the relevant log and I could not reproduce the error locally.
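
One way to catch this class of mistake on a case-insensitive machine is to compare command names byte for byte against the directory listings on PATH, instead of letting the file system resolve them. A minimal sketch, assuming bash; the helper name exact_case_on_path is made up for illustration, and the tool names are assumed to be the UCSC spellings bigWigMerge and bigWigToBedGraph:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Snakemake or the UCSC tools).
# Succeeds only if some PATH directory lists an executable whose name
# matches "$1" exactly. Glob results carry the on-disk spelling, so the
# string comparison stays case-sensitive even on a case-insensitive
# file system.
exact_case_on_path() {
    local name="$1" dir entry
    local IFS=:
    for dir in $PATH; do
        [ -d "$dir" ] || continue
        for entry in "$dir"/*; do
            if [ "${entry##*/}" = "$name" ] && [ -x "$entry" ]; then
                return 0
            fi
        done
    done
    return 1
}

# Report which of the assumed tool names are present with exact case.
for tool in bigWigMerge bigWigToBedGraph; do
    if exact_case_on_path "$tool"; then
        echo "ok: $tool"
    else
        echo "missing (exact case): $tool"
    fi
done
```

Run inside the activated conda environment, this should flag a lowercase spelling as missing on macOS as well, which is the behaviour Linux shows by default.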

IMHO, --show-failed-logs should show all logs from the rule with the error, not just a single arbitrary log from the rule without the error.

Changing the commands to their actual case made the rule succeed in the github action.
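
For reference, the corrected branch logic looks roughly like this when written as a plain bash function outside Snakemake. This is only a sketch: summed_bedgraph and its argument order are made up for illustration, and it assumes the UCSC tools are installed on PATH as bigWigMerge and bigWigToBedGraph. Inside the rule, the same two names would simply replace the lowercase ones in the shell block.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: summed_bedgraph OUT.bedGraph IN1.bw [IN2.bw ...]
# bigWigMerge needs at least two inputs, so a single input is converted
# with bigWigToBedGraph instead. Note the exact capitalisation of both
# command names; lowercase spellings fail on case-sensitive file systems.
summed_bedgraph() {
    if [ "$#" -lt 2 ]; then
        echo "usage: summed_bedgraph OUT.bedGraph IN.bw [IN.bw ...]" >&2
        return 2
    fi
    local out="$1"
    shift
    if [ "$#" -eq 1 ]; then
        bigWigToBedGraph "$1" "$out"
    else
        bigWigMerge "$@" "$out"
    fi
}

# Example call (the input file name is a placeholder):
#   summed_bedgraph results/bigwig/all_summed.bedGraph results/bigwig/sample1.bw
```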

Note, I haven't seen my boss's screen, so I'm taking his word for it that the "command not found" error was in results/bigwig/logs/bigwigs_to_summed_bedgraph.stderr and was not printed to the console.
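
Since only one log of the failed group gets printed, one workaround is to dump the whole log directory yourself after a failed run. A minimal sketch, assuming bash; dump_logs is a made-up helper, and its default path is the log directory from the rule in the question, relative to the directory the workflow ran in:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every file in a log directory, marking
# empty ones, so the log holding the real error is not skipped.
dump_logs() {
    local log_dir="${1:-results/bigwig/logs}" f
    if [ ! -d "$log_dir" ]; then
        echo "no log directory: $log_dir" >&2
        return 1
    fi
    for f in "$log_dir"/*; do
        [ -f "$f" ] || continue
        echo "===== $f ====="
        if [ -s "$f" ]; then
            cat "$f"
        else
            echo "(empty)"
        fi
    done
}

# Example call for the test directory used in the question:
#   dump_logs .tests/test_6_summits/results/bigwig/logs
```

In the GitHub workflow this could probably run as an extra step guarded by if: failure(), assuming the files written by the Snakemake step are still present in the workspace at that point.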

huangapple
  • Posted on June 9, 2023, 04:00:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76435321.html