Snakemake wrappers not working on SLURM cluster compute nodes without internet

Question
I am trying to use wrappers in my pipeline on a SLURM cluster, where the compute nodes do not have internet access.
I have run the pipeline with `--conda-create-envs-only` first, and then changed the `wrapper:` directives to point to local folders containing `environment.yaml` files.
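For context, that pre-build step amounts to something like the following (run on the login node, which has internet access; only the flag itself is given above, so the full invocation is a sketch):

```shell
# Build all conda envs declared by the wrappers up front, without executing
# any rules; assumes this runs where GitHub is reachable
snakemake --profile myprofile --use-conda --conda-create-envs-only --cores 1
```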
Jobs fail without any specific error. A test rule with the same configuration but without a wrapper works. Rules with wrappers work fine on the login node if I switch the `wrapper:` directive back to looking online.
I am running:
```shell
snakemake --profile myprofile --cores 40 --use-conda
```
Example rule:

```python
# Run FastQC on the fastq.gz files
rule fastqc_fastq_gz:
    input:
        input_dir + "{sample}_{read}_001.fastq.gz",
    output:
        html = output_dir + "fastqc/{sample}_{read}_fastqc.html",
        zip = output_dir + "fastqc/{sample}_{read}_fastqc.zip",
    params:
        extra = "--quiet",
    log:
        output_dir + "logs/fastqc/{sample}_{read}.log",
    threads: 1,
    resources:
        mem_mb = 1024,
    wrapper:
        # "file:///path/envs/v1.31.0/bio/fastqc/" # <- Fails on both login and compute nodes
        "v1.31.0/bio/fastqc" # <- Works on login node to download env
```
I have also tried requesting more resources; the behavior is the same.
My profile:

```yaml
cluster:
  mkdir -p logs/{rule} &&
  sbatch
    --partition={resources.partition}
    --qos={resources.qos}
    --cpus-per-task={threads}
    --mem={resources.mem_mb}
    --job-name=smk-{rule}-{wildcards}
    --output=logs/{rule}/{rule}-{wildcards}-%j.out
    --error=logs/{rule}/{rule}-{wildcards}-.%j.err
    --account=<account>
    --time={resources.time}
    --parsable
default-resources:
  - partition=<partition>
  - qos=sbatch
  - mem_mb="490G"
  - tmpdir="/path/to/temp/"
  - time="0-10:00:00"
max-jobs-per-second: 10
max-status-checks-per-second: 1
latency-wait: 60
jobs: 16
keep-going: True
rerun-incomplete: True
printshellcmds: True
scheduler: greedy
use-conda: True
cluster-status: status-sacct.sh
```
The submission log reads:

```
Error in rule fastqc_fastq_gz:
    jobid: 38
    input: /path/raw/S9_S9_R2_001.fastq.gz
    output: /path/pipeline_out/fastqc/S9_S9_R2_fastqc.html, /path/pipeline_out/fastqc/S9_S9_R2_fastqc.zip
    log: /path/pipeline_out/logs/fastqc/S9_S9_R2.log (check log file(s) for error details)
    conda-env: /path/.snakemake/conda/a116f377bbaddedd93b228a3d4f74b1d_
    cluster_jobid: 1491196

Error executing rule fastqc_fastq_gz on cluster (jobid: 38, external: 1491196, jobscript: /path/.snakemake/tmp.l0raomfw/snakejob.fastqc_fastq_gz.38.sh). For error details see the cluster log and the log files of the involved rule(s).
```
The tmp files do not exist.
Job logs simply read:

```
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 80
Rules claiming more threads will be scaled down.
Provided resources: mem_mb=1000, mem_mib=954, disk_mb=7876, disk_mib=7512
Select jobs to execute...

[Date Time]
rule fastqc_fastq_gz:
    input: /path/raw/S9_S9_R2_001.fastq.gz
    output: /path/pipeline_out/fastqc/S9_S9_R2_fastqc.html, /path/pipeline_out/fastqc/S9_S9_R2_fastqc.zip
    log: /path/pipeline_out/logs/fastqc/S9_S9_R2.log
    jobid: 0
    reason: Missing output files: /path/pipeline_out/fastqc/S9_S9_R2_fastqc.zip, /path/pipeline_out/fastqc/S9_S9_R2_fastqc.html
    wildcards: sample=S9_S9, read=R2
    threads: 2
    resources: mem_mb=1000, mem_mib=954, disk_mb=7876, disk_mib=7512, tmpdir=/path/tmp/snakemake, partition=el7taskp, qos=sbatch, time=0-40:00:00

[Date Time]
Error in rule fastqc_fastq_gz:
    jobid: 0
    input: /path/raw/230319/S9_S9_R2_001.fastq.gz
    output: /path/pipeline_out/fastqc/S9_S9_R2_fastqc.html, /path/pipeline_out/fastqc/S9_S9_R2_fastqc.zip
    log: /path/pipeline_out/logs/fastqc/S9_S9_R2.log (check log file(s) for error details)
    conda-env: /path/.snakemake/conda/a116f377bbaddedd93b228a3d4f74b1d_

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
```
The FastQC logs are empty.

If I run on the compute nodes with the `wrapper:` directives pointing to GitHub, it fails with this error:
```
Building DAG of jobs...
Failed to open source file https://github.com/snakemake/snakemake-wrappers/raw/v1.31.0/bio/fastqc/environment.yaml
ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /snakemake/snakemake-wrappers/raw/v1.31.0/bio/fastqc/environment.yaml (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fd067df0810>: Failed to establish a new connection: [Errno 113] No route to host')), attempt 1/3 failed - retrying in 3 seconds...
Failed to open source file https://github.com/snakemake/snakemake-wrappers/raw/v1.31.0/bio/fastqc/environment.yaml
ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /snakemake/snakemake-wrappers/raw/v1.31.0/bio/fastqc/environment.yaml (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fd067df2150>: Failed to establish a new connection: [Errno 113] No route to host')), attempt 2/3 failed - retrying in 6 seconds...
Failed to open source file https://github.com/snakemake/snakemake-wrappers/raw/v1.31.0/bio/fastqc/environment.yaml
ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /snakemake/snakemake-wrappers/raw/v1.31.0/bio/fastqc/environment.yaml (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fd067dfc250>: Failed to establish a new connection: [Errno 113] No route to host')), attempt 3/3 failed - giving up!
WorkflowError:
Failed to open source file https://github.com/snakemake/snakemake-wrappers/raw/v1.31.0/bio/fastqc/environment.yaml
ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /snakemake/snakemake-wrappers/raw/v1.31.0/bio/fastqc/environment.yaml (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fd067dfc250>: Failed to establish a new connection: [Errno 113] No route to host'))
  File "/path/mambaforge/envs/snakemake/lib/python3.11/site-packages/reretry/api.py", line 218, in retry_call
  File "/path/mambaforge/envs/snakemake/lib/python3.11/site-packages/reretry/api.py", line 31, in __retry_internal
```
Answer 1
Score: 1
Ok, so I am marking this as solved.

Snakemake requires an internet connection for wrappers, even if the environment has been previously built using `--conda-create-envs-only`. My solution does not work because Snakemake needs the whole wrapper pointed to, not just the `environment.yaml` file. The solution Troy pointed to works, but a somewhat more elegant solution suggested by euronion is to move all wrappers into a path ending in `v1.31.0/bio/<tool>` and then use the `--wrapper-prefix="file:///path/envs/"` flag when calling Snakemake on the compute nodes. In any case, I have opened [a feature request on GitHub](https://github.com/snakemake/snakemake/issues/2262) for a more streamlined way to do this, because even the prefix solution feels clunky and requires the added step of cloning the snakemake-wrappers repo.
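A minimal sketch of what that looks like in practice (paths are placeholders, and the clone/copy step has to happen somewhere with internet access, e.g. the login node):

```shell
# Get the wrappers at the pinned version on a node with internet access
git clone https://github.com/snakemake/snakemake-wrappers.git
git -C snakemake-wrappers checkout v1.31.0

# Lay the files out so that <wrapper-prefix> + "v1.31.0/bio/<tool>" resolves
# locally; each wrapper directory ships wrapper.py plus environment.yaml,
# which is why pointing at the yaml file alone is not enough
mkdir -p /path/envs/v1.31.0
cp -r snakemake-wrappers/bio /path/envs/v1.31.0/

# On the compute nodes, keep wrapper: "v1.31.0/bio/fastqc" in the rules and
# redirect the prefix from GitHub to the local copy
snakemake --profile myprofile --cores 40 --use-conda \
    --wrapper-prefix="file:///path/envs/"
```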