Loss of certain values when using dict and expand together in Snakemake

Question

I am trying to use a function that returns a dict as a rule input. Here is a simple example:

# debug.smk
input_a='/{output}/to/{group_name}/input_a'
input_b='/{output}/to/{group_name}/input_b'

rule_a=True
rule_b=True

output='output'
group_name='group_name'

def get_rule_output(): 
    output={}
    if rule_a: 
        output['output_a']=expand(input_a, output=output, group_name=group_name)
        print(expand(input_a, output=output, group_name=group_name))
    if rule_b: 
        output['output_xx']=expand(input_b, output=output, group_name=group_name)

    print(output)
    
    return output

rule all: 
    input: 
        get_rule_output(), 

I run it with the following command:

snakemake -s debug.smk

What I expected:

get_rule_output() returns a dict with two outputs:

{'output_a': ['/output_a/to/group_name/input_a'], 'output_xx': ['/output_a/to/group_name/input_b']}

However, the value it actually returns is:

{'output_a': [], 'output_xx': ['/output_a/to/group_name/input_b']}

The debug output from print() is as follows:

snakemake -s snakemake/debug.smk
['/output_a/to/group_name/input_a']
{'output_a': [], 'output_xx': ['/output_a/to/group_name/input_b']}
Building DAG of jobs...

# other Snakemake debug output

Answer 1

Score: 1

The choice of variable names is unfortunate. There are two variables named output: one is the dictionary that holds the result inside the function, and the other is defined in the workflow as:

output='output'

Inside the function, the local dictionary shadows the workflow-level string, so expand() receives the dictionary rather than 'output'. Renaming one of the two should resolve the issue.
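
To see why the first entry comes back empty while the later ones contain output_a, here is a minimal sketch in plain Python. expand_sketch is a hypothetical stand-in that only mimics how expand() appears to treat keyword values (a string counts as one value, any other iterable is iterated). It is an assumption made for illustration, not Snakemake's actual implementation.

```python
from itertools import product


def expand_sketch(pattern, **wildcards):
    """Simplified, hypothetical stand-in for Snakemake's expand()."""
    names = list(wildcards)
    value_lists = []
    for name in names:
        values = wildcards[name]
        # A bare string (or a non-iterable) is treated as a single value;
        # any other iterable, including a dict, is iterated.
        if isinstance(values, str) or not hasattr(values, "__iter__"):
            values = [values]
        value_lists.append(list(values))
    # One formatted path per combination of wildcard values.
    return [
        pattern.format(**dict(zip(names, combo)))
        for combo in product(*value_lists)
    ]


pattern = "/{output}/to/{group_name}/input_a"

# Intended call: output is the string defined at workflow level.
as_string = expand_sketch(pattern, output="output", group_name="group_name")

# First call in the question: output is the still-empty local dict,
# so there is nothing to iterate over and no path is produced.
as_empty_dict = expand_sketch(pattern, output={}, group_name="group_name")

# Later calls: the dict now has a key, and iterating a dict yields its keys.
as_filled_dict = expand_sketch(
    pattern, output={"output_a": []}, group_name="group_name"
)

print(as_string)
print(as_empty_dict)
print(as_filled_dict)
```

Under these assumptions, the empty dict explains the empty list stored under output_a, and the dict key explains why output_a shows up in the paths printed afterwards.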

Answer 2

Score: 0

You need to rename output={} inside your function:

In your code output has two meanings:

  • output='output' which is also the value you want to use in expand(...)
  • output={} which is the dictionary you want to hold your results, but which you are using in your expand(...) instead

Due to the local scope, output={} is used instead of output='output' for your expand(...) statement.
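
The scoping rule behind this can be shown without Snakemake at all. The sketch below uses hypothetical function names (shadowed, renamed) and only illustrates standard Python behavior: assigning to a name inside a function makes that name local for the whole function body.

```python
output = "output"  # module-level name, like the one in the workflow


def shadowed():
    # Assigning to `output` here creates a local variable that hides
    # the module-level string for the rest of the function.
    output = {}
    return type(output).__name__


def renamed():
    # A distinct name for the dictionary leaves the module-level
    # string visible inside the function.
    results = {}
    results["seen"] = output
    return results


print(shadowed())
print(renamed())
```

In the first function every later use of output refers to the dictionary. In the second, output still resolves to the string.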

This works:

# debug.smk
input_a = "/{output}/to/{group_name}/input_a"
input_b = "/{output}/to/{group_name}/input_b"

rule_a = True
rule_b = True

output = "output"
group_name = "group_name"


def get_rule_output():

    d = {}

    d["output_a"] = expand(input_a, output=output, group_name=group_name)
    d["output_b"] = expand(input_b, output=output, group_name=group_name)

    print(d)

    # return the dictionary, not the global string `output`
    return d


rule all:
    input:
        get_rule_output(),

huangapple
  • Posted on 2023-07-27 15:33:15
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76777450.html