For-loop to rename documents iteratively

Question

For every .mcool file, I want to add the chr prefix to each name in clr.chromnames that does not already start with it. I build the mapping with trans = {chrom: f"chr{chrom}" for chrom in clr.chromnames} and apply it with cooler.rename_chroms(clr, trans).

The nested for loop below looks redundant. How can I make my code more efficient?

for chrom in (chrom for chrom in clr.chromnames if not chrom.startswith("chr")):
    trans = {chrom: f"chr{chrom}" for chrom in clr.chromnames}

Full code:

from pathlib import Path

import cooler

pathlist = Path(data_dir).glob('**/*.mcool')
for path in pathlist:
    cool_file = str(path)
    filename = cool_file.split("/", 1)[1]
    resolution = [i.rsplit("/", 1)[1] for i in cooler.fileops.list_coolers(cool_file)]

    ### load a cooler for each resolution
    for j in resolution:
        clr = cooler.Cooler(f'{cool_file}::resolutions/{j}')
        for chrom in (chrom for chrom in clr.chromnames if not chrom.startswith("chr")):
            trans = {chrom: f"chr{chrom}" for chrom in clr.chromnames}
            cooler.rename_chroms(clr, trans)
        print(f'chromosomes: {clr.chromnames}, binsize: {clr.binsize}')

Input Data:

clr.chromnames

['M', 'chr1', '2', '3', 'chr4']
['7', '8', 'chr9', '10', '11', 'chr12']
['X', 'chrY', 'chr1', '2', '4']

Expected output:

clr.chromnames

['chrM', 'chr1', 'chr2', 'chr3', 'chr4']
['chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12']
['chrX', 'chrY', 'chr1', 'chr2', 'chr4']

Answer 1

Score: 0

Move the filter into the dict comprehension on the line where you build trans, so the mapping is built once and rename_chroms is called once per cooler.

    ### load a cooler for each resolution
    for j in resolution:
        clr = cooler.Cooler(f'{cool_file}::resolutions/{j}')
        # Map only the names that lack the "chr" prefix
        trans = {chrom: f"chr{chrom}" for chrom in clr.chromnames if not chrom.startswith("chr")}
        cooler.rename_chroms(clr, trans)
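
The filtered comprehension can be checked against the sample chromnames from the question without opening any .mcool file. This is only a minimal sketch: chr_prefix_map and apply_map are hypothetical helper names, not part of cooler, and apply_map merely simulates what the names would look like after renaming.

```python
def chr_prefix_map(chromnames):
    """Map each name lacking the "chr" prefix to its prefixed form."""
    return {c: f"chr{c}" for c in chromnames if not c.startswith("chr")}


def apply_map(chromnames, trans):
    """Return the names as they would read after renaming; unmapped names are kept."""
    return [trans.get(c, c) for c in chromnames]


sample = ['M', 'chr1', '2', '3', 'chr4']
trans = chr_prefix_map(sample)
print(trans)                     # {'M': 'chrM', '2': 'chr2', '3': 'chr3'}
print(apply_map(sample, trans))  # ['chrM', 'chr1', 'chr2', 'chr3', 'chr4']
```

Names that already carry the prefix never enter the mapping, so they are left untouched.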

Answer 2

Score: 0

I don't have any experience with cooler, so I am speaking purely in terms of Python. Consider this block of code:

for chrom in (chrom for chrom in clr.chromnames if not chrom.startswith("chr")):
    trans = {chrom: f"chr{chrom}" for chrom in clr.chromnames}
    cooler.rename_chroms(clr, trans)

I believe cooler.rename_chroms renames every chromosome in the mapping in a single call, so you don't need to call it inside a loop. Note also that the inner comprehension has no filter, so it would map chr1 to chrchr1 as well. You can collapse the two loops into one comprehension:

trans = {
    chrom: f"chr{chrom}"
    for chrom in clr.chromnames
    if not chrom.startswith("chr")
}
cooler.rename_chroms(clr, trans)  # Only call once
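
Putting this together, the whole traversal might look like the sketch below. It assumes the cooler package is installed and that every group returned by cooler.fileops.list_coolers can be opened with a URI of the form path::group. The function name add_chr_prefix is hypothetical, and the code has not been run against real .mcool files.

```python
from pathlib import Path


def add_chr_prefix(data_dir):
    """Prefix "chr" to unprefixed chromosome names in every .mcool under data_dir.

    Returns the URIs of the coolers that were renamed.
    """
    import cooler  # third-party; imported here so defining the function does not require it

    renamed = []
    for path in Path(data_dir).glob("**/*.mcool"):
        # list_coolers returns group paths such as "/resolutions/1000"
        for group in cooler.fileops.list_coolers(str(path)):
            uri = f"{path}::{group}"
            clr = cooler.Cooler(uri)
            trans = {
                chrom: f"chr{chrom}"
                for chrom in clr.chromnames
                if not chrom.startswith("chr")
            }
            if trans:  # nothing to do when every name already has the prefix
                cooler.rename_chroms(clr, trans)  # one call per cooler
                renamed.append(uri)
    return renamed
```

Skipping the call when trans is empty avoids touching files whose names are already in the expected form.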

Also, consider this line:

filename = cool_file.split("/", 1)[1]

If you just want the file name, this is more robust:

filename = path.name
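
The difference shows up as soon as the path has more than one directory level. Here is a small sketch with a made-up path (data/run1/sample.mcool is hypothetical); PurePosixPath is used only so the example behaves the same on every platform.

```python
from pathlib import PurePosixPath

path = PurePosixPath("data/run1/sample.mcool")  # hypothetical example path
cool_file = str(path)

# split("/", 1) cuts only at the first slash, so directories leak through
by_split = cool_file.split("/", 1)[1]
print(by_split)   # run1/sample.mcool

# Path.name is always the final path component
print(path.name)  # sample.mcool
```

The split version also raises IndexError when the path contains no slash at all.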

Another one:

clr = cooler.Cooler(f'{cool_file}::resolutions/{j}')

can be:

clr = cooler.Cooler(f"{path}::resolutions/{j}")

In short, once you work with pathlib.Path, you rarely need to convert it to str.
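
Formatting a Path inside an f-string produces the same text as str(path), which is why the explicit conversion is unnecessary there. Below is a minimal sketch with a hypothetical path and resolution.

```python
from pathlib import PurePosixPath

path = PurePosixPath("data/sample.mcool")  # hypothetical example path
j = "1000"                                 # hypothetical resolution

uri_from_path = f"{path}::resolutions/{j}"
uri_from_str = f"{str(path)}::resolutions/{j}"

print(uri_from_path)                  # data/sample.mcool::resolutions/1000
print(uri_from_path == uri_from_str)  # True
```

An explicit str() may still be needed for functions that do not accept path-like objects.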

huangapple
  • Published on June 13, 2023, 10:37:47
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76461387.html