How can I rename files more efficiently by dictionary value, based on keywords (keys) found in the file names?

Question

I am trying to rename files using the values of a dictionary, based on the keywords (keys) I have. Each old file name is a long string that contains one of the keywords but is not identical to it. I want to find the key contained in the file name and rename the file to the corresponding value, so that every file ends up named after a dictionary value. The dictionary is structured like the table below:

Dictionary name: nameKeyWords

Key (keyword)    Value (new name)
abb              1
ave              2
asp              3
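
For concreteness, the table above might look like this as a Python dict. This is only a sketch: the values are assumed to be strings (the code in the question concatenates them with '.csv'), and the sample file name is made up for illustration.

```python
# Assumed literal form of the table above. Values are strings because the
# question's code concatenates them with '.csv'.
nameKeyWords = {
    "abb": "1",
    "ave": "2",
    "asp": "3",
}

# Hypothetical long file name that contains the key "ave" somewhere inside it.
sample_file_name = "2023_report_ave_final.csv"

# Keys that occur as substrings of the file name.
matching_keys = [key for key in nameKeyWords if key in sample_file_name]

# The new name is the value of the first matching key plus the extension.
new_file_name = nameKeyWords[matching_keys[0]] + ".csv"
```

Here the file name is matched by substring, which is what "contains the keyword but is not identical to it" means in the question.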

Below is the code I wrote, and it does work. However, it is very inefficient because I use three nested for loops: one over the directories returned by walk, one over the file names in each directory, and one over the keywords (keys) in the dictionary. Is there a way to make the code more efficient? Thanks!

import os
from os import walk

# nameKeyWords (the dictionary) and dir_path (the root folder) are defined elsewhere
for (dir_path, dir_names, file_names) in walk(dir_path):
    for file_name in file_names:
        for keyWords in nameKeyWords:
            if keyWords in file_name:
                old_name = os.path.join(dir_path,file_name)
                new_name = os.path.join(dir_path,nameKeyWords.get(keyWords)+'.csv')
                os.rename(old_name, new_name)
            else:
                print(file_name)

Answer 1

Score: 0

I don't know of any way to get all the file names without nested for loops, but you should break after os.rename(old_name, new_name), because there's no point in renaming the same file more than once (and wouldn't a second attempt raise FileNotFoundError, since after the first rename there is no longer a file named file_name in that directory?). Also, using for...else (instead of if...else inside for keyWords...) would keep the same file_name from being printed multiple times.

for (dir_path, dir_names, file_names) in walk(dir_path):
    for file_name in file_names:
        for keyWords in nameKeyWords:
            if keyWords in file_name:
                old_name = os.path.join(dir_path, file_name)
                new_name = os.path.join(dir_path, nameKeyWords.get(keyWords) + '.csv')
                os.rename(old_name, new_name)
                break  ## [ no need to keep checking ]
        else:
            print(file_name)  ## [ only prints if for never breaks ]
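
In case for...else is unfamiliar, here is a small sketch of the same pattern without any file system access. The function name find_new_name and the sample data are made up for illustration; the else branch belongs to the for loop and runs only if the loop ends without a break.

```python
def find_new_name(file_name, name_key_words):
    """Return '<value>.csv' for the first key found in file_name, else None."""
    for key in name_key_words:
        if key in file_name:
            new_name = name_key_words[key] + ".csv"
            break  # stop at the first matching key
    else:
        # Runs only when the loop finished without hitting break,
        # i.e. no key occurred in file_name.
        new_name = None
    return new_name


name_key_words = {"abb": "1", "ave": "2", "asp": "3"}  # assumed sample data

print(find_new_name("xx_ave_yy.txt", name_key_words))   # a name containing "ave"
print(find_new_name("unrelated.txt", name_key_words))   # a name with no key in it
```

The first call finds "ave" and breaks, so the else branch is skipped; the second call exhausts the keys, so the else branch runs exactly once.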

If you just want to decrease the level of nesting in your innermost loop, you can separate the loops:

fNameGenerator = (
    (dPath, fName) for dPath, dnames, fNames
    in os.walk(dir_path) for fName in fNames
)

for dPath, file_name in fNameGenerator:
    for keyWords in nameKeyWords:
        if keyWords in file_name:
            new_name = os.path.join(dPath, f'{nameKeyWords.get(keyWords)}.csv')
            os.rename(os.path.join(dPath, file_name), new_name)
            break
    else:
        print(file_name)

You could also compute the new names inside fNameGenerator:

# nameKeyWords = {....}
def getNewFn(oldFn: str):
    for k in nameKeyWords:
        if k in oldFn:
            return f"{nameKeyWords[k]}.csv"

fNameGenerator = (
    (dPath, fName, getNewFn(fName)) for dPath, dnames, fNames
    in os.walk(dir_path) for fName in fNames
)

for dPath, file_name, new_name in fNameGenerator:
    if new_name is None:
        print(file_name)
    else:
        os.rename(*[os.path.join(dPath, fn) for fn in [file_name, new_name]])
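
One thing to watch for with any of these variants: if two files in the same folder contain the same key, both map to the same new name. With os.rename, the second rename silently replaces the first file on Unix and raises FileExistsError on Windows. A possible safeguard, sketched below with made-up helper names (plan_renames, find_collisions), is to build the list of renames first and check it before touching anything.

```python
import os
from collections import defaultdict


def plan_renames(root, name_key_words):
    """Collect (old_path, new_path) pairs without renaming anything."""
    plan, unmatched = [], []
    for d_path, _d_names, f_names in os.walk(root):
        for f_name in f_names:
            for key, value in name_key_words.items():
                if key in f_name:
                    plan.append((os.path.join(d_path, f_name),
                                 os.path.join(d_path, f"{value}.csv")))
                    break  # first matching key wins, as in the loops above
            else:
                unmatched.append(os.path.join(d_path, f_name))
    return plan, unmatched


def find_collisions(plan):
    """Return the targets that more than one source file would be renamed to."""
    sources_by_target = defaultdict(list)
    for old_path, new_path in plan:
        sources_by_target[new_path].append(old_path)
    return {target: sources
            for target, sources in sources_by_target.items()
            if len(sources) > 1}
```

If find_collisions(plan) comes back empty, you can loop over plan and call os.rename(old_path, new_path) for each pair; otherwise you can decide how to disambiguate the names first.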

Please note that none of these reduces the complexity: every file name is still checked against the keys one by one. The nested loops might actually be the fastest alternative, although none of the variants seems to take significantly less time than the others.
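
If the dictionary is large, one more option (not something I have timed) is to combine all the keys into a single compiled regular expression, so the per-key loop runs inside the regex engine instead of in Python code. This is only a sketch with a made-up helper name, build_matcher. It does not necessarily lower the asymptotic cost, and it behaves slightly differently from the loops above: when several keys occur in one file name, it picks the one that appears leftmost in the name rather than the first one in dictionary order.

```python
import re


def build_matcher(name_key_words):
    """Return a function mapping a file name to '<value>.csv' or None."""
    if not name_key_words:
        # An empty alternation would match the empty string, so handle it here.
        return lambda file_name: None

    # Try longer keys first so a longer key wins over a shorter key
    # that starts at the same position.
    keys = sorted(name_key_words, key=len, reverse=True)
    # re.escape keeps keys with special characters (., +, etc.) literal.
    pattern = re.compile("|".join(re.escape(key) for key in keys))

    def new_name_for(file_name):
        match = pattern.search(file_name)
        if match is None:
            return None
        return f"{name_key_words[match.group(0)]}.csv"

    return new_name_for


new_name_for = build_matcher({"abb": "1", "ave": "2", "asp": "3"})  # assumed sample data
print(new_name_for("long_ave_name.txt"))
print(new_name_for("nothing.txt"))
```

Whether this is actually faster for your data is something to measure, for example with timeit, since for a handful of keys the plain loop is probably just as quick.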

Posted by huangapple on 2023-02-08 20:30:33
Source: https://go.coder-hub.com/75385824.html