How can I rename files more efficiently using dictionary values matched by keywords (keys)?

Question
I am trying to rename files using dictionary values, matched by the keywords (keys) I have. Each old file name is a long string that contains one of the keywords but is not identical to it. I want to find the key contained in each file name and rename the file to the corresponding value, so that every file ends up named after a value. The dictionary looks like the table below:
Dictionary name: nameKeyWords
Key (Keywords) | Value (Name) |
---|---|
abb | 1 |
ave | 2 |
asp | 3 |
Below is the code I wrote, and it works. However, it is very inefficient because I use three for loops to go through all the files, the keywords (keys) in the dictionary, and every `file_name` in `file_names`. Is there a way to make the code more efficient? Thanks!

    import os
    from os import walk

    for (dir_path, dir_names, file_names) in walk(dir_path):
        for file_name in file_names:
            for keyWords in nameKeyWords:
                if keyWords in file_name:
                    old_name = os.path.join(dir_path, file_name)
                    new_name = os.path.join(dir_path, nameKeyWords.get(keyWords) + '.csv')
                    os.rename(old_name, new_name)
                else:
                    print(file_name)

Answer 1

Score: 0
I don't know of any way to get all the `file_names` without nested for loops, but you should `break` after `os.rename(old_name, new_name)`, because there is no point in renaming the same file multiple times. (Wouldn't a second attempt raise `FileNotFoundError` anyway, since after the first rename there is no longer a file named `file_name` in that directory?) Also, using `for...else` (instead of the `if...else` inside `for keyWords...`) keeps the same `file_name` from being printed multiple times.

    for (dir_path, dir_names, file_names) in walk(dir_path):
        for file_name in file_names:
            for keyWords in nameKeyWords:
                if keyWords in file_name:
                    old_name = os.path.join(dir_path, file_name)
                    new_name = os.path.join(dir_path, nameKeyWords.get(keyWords) + '.csv')
                    os.rename(old_name, new_name)
                    break  ## [ no need to keep checking ]
            else:
                print(file_name)  ## [ only prints if for never breaks ]

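The `FileNotFoundError` concern can be checked without touching real data. The following is a minimal sketch that runs in a temporary directory; the file name `report_abb_ave.txt` and the helper `rename_without_break` are made up for illustration, and the dictionary mirrors the table in the question with string values assumed.

```python
import os
import tempfile

nameKeyWords = {"abb": "1", "ave": "2", "asp": "3"}  # assumed string values


def rename_without_break(dir_path, file_name):
    """Mimic the question's inner loop: keep checking keys after a rename."""
    for keyWords in nameKeyWords:
        if keyWords in file_name:
            old_name = os.path.join(dir_path, file_name)
            new_name = os.path.join(dir_path, nameKeyWords[keyWords] + ".csv")
            os.rename(old_name, new_name)  # no break: the loop continues


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file whose name contains two keys, "abb" and "ave".
    open(os.path.join(tmp, "report_abb_ave.txt"), "w").close()
    try:
        rename_without_break(tmp, "report_abb_ave.txt")
        second_rename_failed = False
    except FileNotFoundError:
        # The first match already moved the file to 1.csv,
        # so the rename for the second matching key has no source file.
        second_rename_failed = True
    remaining = sorted(os.listdir(tmp))

print(second_rename_failed, remaining)
```

In this sketch the error only appears when a file name contains more than one key; with exactly one matching key per file, the missing `break` merely wastes the remaining comparisons.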
If you just want to decrease the level of nesting in your innermost loop, you can separate the loops:

    fNameGenerator = (
        (dPath, fName) for dPath, dnames, fNames
        in os.walk(dir_path) for fName in fNames
    )
    for dPath, file_name in fNameGenerator:
        for keyWords in nameKeyWords:
            if keyWords in file_name:
                new_name = os.path.join(dPath, f'{nameKeyWords.get(keyWords)}.csv')
                os.rename(os.path.join(dPath, file_name), new_name)
                break
        else:
            print(file_name)

You could also compute the `new_name`s within `fNameGenerator`:

    # nameKeyWords = {....}
    def getNewFn(oldFn: str):
        for k in nameKeyWords:
            if k in oldFn:
                return f"{nameKeyWords[k]}.csv"

    fNameGenerator = (
        (dPath, fName, getNewFn(fName)) for dPath, dnames, fNames
        in os.walk(dir_path) for fName in fNames
    )
    for dPath, file_name, new_name in fNameGenerator:
        if new_name is None:
            print(file_name)
        else:
            os.rename(*[os.path.join(dPath, fn) for fn in [file_name, new_name]])

Please note that none of these reduces the complexity: every file name is still compared against the keys one at a time. The nested loops might actually be the fastest alternative [although none of them seems to take significantly less time than the rest].
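If the Python-level loop over the keys turns out to be the bottleneck, one option not covered above is to fold all keys into a single compiled regular expression, so that each file name is scanned with one `search` call. This is only a sketch under assumptions: the names `key_pattern`, `get_new_name` and `rename_all` are hypothetical, the dictionary is assumed to be non-empty with string values, and any speed-up should be measured rather than taken for granted.

```python
import os
import re

nameKeyWords = {"abb": "1", "ave": "2", "asp": "3"}  # assumed string values

# One pattern that matches any key; re.escape guards keys containing
# regex metacharacters. Assumes nameKeyWords is not empty.
key_pattern = re.compile("|".join(re.escape(k) for k in nameKeyWords))


def get_new_name(old_name):
    """Return '<value>.csv' for the leftmost key found in old_name, else None."""
    match = key_pattern.search(old_name)
    if match is None:
        return None
    return f"{nameKeyWords[match.group(0)]}.csv"


def rename_all(root):
    """Walk root, rename matching files, and return the unmatched file names."""
    unmatched = []
    for dPath, _dnames, fNames in os.walk(root):
        for fName in fNames:
            new_name = get_new_name(fName)
            if new_name is None:
                unmatched.append(fName)
            else:
                os.rename(os.path.join(dPath, fName),
                          os.path.join(dPath, new_name))
    return unmatched
```

Note one behavioural difference: when a file name contains several keys, the loops above pick the first key in dictionary order, while this pattern picks the key that appears leftmost in the file name. As with the other versions, two files in the same directory that map to the same value will collide on the same target name.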