How can I limit os.walk results for a single file?
Question
I am trying to search a given directory for a specific file, and if that file does not exist I want the code to say "File does not exist". Currently I can get this to work with os.walk, but it hits every single file that isn't the specified file and prints "File does not exist" each time. I know that this is how os.walk works, but I am not sure whether there is a way to make it print only once, whether the file is found or not.
Folder structure:
root folder
|-- Project Folder
    |-- file.xml
    |-- other files/subfolders
I want the code to go inside "Project Folder", do a recursive search for "file.xml", and print "Found" once if the file is found, or "Not found" once otherwise.
The code is:
def check_file(x): #x = root folder dir
    for d in next(os.walk(x))[1]: #if I understand correctly, [1] will be Project Folder
        for root, directories, files in os.walk(x):
            for name in files:
                if "file.xml" not in name:
                    print("found")
                else:
                    print("File Missing")
If I change the code to:
            for name in files:
                if "file.xml" in name:
                    print("found")
                else:
                    pass
The code technically works as intended, but it does nothing to indicate that the file is missing, so it isn't a good solution. It would be easier if I could give the code a specific path to look in. However, the user can place the 'root folder' anywhere on their machine, and the 'project folder' has a different name for each project, so I don't think I can give the code a specific location.
Is there a way to get this to work with os.walk, or would another method work best?
Answer 1
Score: 3
The glob module is very convenient for this kind of wildcard-based recursive search. Particularly, the ** wildcard matches a directory tree of arbitrary depth, so you can find a file anywhere in the descendants of your root directory.
For example:
import glob

def check_file(x): # where x is the root directory for the search
    files = glob.glob('**/file.xml', root_dir=x, recursive=True)
    if files:
        print(f"Found {len(files)} matching files")
    else:
        print("Did not find a matching file")
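Note that the root_dir parameter of glob.glob was added in Python 3.10. On older interpreters, a roughly equivalent sketch can use pathlib.Path.rglob instead. The helper name find_files is hypothetical, and it returns the matches rather than printing them so the caller decides what to report. Keep in mind that rglob treats its argument as a pattern, so a name containing wildcard characters would match more than one file name.

```python
from pathlib import Path


def find_files(root_dir, base_name):
    """Return every path under root_dir whose file name matches base_name."""
    # rglob walks root_dir recursively; is_file() skips directories that
    # happen to share the name.
    return sorted(p for p in Path(root_dir).rglob(base_name) if p.is_file())


def check_file(root_dir, base_name="file.xml"):
    """Print exactly one line, whether or not the file exists."""
    matches = find_files(root_dir, base_name)
    if matches:
        print(f"Found {len(matches)} matching file(s)")
    else:
        print("Did not find a matching file")
    return bool(matches)
```

Calling check_file with the root folder path prints a single line either way, which is the behaviour asked for in the question.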
Answer 2
Score: 2
Listing [Python.Docs]: os.walk(top, topdown=True, onerror=None, followlinks=False).
You don't need 2 nested loops. You only need to check, on each iteration, whether the base file name is present in the 3rd member that os.walk produces.
This implementation handles the case of a file being present in multiple directories. If you only need to print the file once (no matter how many times it's present in the directory tree), there's the function search_file_once.
code00.py:
#!/usr/bin/env python

import os
import sys


def search_file(root_dir, base_name):
    found = 0
    for root, dirs, files in os.walk(root_dir):
        if base_name in files:
            print("Found: {:s}".format(os.path.join(root, base_name)))
            found += 1
    if not found:
        print("Not found")


# @TODO - cfati: Only care if file is found once
def search_file_once(root_dir, base_name):
    for root, dirs, files in os.walk(root_dir):
        if base_name in files:
            print("Found: {:s}".format(os.path.join(root, base_name)))
            break
    else:
        print("Not found")


def main(*argv):
    root = os.path.dirname(os.path.abspath(__file__))
    files = (
        "once.xml",
        "multiple.xml",
        "notpresent.xml",
    )
    for file in files:
        print("\nSearching recursively for {:s} in {:s}".format(file, root))
        search_file(root, file)


if __name__ == "__main__":
    print("Python {:s} {:03d}bit on {:s}\n".format(" ".join(elem.strip() for elem in sys.version.split("\n")),
                                                   64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    rc = main(*sys.argv[1:])
    print("\nDone.\n")
    sys.exit(rc)
Output:
>
> [cfati@CFATI-5510-0:e:\Work\Dev\StackExchange\StackOverflow\q076383189]> sopr.bat
> ### Set shorter prompt to better fit when pasted in StackOverflow (or other) pages ###
>
> [prompt]> tree /a /f
> Folder PATH listing for volume SSD0-WORK
> Volume serial number is AE9E-72AC
> E:.
> |   code00.py
> |
> \---dir0
>     +---dir00
>     +---dir01
>     |       multiple.xml
>     |       once.xml
>     |
>     \---dir02
>         \---dir020
>                 multiple.xml
>
>
> [prompt]>
> [prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.10_test0\Scripts\python.exe" ./code00.py
> Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)] 064bit on win32
>
>
> Searching recursively for once.xml in e:\Work\Dev\StackExchange\StackOverflow\q076383189
> Found: e:\Work\Dev\StackExchange\StackOverflow\q076383189\dir0\dir01\once.xml
>
> Searching recursively for multiple.xml in e:\Work\Dev\StackExchange\StackOverflow\q076383189
> Found: e:\Work\Dev\StackExchange\StackOverflow\q076383189\dir0\dir01\multiple.xml
> Found: e:\Work\Dev\StackExchange\StackOverflow\q076383189\dir0\dir02\dir020\multiple.xml
>
> Searching recursively for notpresent.xml in e:\Work\Dev\StackExchange\StackOverflow\q076383189
> Not found
>
> Done.
>
This is just one of several possible ways of doing this. Check [SO]: How do I list all files of a directory? (@CristiFati's answer) for more details.
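For the layout described in the question, where each project folder sits directly under the root, the same membership test can be run once per project folder so that each one gets a single result. This is only a sketch: the names contains_file and check_projects are hypothetical, and it assumes every immediate subdirectory of the root is a project folder.

```python
import os


def contains_file(folder, base_name):
    """Return True if base_name exists anywhere under folder."""
    # any() stops walking as soon as one directory lists the file.
    return any(base_name in files for _root, _dirs, files in os.walk(folder))


def check_projects(root_dir, base_name="file.xml"):
    """Map each immediate subfolder of root_dir to True or False."""
    results = {}
    with os.scandir(root_dir) as entries:
        for entry in entries:
            if entry.is_dir():
                results[entry.name] = contains_file(entry.path, base_name)
    return results
```

The caller can then loop over the returned dictionary and print "Found" or "Not found" exactly once per project folder, instead of once per file visited.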
Answer 3
Score: 1
I have written a function like this, and several others, in the past. I'm providing them all for context; some will work for your case with minimal to no modification.
import fnmatch
import os

## Find ALL exact-name matches (not just one):
## Example Usage: findAll('file.xml', '/path/to/dir')
def findAll(name, path):
    result = []
    for root, dirs, files in os.walk(path):
        if name in files:
            result.append(os.path.join(root, name))
    return result

## A function that keeps going until all target files are found
## (or the whole tree has been walked)
def findProjectFiles(Folder, targetFiles):
    remaining = set(targetFiles)
    filePaths = []
    for root, dirs, files in os.walk(Folder):
        for f in files:
            if f in remaining:
                filePaths.append(os.path.abspath(os.path.join(root, f)))
                remaining.discard(f)
        if not remaining:
            break
    return filePaths

# find the path of the first match in folder (returns None if there is no match):
def findPaths(name, path):
    for root, dirs, files in os.walk(path):
        if name in files:
            return os.path.join(root, name)
    return None
## the returned value (a path or None) tells you directly whether the file exists

## Similar, but this will match a pattern (i.e. does not have to be exact file name match).
## Example Usage: findMatch('*.txt', '/path/to/dir')
def findMatch(pattern, path):
    result = []
    for root, dirs, files in os.walk(path):
        for name in files:
            if fnmatch.fnmatch(name, pattern):
                result.append(os.path.join(root, name))
    return result
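If only the first pattern match matters, findMatch does more work than necessary because it always walks the whole tree. A lazy variant, sketched below under the hypothetical names iterMatches and firstMatch, yields paths one at a time so the caller can stop early with next().

```python
import fnmatch
import os


def iterMatches(pattern, path):
    """Yield the path of every file under path whose name matches pattern."""
    for root, dirs, files in os.walk(path):
        # fnmatch.filter keeps only the names matching the shell-style pattern.
        for name in fnmatch.filter(files, pattern):
            yield os.path.join(root, name)


def firstMatch(pattern, path):
    """Return the first matching path, or None when nothing matches."""
    return next(iterMatches(pattern, path), None)
```

Because firstMatch returns None when nothing matches, a single if/else on its result is enough to print "Found" or "Not found" once.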