Why is the order of directories listed by os.walk() the same irrespective of the "topdown" parameter?

Question

I have a dataset under a specific root directory and am trying to iterate through its directories and files, but the topdown parameter does not work as I expected.

Images
├── n01440764
│   ├── image1.jpg
│   ├─  ...
│   └── image50.jpg
└── n01443537
    ├── image1.jpg
    ├─  ...
    └── image50.jpg

import os

image_dir = os.walk("Images", topdown=False)

for root, dirs, files in image_dir:
    for d in dirs:
        print(d)

Output:

n01443537
n01440764

Whether topdown is False or True, the result is the same. But I expected:

n01440764
n01443537

Answer 1

Score: 1

The topdown option does not change the order in which directories at the same level are walked. Instead, it determines whether the (dirpath, dirnames, filenames) tuple for a directory is generated before or after the tuples for its subdirectories. In your example only the top-level Images directory has subdirectories, so dirs is non-empty for exactly one tuple and the printed names come out in the same order either way. When topdown is True (the default), you can modify dirnames in place to filter out some paths and not walk them. When topdown is False, the subdirectories are walked first, so that sort of filtering is not possible.
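To make the difference visible, here is a minimal sketch that records the order in which directories are yielded. The temporary directory layout and the helper name walk_order are assumptions made for illustration; only os.walk() itself comes from the question.

```python
import os
import tempfile


def walk_order(root, topdown):
    """Return the dirpaths yielded by os.walk(), relative to root."""
    return [
        os.path.relpath(dirpath, root)
        for dirpath, _dirnames, _filenames in os.walk(root, topdown=topdown)
    ]


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout mirroring the question: Images/n01440764, Images/n01443537
    for name in ("n01440764", "n01443537"):
        os.makedirs(os.path.join(tmp, "Images", name))

    top_down = walk_order(tmp, topdown=True)
    bottom_up = walk_order(tmp, topdown=False)

print(top_down)   # the root "." is yielded first, then "Images", then its children
print(bottom_up)  # the children are yielded first, then "Images", then "." last
```

The two sibling directories can still appear in either order relative to each other; topdown only moves each parent before or after its children.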

From the official documentation for os.walk():

> If optional argument topdown is True or not specified, the triple for
> a directory is generated before the triples for any of its
> subdirectories (directories are generated top-down). If topdown is
> False, the triple for a directory is generated after the triples for
> all of its subdirectories (directories are generated bottom-up). No
> matter the value of topdown, the list of subdirectories is retrieved
> before the tuples for the directory and its subdirectories are
> generated.
>
> When topdown is True, the caller can modify the dirnames list in-place
> (perhaps using del or slice assignment), and walk() will only recurse
> into the subdirectories whose names remain in dirnames; this can be
> used to prune the search, impose a specific order of visiting, or even
> to inform walk() about directories the caller creates or renames
> before it resumes walk() again. Modifying dirnames when topdown is
> False has no effect on the behavior of the walk, because in bottom-up
> mode the directories in dirnames are generated before dirpath itself
> is generated.
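The pruning described in the quoted documentation can be sketched as follows. The directory names and the helper walk_pruned are hypothetical; the key point is that dirnames is changed in place with slice assignment, which only has an effect when topdown is True.

```python
import os
import tempfile


def walk_pruned(root, skip):
    """Walk root top-down, never descending into directories named in skip."""
    visited = []
    for dirpath, dirnames, _filenames in os.walk(root, topdown=True):
        # Slice assignment mutates the very list os.walk() will recurse over.
        dirnames[:] = [d for d in dirnames if d not in skip]
        visited.append(os.path.relpath(dirpath, root))
    return visited


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: keep/inner and skip_me/inner
    os.makedirs(os.path.join(tmp, "keep", "inner"))
    os.makedirs(os.path.join(tmp, "skip_me", "inner"))
    visited = walk_pruned(tmp, skip={"skip_me"})

print(sorted(visited))  # neither skip_me nor skip_me/inner is visited
```

Rebinding the name instead (dirnames = [...]) would create a new list that os.walk() never sees, so the walk would not be pruned.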

As of Python 3.5, os.walk() uses os.scandir(), which yields entries in arbitrary order. The same caveat appears in the os.listdir() documentation:

> The list is in arbitrary order, and does not include the special
> entries '.' and '..' even if they are present in the directory

Prior to Python 3.5, it used os.listdir(), which also returned the directories in arbitrary order. See also the answer to this question:
https://stackoverflow.com/questions/18282370/in-what-order-does-os-walk-iterates-iterate

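You can get the directories in the order you want by iterating over sorted(dirs) within your os.walk() loop, for example for d in sorted(dirs). Avoid naming the loop variable dir, since that shadows the built-in dir() function.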
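Here is a minimal sketch of a deterministic walk, assuming a temporary copy of the question's layout and a hypothetical wrapper named sorted_walk. Sorting dirnames in place (rather than only iterating over sorted(dirs)) also fixes the order in which os.walk() descends into subdirectories, which relies on topdown being True.

```python
import os
import tempfile


def sorted_walk(root):
    """Yield os.walk() triples with sibling directories and files sorted by name."""
    for dirpath, dirnames, filenames in os.walk(root, topdown=True):
        dirnames.sort()   # in place, so os.walk() also descends in this order
        filenames.sort()
        yield dirpath, dirnames, filenames


with tempfile.TemporaryDirectory() as tmp:
    # Created in reverse order on purpose; creation order does not determine listing order.
    for name in ("n01443537", "n01440764"):
        os.makedirs(os.path.join(tmp, "Images", name))

    printed = []
    for root, dirs, files in sorted_walk(os.path.join(tmp, "Images")):
        for d in dirs:
            printed.append(d)

print(printed)  # ['n01440764', 'n01443537']
```

If you only need sorted output and do not care about traversal order, iterating over sorted(dirs) in the loop body is enough and works with either value of topdown.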
