Python pathlib joinpath() drops part of the main path if the other path starts with a slash

Question

pathlib's Path.joinpath() seems to drop a subfolder of the main path when it is joined with another path that starts with a forward slash:

from pathlib import Path
a = Path(r'c:\main_folder')
element_a = r'subfolderX/subfolderXY/file.txt'
a.joinpath(element_a)
Out[]: WindowsPath('c:/main_folder/subfolderX/subfolderXY/file.txt') # expected
element_b = r'/subfolderX/subfolderXY/file.txt'
a.joinpath(element_b)
Out[]: WindowsPath('c:/subfolderX/subfolderXY/file.txt')  # ?! missing 'main_folder'

When joining the path with element_b, the result is missing 'main_folder' (!)

I must be missing something fundamental, because I find this very confusing and likely to cause hard-to-find bugs.

Firstly, I would expect the purpose of pathlib and its methods to be handling the intricacies of file paths gracefully, shielding developers from this ugliness. Save for ambiguous cases, but then ->

Secondly, in an ambiguous situation (Windows vs POSIX differences or whatnot), I'd expect the library or method to raise an error or refuse to proceed, rather than silently give me an unexpected result.

And thirdly, and mainly, I'd expect the joinpath() method to join the two paths above anything else.

I see in the joinpath() docs that it returns a
new path representing either a subpath (if all arguments are relative
paths) or a totally different path (if one of the arguments is
anchored)

What does 'anchored' mean, and how do I work around it? If I am constructing a path out of various elements, some configured, some provided by users, some stripped from other paths, am I back to awkwardly testing for slashes by hand and building string concatenations based on that?

Answer 1

Score: 1

The leading \ acts as an anchor. An unanchored (relative) path can be appended to other paths, but an anchored path already states where it starts, so it is not interpreted relative to whatever path comes before it. Here the leading \ binds the path to the root of the current drive, in your case C:\.
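To make "anchored" concrete, here is a small sketch that inspects the drive, root and anchor attributes pathlib exposes. It uses PureWindowsPath (an assumption made so the snippet applies Windows rules on any operating system); the variable names are only illustrative.

```python
from pathlib import PureWindowsPath

# PureWindowsPath applies Windows rules without touching the filesystem,
# so this also runs on Linux or macOS.
relative = PureWindowsPath(r'subfolderX\file.txt')
rooted = PureWindowsPath(r'\subfolderX\file.txt')
absolute = PureWindowsPath(r'c:\main_folder')

for p in (relative, rooted, absolute):
    # anchor is the concatenation of drive and root
    print(repr(p.drive), repr(p.root), repr(p.anchor))
```

Only the first path has an empty anchor. The second has a root but no drive, which is why joining it keeps the drive of the base path and replaces everything after it.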

Your command is interpreted like so:

  • Go to C:\main_folder
  • Follow \, which brings you back to C:\
  • Go to subfolderX\subfolderXY\file.txt from there.

Effectively, you are telling Python to join C:\main_folder and C:\subfolderX\subfolderXY\file.txt.

If an anchored path is given to joinpath, the path segments that precede it are discarded, as described in the docs.

The simplest way to discard leading slashes is "\\abc\\".lstrip("/\\"). This removes zero or more leading / or \ characters.
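The behaviour described above can be reproduced on any platform with PureWindowsPath. This is a sketch of the three cases (relative, rooted, and a fragment with its own drive); the d:\other path is a made-up example, not something from the question.

```python
from pathlib import PureWindowsPath

base = PureWindowsPath(r'c:\main_folder')

# Relative fragment: appended below the base.
joined_relative = base.joinpath('subfolderX/subfolderXY/file.txt')

# Rooted fragment (leading slash, no drive): the drive of the base is kept,
# everything after the drive is replaced.
joined_rooted = base.joinpath('/subfolderX/subfolderXY/file.txt')

# Fragment with its own drive and root: the base is discarded entirely.
joined_other_drive = base.joinpath(r'd:\other\file.txt')

print(joined_relative)
print(joined_rooted)
print(joined_other_drive)
```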
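As a minimal sketch, the lstrip approach can be wrapped in a small helper. The name join_unanchored is hypothetical and not part of pathlib; the helper only removes leading slashes, so a fragment that carries its own drive (for example d:\x) would still replace the base.

```python
from pathlib import PureWindowsPath

def join_unanchored(base, fragment):
    """Hypothetical helper: join fragment below base, ignoring leading slashes."""
    # lstrip removes zero or more leading '/' or '\' characters
    return base.joinpath(fragment.lstrip('/\\'))

base = PureWindowsPath(r'c:\main_folder')
result = join_unanchored(base, '/subfolderX/subfolderXY/file.txt')
print(result)
```

With the leading slash removed, the fragment is treated as relative and 'main_folder' is preserved.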

Answer 2

Score: 0

> Firstly I would expect pathlib and its methods purpose is to gracefully handle filepaths intricacies to shield developers of dealing with this ugliness. Save for ambiguous cases

Which it does: joinpath handles upward and root (anchored) traversals for you.

> Secondly, if ambiguous situation (windows vs posix differences or whatnot) I'd expect the library or method crash \ won't allow to use it, rather than silently give me unexpected result

That doesn't make any sense. The goal of pathlib is not to provide a half-assed common subset of all possible filesystems (which would be useless, since some filesystems / APIs don't even have directories). \ is a path separator on Windows and a regular character on Unices; that's just how it is.

The module documentation literally tells you:

> This module offers classes representing filesystem paths with semantics appropriate for different operating systems.

So it follows the semantics of the current system. If you want the semantics of one system over another, use the Pure*Path objects, as the documentation also tells you:

> If you want to manipulate Windows paths on a Unix machine (or vice versa). You cannot instantiate a WindowsPath when running on Unix, but you can instantiate PureWindowsPath.
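As an illustration of those per-system semantics, the following sketch parses the same raw string with both pure path flavours. The string folder\file.txt is an arbitrary example.

```python
from pathlib import PurePosixPath, PureWindowsPath

raw = r'folder\file.txt'

# On Windows semantics the backslash separates two components.
windows_parts = PureWindowsPath(raw).parts

# On POSIX semantics the backslash is an ordinary character in one file name.
posix_parts = PurePosixPath(raw).parts

print(windows_parts)
print(posix_parts)
```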

> and thirdly, mainly, I'd expect joinpath() method to join the two paths above anything else

Which it does. It essentially acts as a sequence of directory traversals (cd).
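The cd analogy is approximate, and a short sketch shows where it holds. A rooted argument restarts from the root, like cd /etc would, while '..' segments are kept as written rather than collapsed, because the join is purely lexical. The paths used here are made-up examples.

```python
from pathlib import PurePosixPath

base = PurePosixPath('/srv/data')

# A rooted argument replaces everything before it, like `cd /etc`.
restarted = base.joinpath('/etc')

# '..' is preserved in the result; pathlib does not collapse it when joining.
with_parent = base.joinpath('../logs', 'app.log')

print(restarted)
print(with_parent)
```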

If you just want straight-up string manipulation, you already have string operations; there's no need for a dedicated module.

> What is 'anchored' and how to work around it, if I am constructing a path out of various elements, some configured, some provided by users, some stripped of other paths am I back to awkwardly manually testing for the slashes and constructing various string concatenations based on that?

If you consider thing.strip('/') to be "awkwardly manually testing for the slashes", then yes. You'll face the same issue in more or less every language: they don't provide filesystem path manipulation for trivial string concatenation; that'd be dumb.
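For the user-supplied case raised in the question, one possible sketch combines the slash stripping with a check for '..' segments. The helper name safe_join and the /srv/data base are assumptions for illustration only; this is a lexical check under POSIX semantics, not a complete defence against symlinks or other filesystem tricks.

```python
from pathlib import PurePosixPath

def safe_join(base, user_fragment):
    """Hypothetical helper: join user input below base, rejecting '..' segments."""
    # Drop leading slashes so the fragment is treated as relative.
    fragment = PurePosixPath(user_fragment.lstrip('/'))
    if '..' in fragment.parts:
        raise ValueError(f'parent traversal not allowed: {user_fragment!r}')
    return PurePosixPath(base) / fragment

print(safe_join('/srv/data', '/reports/q1.csv'))
```

Raising an error for suspicious input, rather than silently rewriting it, is a design choice made here to match the asker's preference for failing loudly.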

huangapple
  • Posted on June 19, 2023, 17:58:02
  • If you repost, please keep the link to this article: https://go.coder-hub.com/76505530.html