Python ruamel.yaml library adds new lines where not expected

Question


I'm using ruamel.yaml to load and edit a specific property in a yaml file.

I need to preserve everything else as-is. So far, the following code works almost perfectly:

import ruamel.yaml

yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True   # keep the original quoting style
yaml.explicit_start = True    # emit the leading "---"
yaml.indent(mapping=6, sequence=4, offset=2)

# Load the document, change one property, and write it back
with open("my.yaml", "r") as f:
    data = yaml.load(f)

data["my::property::user::name"] = "me"

with open("my.yaml", "w") as f:
    yaml.dump(data, f)

The YAML file is big, with a lot of properties, and I can't get the following cases to come out unchanged:

yaml.dump adds a line break inside the value of the following key:

my::property::group_name: "path\\Domain Admins"

Resulting in:

my::property::group_name: "%{path\\Domain
      Admins"

For some properties, it adds a line break right after the colon:

my::property::value: some-really-big-string-here

Resulting in:

my::property::value:
      some-really-big-string-here

Edit:

The following two lines get a third \ added, and the line is also broken:

some::random::name: "\\\\%{expression}\\%{expression}"
another::random::name: "\\\\%{expression}\\pathname\\"

The result is:

some::random::name: "\\\\%{expression}\\\
      %{expression}"
another::random::name: "\\\\%{expression}\\\
      pathname\\"

Maybe it's my YAML file that needs some data fixes, but is it possible to avoid this at the parser level?

Answer 1

Score: 1


I haven't tried to reproduce what you are getting, because I am pretty sure I can't with the examples given.

The dumper routine tries to fit key: value pairs on a line. The default line length is 80 characters.

If a value doesn't fit behind a key on a line it can be wrapped. In that case it will be quoted and split on a (single) space and a newline is inserted followed by enough spaces not to cause problems with indentation. If necessary this is repeated.

If the value cannot be split (because it has no spaces), it will be put on its own on the next line, indented relative to the start of the key. In some situations the dumper instead inserts a backslash followed by a newline.

A line can still end up longer than 80 characters, in which case the width limit is simply overruled. If you have a large indent for mappings (as you do) and small keys (shorter than the indent), this might not happen.

The most direct way to influence this is by setting:

 yaml.width = 4096

(choose a value that is larger than your longest line). This keeps every value on the same line as its key.

You could also explicitly "convert" values to ruamel.yaml.scalarstring.LiteralScalarString and then get key: value pairs looking like:

my::property::group_name: |
      %{path\\Domain Admins

Whatever representation the dumper chooses, depending on your settings, the string that is read back is the same as the original. So apart from aesthetic reasons you should not care.

There is no API or easy hook that lets you tell the dumper to always or never insert a newline after the value indicator (the ':' + space after the key). So I hope using yaml.width is an easy, acceptable solution for you.

(You can also leave the indent at the more usual default values, which gives you less chance of overflowing the standard width.)

huangapple
  • Posted on August 9, 2023, 04:09:13
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76862904.html