How to keep yaml Format when using yaml.SafeDumper

Question

I have a YAML file

Version: "1.0"
title1: "Title 1"
title2: [Title 2]

I open the file using

def open_file(input_file):
    with open(input_file, encoding="utf8") as file:
        return_file = yaml.safe_load(file)
        return return_file

At runtime it looks like this:

{'Version': '1.0', 'title1': 'Title 1', 'title2': ['Title 2']}

The output I receive is:

Version: "1.0"
title1: Title 1
title2:
- Title 2

How do I keep the original formatting for "title1" and "title2"?

I write the file like this:

    with open(output_file_name, "w", encoding="utf8") as dump_file:
        yaml.dump(<runtime_file>, dump_file, Dumper=MyDumper, sort_keys=False, allow_unicode=True)

with

class MyDumper(yaml.SafeDumper):
    def write_line_break(self, data=None):
        super().write_line_break(data)
        if len(self.indents) == 1:
            super().write_line_break()
        if len(self.indents) == 2:
            super().write_line_break()

Answer 1

Score: 1

You are using PyYAML, which supports only (a subset of) YAML 1.1, whereas YAML 1.2 was released more than thirteen
years ago. You should upgrade to ruamel.yaml (disclaimer: I am the author), which has been developed since 2014
specifically to preserve original layout, quotes, comments, anchors/aliases, number formats, etc.

import ruamel.yaml
from pathlib import Path

file_in = Path('input.yaml')
file_out = Path('output.yaml')

# The default YAML() instance works in round-trip mode
yaml = ruamel.yaml.YAML()
# yaml.indent(mapping=4, sequence=4, offset=2)
yaml.preserve_quotes = True  # keep the quotes found in the input
data = yaml.load(file_in)
yaml.dump(data, file_out)

print(file_out.read_text())

which gives:

Version: "1.0"
title1: "Title 1"
title2: [Title 2]

Since some of your scalar strings are superfluously quoted and others are not, you
cannot easily achieve this in PyYAML. You would have to load the quoted strings into
a different class than the non-quoted ones, so that they dump back in their original form.
Depending on what you need to do with the loaded data, you need to make those classes behave
mostly like a normal string. For that you need to dig into the PyYAML internals. ruamel.yaml takes
care of all of that for you (in its default round-trip mode).
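
For comparison, the PyYAML route described above might look roughly like the sketch below. It is not taken from the original answer. The names DoubleQuoted, FlowList, StyleLoader and StyleDumper are hypothetical helpers introduced for illustration, and the sketch only covers double-quoted strings and flow-style sequences. Single quotes, comments, anchors and number formats would still be lost.

```python
import yaml


class DoubleQuoted(str):
    """Hypothetical marker type: a string that was double-quoted in the input."""


class FlowList(list):
    """Hypothetical marker type: a sequence written in flow style, e.g. [a, b]."""


class StyleLoader(yaml.SafeLoader):
    """Hypothetical loader subclass so the stock SafeLoader stays untouched."""


class StyleDumper(yaml.SafeDumper):
    """Hypothetical dumper subclass so the stock SafeDumper stays untouched."""


def construct_str(loader, node):
    # node.style is '"' for double-quoted scalars and None for plain ones
    value = loader.construct_scalar(node)
    return DoubleQuoted(value) if node.style == '"' else value


def construct_seq(loader, node):
    # node.flow_style is True for [a, b] and False for block sequences
    items = loader.construct_sequence(node, deep=True)
    return FlowList(items) if node.flow_style else items


def represent_quoted(dumper, data):
    return dumper.represent_scalar("tag:yaml.org,2002:str", str(data), style='"')


def represent_flow(dumper, data):
    return dumper.represent_sequence("tag:yaml.org,2002:seq", data, flow_style=True)


StyleLoader.add_constructor("tag:yaml.org,2002:str", construct_str)
StyleLoader.add_constructor("tag:yaml.org,2002:seq", construct_seq)
StyleDumper.add_representer(DoubleQuoted, represent_quoted)
StyleDumper.add_representer(FlowList, represent_flow)

source = 'Version: "1.0"\ntitle1: "Title 1"\ntitle2: [Title 2]\n'
data = yaml.load(source, Loader=StyleLoader)
result = yaml.dump(data, Dumper=StyleDumper, sort_keys=False, allow_unicode=True)
print(result)
```

Because DoubleQuoted subclasses str and FlowList subclasses list, the loaded values should behave like ordinary strings and lists in most code, but any operation that builds a new value (slicing, concatenation, and so on) returns a plain str or list and drops the style marker.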

huangapple
  • Posted on February 23, 2023, 20:51:18
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75545082.html