How to keep YAML formatting when using yaml.SafeDumper
Question
I have a YAML file:
Version: "1.0"
title1: "Title 1"
title2: [Title 2]
I open the file using:
import yaml

def open_file(input_file):
    # Parse the YAML file into plain Python objects (dict, list, str, ...)
    with open(input_file, encoding="utf8") as file:
        return_file = yaml.safe_load(file)
    return return_file
At runtime it looks like this:
{'Version': '1.0', 'title1': 'Title 1', 'title2': ['Title 2']}
The output I receive is:
Version: "1.0"
title1: Title 1
title2:
- Title 2
How do I keep the original formatting for "title1" and "title2"?
I write the file like this:
with open(output_file_name, "w", encoding="utf8") as dump_file:
    yaml.dump(<runtime_file>, dump_file, Dumper=MyDumper, sort_keys=False, allow_unicode=True)
with
class MyDumper(yaml.SafeDumper):
    # Emit an extra line break after top-level and second-level entries
    def write_line_break(self, data=None):
        super().write_line_break(data)
        if len(self.indents) == 1:
            super().write_line_break()
        if len(self.indents) == 2:
            super().write_line_break()
Answer 1
Score: 1
You are using PyYAML, which supports only (a subset of) YAML 1.1, whereas YAML 1.2 was released more than thirteen
years ago. You should upgrade to ruamel.yaml
(disclaimer: I am the author), which has been developed since 2014
specifically to preserve original layout, quotes, comments, anchors/aliases, number formats, etc.
import ruamel.yaml
from pathlib import Path
file_in = Path('input.yaml')
file_out = Path('output.yaml')
yaml = ruamel.yaml.YAML()  # default is the round-trip loader/dumper
# yaml.indent(mapping=4, sequence=4, offset=2)
yaml.preserve_quotes = True  # keep superfluous quotes around scalars
data = yaml.load(file_in)
yaml.dump(data, file_out)
print(file_out.read_text())
which gives:
Version: "1.0"
title1: "Title 1"
title2: [Title 2]
Since some of your scalar strings are superfluously quoted and others are not, you
cannot easily achieve this in PyYAML. You would have to load the quoted strings into
a different class than the non-quoted ones, so that they dump back in their original form.
Depending on what you need to do with the loaded data, you need to make those classes behave
mostly like a normal string. For that you need to dig into the PyYAML internals. ruamel.yaml
takes care of all of that for you (in its default round-trip mode).
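For readers who have to stay on PyYAML, the approach described above can be sketched roughly as follows. This is an illustrative assumption of how it might look, not code from the answer: the names DoubleQuoted, FlowList, KeepLoader and KeepDumper are hypothetical, only double-quoted scalars and flow-style sequences are handled, and comments, single quotes and recursive structures are ignored.

```python
import yaml


class DoubleQuoted(str):
    """Hypothetical marker type: a string that was double-quoted in the source."""


class FlowList(list):
    """Hypothetical marker type: a sequence written in flow style, e.g. [a, b]."""


class KeepLoader(yaml.SafeLoader):
    pass


class KeepDumper(yaml.SafeDumper):
    pass


def construct_str(loader, node):
    # node.style is '"' for double-quoted scalars and None for plain ones
    value = loader.construct_scalar(node)
    return DoubleQuoted(value) if node.style == '"' else value


def construct_seq(loader, node):
    # node.flow_style is True for [..] sequences and False for block sequences
    items = loader.construct_sequence(node, deep=True)
    return FlowList(items) if node.flow_style else items


def represent_double_quoted(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', str(data), style='"')


def represent_flow_list(dumper, data):
    return dumper.represent_sequence('tag:yaml.org,2002:seq', data, flow_style=True)


KeepLoader.add_constructor('tag:yaml.org,2002:str', construct_str)
KeepLoader.add_constructor('tag:yaml.org,2002:seq', construct_seq)
KeepDumper.add_representer(DoubleQuoted, represent_double_quoted)
KeepDumper.add_representer(FlowList, represent_flow_list)

source = 'Version: "1.0"\ntitle1: "Title 1"\ntitle2: [Title 2]\n'
data = yaml.load(source, Loader=KeepLoader)
result = yaml.dump(data, Dumper=KeepDumper, sort_keys=False, allow_unicode=True)
print(result)
```

Because DoubleQuoted subclasses str and FlowList subclasses list, the loaded values still compare equal to ordinary strings and lists, which is the "behave mostly like a normal string" requirement mentioned above. Values replaced at runtime with plain str or list objects lose the marker and are dumped in PyYAML's default style.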