ruamel.yaml.representer.RepresenterError - Why can't ruamel.yaml represent an np.array?

Question

I'm new to Pydantic and ruamel.yaml, and I am currently working on a project that uses both.

I get an error when I try to load a model config and represent it with mkdocs. It appears that ruamel.yaml is unable to dump an np.array object from my Pydantic schema.

File "/usr/local/lib/python3.10/site-packages/ruamel/yaml/representer.py", line 337, in represent_undefined
                raise RepresenterError(f'cannot represent an object: {data!s}')
            ruamel.yaml.representer.RepresenterError: cannot represent an object: [0 0 0 0]

It happens when I try to represent this:

std: Optional[np.ndarray] = np.array([1.0, 1.0, 1.0, 1.0])
mean: Optional[np.ndarray] = np.array([0, 0, 0, 0])

Is there a solution or something that I can change? Thanks in advance!

TEST:
I tried replacing it with:

std: Optional[np.ndarray] = [1.0, 1.0, 1.0, 1.0]
mean: Optional[np.ndarray] = [0, 0, 0, 0]

But I need these to be np.array objects for the rest of my code.

Answer 1

Score: 1

YAML has several types defined in the Language Independent types for YAML 1.1.
numpy.array is not in that list, primarily because it is not language independent, so there is no representation for something non-standard like that.

So you will have to provide a representer yourself, and if you represent it with a tag, you can load back the resulting representation if you provide a constructor for that tag.

A representer for numpy.array is not enough though, as neither std nor mean is of that type, as
you already indicate yourself by the type information you supply. You have to provide a representer for numpy.ndarray,
and also for numpy.float64 (for the elements of std) and numpy.int64 (for the elements of mean).
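The types involved can be checked directly. The following is a minimal sketch, assuming only that NumPy is installed; the names std_elem_type, mean_elem_type, std_plain and mean_plain are introduced here for illustration. The integer element type depends on platform and NumPy version (typically numpy.int64), so the sketch only relies on it being a NumPy integer type.

```python
import numpy as np

std = np.array([1.0, 1.0, 1.0, 1.0])
mean = np.array([0, 0, 0, 0])

# The containers are numpy.ndarray; the elements are NumPy scalar types,
# not the built-in float/int that a YAML representer knows about.
std_elem_type = type(std[0])
mean_elem_type = type(mean[0])
print(type(std).__name__, std_elem_type.__name__)
print(type(mean).__name__, mean_elem_type.__name__)

# tolist() converts the array and its elements to built-in Python types.
std_plain = std.tolist()
mean_plain = mean.tolist()
print(type(std_plain[0]).__name__, type(mean_plain[0]).__name__)
```

If the tagged round trip is not needed, dumping the result of tolist() may be a simpler alternative, at the cost of loading plain lists back instead of arrays.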

Once you do that you can dump std and mean.

import sys
from pathlib import Path
import numpy
import ruamel.yaml

path = Path('numpy.yaml')

np = numpy
data = dict(
  std = np.array([1.0, 1.0, 1.0, 1.0]),
  mean = np.array([0, 0, 0, 0]),
)

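def represent_numpy_float64(self, value):
    # convert to a built-in float first, so the output does not depend on
    # how the installed NumPy version formats the repr of its scalars
    return self.represent_float(float(value))  # alternatively dump as a tagged float

def represent_numpy_int64(self, value):
    return self.represent_int(int(value))  # alternatively dump as a tagged int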

def represent_numpy_array(self, array, flow_style=None):
    tag = '!numpy.ndarray'
    value = []
    node = ruamel.yaml.nodes.SequenceNode(tag, value, flow_style=flow_style)
    for elem in array:
        node_elem = self.represent_data(elem)
        value.append(node_elem)
    if flow_style is None:
        node.flow_style = True
    return node


yaml = ruamel.yaml.YAML()
yaml.Representer.add_representer(numpy.ndarray, represent_numpy_array)
yaml.Representer.add_representer(numpy.float64, represent_numpy_float64)
yaml.Representer.add_representer(numpy.int64, represent_numpy_int64)
yaml.dump(data, path)
print(path.read_text(), end='')  # the file ends in a newline, don't add another one

which gives:

std: !numpy.ndarray [1.0, 1.0, 1.0, 1.0]
mean: !numpy.ndarray [0, 0, 0, 0]

ruamel.yaml's default round-trip loader can load the resulting file:

import sys

yaml = ruamel.yaml.YAML()
rtd = yaml.load(path)
print(f'{rtd}')
print('type of std', type(rtd['std']))
yaml.dump(rtd, sys.stdout)

which gives:

{'std': [1.0, 1.0, 1.0, 1.0], 'mean': [0, 0, 0, 0]}
type of std <class 'ruamel.yaml.comments.CommentedSeq'>
std: !numpy.ndarray [1.0, 1.0, 1.0, 1.0]
mean: !numpy.ndarray [0, 0, 0, 0]

As you can see, this doesn't load the file as a numpy.ndarray. For that you need to provide a constructor
for the tag (since we don't dump the floats/ints tagged, you don't need to provide a constructor for those):

def construct_numpy_array(self, node):
    return numpy.array(self.construct_sequence(node))

yaml.Constructor.add_constructor('!numpy.ndarray', construct_numpy_array)

rtd = yaml.load(path)
print(f'{rtd}')
print('type of std', type(rtd['std']))

which gives:

{'std': array([1., 1., 1., 1.]), 'mean': array([0, 0, 0, 0])}
type of std <class 'numpy.ndarray'>

Note that the above doesn't work for (indirectly) recursive dumps, as the construct_numpy_array
does not have a yield statement. It would be better to do something like:

def construct_numpy_array(self, node):
    data = numpy.array([])
    yield data
    numpy.append(data, numpy.array(self.construct_sequence(node)))

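but that did not give the same result: numpy.append returns a new array and leaves data untouched, so the array that was yielded stays empty (I am not familiar enough with numpy to fix that properly).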
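Why the two-step version falls short can be checked in isolation. This is a minimal sketch, assuming only that NumPy is installed; the name result is introduced here for illustration and is not part of the constructor above.

```python
import numpy

data = numpy.array([])

# numpy.append does not modify its first argument; it builds and returns
# a new array that holds the combined values.
result = numpy.append(data, numpy.array([1.0, 1.0, 1.0, 1.0]))

print(data.size)        # size of the original array after the call
print(result.tolist())  # contents of the newly created array
print(result is data)   # whether both names refer to the same object
```

Because the constructor yields data before the append and discards the return value, the loader only ever sees the empty array.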

Posted by huangapple on June 8, 2023, 16:33:34
When reposting, please keep the link to this article: https://go.coder-hub.com/76430001.html