ruamel.yaml.representer.RepresenterError - Why can't ruamel.yaml represent an np.array?
Question
I'm new to Pydantic and ruamel.yaml and am currently working on a project that uses both.
I get an error when I try to load a model config and render it with mkdocs. It appears that ruamel.yaml is unable to dump an np.array object from my Pydantic schema.
File "/usr/local/lib/python3.10/site-packages/ruamel/yaml/representer.py", line 337, in represent_undefined
raise RepresenterError(f'cannot represent an object: {data!s}')
ruamel.yaml.representer.RepresenterError: cannot represent an object: [0 0 0 0]
It happens when I try to represent this:
std: Optional[np.ndarray] = np.array([1.0, 1.0, 1.0, 1.0])
mean: Optional[np.ndarray] = np.array([0, 0, 0, 0])
Is there a solution or something that I can change? Thanks in advance!
TEST:
I tried replacing it with:
std: Optional[np.ndarray] = [1.0, 1.0, 1.0, 1.0]
mean: Optional[np.ndarray] = [0, 0, 0, 0]
But I need these to be np.array objects for the rest of my code.
Answer 1
Score: 1
YAML has several types defined in the Language-Independent Types for YAML 1.1. numpy.array is not in that list, primarily because it is not language independent, so there is no standard representation for something like that.
So you will have to provide a representer yourself, and if you represent it with a tag, you can load the resulting representation back, provided you also supply a constructor for that tag.
It doesn't suffice to make a representer for numpy.array though, as neither std nor mean is of that type (numpy.array is the function that creates them), as you already indicate yourself by the type information you supply. You have to provide a representer for numpy.ndarray, and also for numpy.float64 (for the elements of std) and numpy.int64 (for the elements of mean).
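To see which types actually need a representer, it can help to inspect the objects directly. The snippet below is only a quick check, not part of the original answer. Note that the default integer dtype is platform dependent: it is numpy.int64 on most 64-bit Linux and macOS installations, but may be numpy.int32 on Windows with older NumPy versions.

```python
import numpy

std = numpy.array([1.0, 1.0, 1.0, 1.0])
mean = numpy.array([0, 0, 0, 0])

# The containers are numpy.ndarray instances; numpy.array is only the
# factory function that builds them.
print(type(std))

# The elements are NumPy scalar types, not the built-in float and int.
print(type(std[0]))
print(type(mean[0]))  # platform dependent, typically numpy.int64
```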
Once you do that you can dump std and mean:
import sys
from pathlib import Path
import numpy
import ruamel.yaml

path = Path('numpy.yaml')
np = numpy
data = dict(
    std=np.array([1.0, 1.0, 1.0, 1.0]),
    mean=np.array([0, 0, 0, 0]),
)

def represent_numpy_float64(self, value):
    # convert to a built-in float first: represent_float() relies on repr(),
    # and NumPy 2 changed the repr of its scalars to e.g. np.float64(1.0)
    return self.represent_float(float(value))  # alternatively dump as a tagged float

def represent_numpy_int64(self, value):
    return self.represent_int(int(value))  # alternatively dump as a tagged int

def represent_numpy_array(self, array, flow_style=None):
    tag = '!numpy.ndarray'
    value = []
    node = ruamel.yaml.nodes.SequenceNode(tag, value, flow_style=flow_style)
    # represent each element with whatever representer is registered for its type
    for elem in array:
        node_elem = self.represent_data(elem)
        value.append(node_elem)
    if flow_style is None:
        node.flow_style = True  # default to the compact [a, b, c] notation
    return node

yaml = ruamel.yaml.YAML()
yaml.Representer.add_representer(numpy.ndarray, represent_numpy_array)
yaml.Representer.add_representer(numpy.float64, represent_numpy_float64)
yaml.Representer.add_representer(numpy.int64, represent_numpy_int64)
yaml.dump(data, path)
print(path.read_text(), end='')  # the file ends in a newline, don't add another one
which gives:
std: !numpy.ndarray [1.0, 1.0, 1.0, 1.0]
mean: !numpy.ndarray [0, 0, 0, 0]
ruamel.yaml's default round-trip loader can load the resulting file:
import sys
yaml = ruamel.yaml.YAML()
rtd = yaml.load(path)
print(f'{rtd}')
print('type of std', type(rtd['std']))
yaml.dump(rtd, sys.stdout)
which gives:
{'std': [1.0, 1.0, 1.0, 1.0], 'mean': [0, 0, 0, 0]}
type of std <class 'ruamel.yaml.comments.CommentedSeq'>
std: !numpy.ndarray [1.0, 1.0, 1.0, 1.0]
mean: !numpy.ndarray [0, 0, 0, 0]
As you can see, this doesn't load the file as a numpy.ndarray. For that you need to provide a constructor for the tag (since the floats/ints are not dumped with a tag, you don't need to provide constructors for those):
def construct_numpy_array(self, node):
    return numpy.array(self.construct_sequence(node))
yaml.Constructor.add_constructor('!numpy.ndarray', construct_numpy_array)
rtd = yaml.load(path)
print(f'{rtd}')
print('type of std', type(rtd['std']))
which gives:
{'std': array([1., 1., 1., 1.]), 'mean': array([0, 0, 0, 0])}
type of std <class 'numpy.ndarray'>
Note that the above doesn't work for (indirectly) recursive structures, as construct_numpy_array does not have a yield statement. It would be better to do something like:
def construct_numpy_array(self, node):
    data = numpy.array([])
    yield data
    numpy.append(data, numpy.array(self.construct_sequence(node)))
but that did not give the same result: numpy.append returns a new array instead of extending data in place, so the array that was yielded stays empty.