YAML representation of a dictionary from multi-level ctypes Structure gets a strange object

Question


TL;DR: I have a ctypes Structure with another Structure nested inside it. I convert it (seemingly correctly) to a Python dictionary and then try to dump it to YAML, but the value of the inner Structure comes out wrong.

Python version: 3.10
PyYAML version: 6.0

Background: I am trying to make our internal configuration handling more user-friendly. Our configuration files are serialized copies of the data in a C structure, and until now they have been edited manually in a hex editor. My plan is to make this process more human-readable, in the short term by reading and writing YAML files.

I am using ctypes Structure to serialize and deserialize the data, mainly because I have to support bitfields. The ctypes Structure can also be used as a template, without actual values in it.

I have simplified the code by removing irrelevant functions and shortening the structures.

    from ctypes import Union, c_int8, c_int16, c_uint8

    import numpy as np

    # My_Structure is defined in the next listing.
    class substruct_t(My_Structure):
        _pack_ = 1
        _fields_ = [
            ("app", c_uint8, 4),
            ("showScale", c_uint8, 2),
            ("showIdleTemp", c_uint8, 2),
            ("type", c_uint8),
        ]

    class ee_struct(My_Structure):
        _pack_ = 1
        _fields_ = [
            ("txStatusDelay", c_uint8, 5),
            ("overrideHours", c_uint8, 3),
            ("manualSet", c_int16),
            ("tempOffset", c_int8),
            ("substruct", substruct_t),
            ("LogArr", (c_uint8*6)*3),
            ("frostTemp", c_int8),
            ("fileVer", c_uint8*4),
        ]

    # Union giving both a structured and a raw-byte view of the same 29 bytes
    class eeprom_t(Union):
        _fields_ = [("as_struct", ee_struct), ("as_bytes", c_uint8*29)]

        def __str__(self) -> str:
            return str(self.as_struct)

        def get_as_dict(self):
            return self.as_struct.as_dict()

        def get_as_bytes(self):
            return np.ndarray((29, ), 'b', self.as_bytes, order='C')

I have created a My_Structure class, which inherits from ctypes Structure and offers several representations of the data, including a dict. This seems to work well.

    from ctypes import Structure

    # Child class of Structure with string and dictionary representation functions, unwrapping arrays
    class My_Structure(Structure):
        def __recursive_carray_get(self, value):
            # Necessary recursive function, if value is ctype array
            if hasattr(value, '__len__'):
                rtn = list()
                for i in range(value.__len__()):
                    rtn.append(self.__recursive_carray_get(value.__getitem__(i)))
            else:
                rtn = value
            return rtn

        def __handle_array_type__(self, type):
            # example unformatted type: <class '__main__.c_ubyte_Array_6_Array_3'>
            # StringBetween is a helper that is not shown in the question
            return StringBetween("'", "'", str(type)).split(".")[1]

        def __repr__(self) -> str:
            return str(self.as_dict())

        def __str__(self) -> str:
            values = ",\n".join(f"{name}={value['value']}" for name, value in self.as_dict().items())
            return f"<{self.__class__.__name__}: {values}>"

        def as_dict(self) -> dict:
            return {field[0]: {'value': self.__recursive_carray_get(getattr(self, field[0])), 'type': self.__handle_array_type__(field[1])}
                    for field in self._fields_}

However, when I dump this dict to a YAML file, the value of substruct within ee_struct is dumped badly, as a Python object. I do not understand why, since a type check shows it is still a dict.

    ### Dict representation of substructure:
    {'value': {'app': {'value': 2, 'type': 'c_ubyte'}, 'showScale': {'value': 0, 'type': 'c_ubyte'}, 'showIdleTemp': {'value': 1, 'type': 'c_ubyte'}, 'type': {'value': 2, 'type': 'c_ubyte'}}, 'type': 'substruct_t'}
    ### pyyaml dump of entire structure:
    txStatusDelay: {value: 8, type: c_ubyte}
    overrideHours: {value: 1, type: c_ubyte}
    manualSet: {value: 100, type: c_short}
    tempOffset: {value: 0, type: c_byte}
    substruct:
      value: !!python/object/apply:_ctypes._unpickle
      - !!python/name:__main__.substruct_t ''
      - !!python/tuple
        - {}
        - !!binary |
          QgI=
      type: substruct_t
    LogArr:
      value:
      - [1, 0, 0, 0, 0, 0]
      - [2, 0, 0, 0, 0, 0]
      - [3, 0, 0, 0, 0, 0]
      type: c_ubyte_Array_6_Array_3
    frostTemp: {value: 16, type: c_byte}
    fileVer:
      value: [65, 66, 104, 0]
      type: c_ubyte_Array_4
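The `!!binary` payload in the dump above can be decoded to see what PyYAML actually stored. The sketch below assumes a little-endian machine and redeclares `substruct_t` on top of a plain `ctypes.Structure` (a stand-in for `My_Structure`, and without `_pack_`, which should not matter for two `c_uint8` storage units). It suggests that the blob is simply the raw bytes of the nested structure, meaning the value was pickled as a ctypes object instead of being represented as a mapping.

```python
import base64
from ctypes import Structure, c_uint8


class substruct_t(Structure):
    # Same fields as in the question; plain Structure is an assumption made
    # to keep this sketch self-contained.
    _fields_ = [
        ("app", c_uint8, 4),
        ("showScale", c_uint8, 2),
        ("showIdleTemp", c_uint8, 2),
        ("type", c_uint8),
    ]


raw = base64.b64decode("QgI=")           # payload copied from the YAML dump
sub = substruct_t.from_buffer_copy(raw)  # rebuild the structure from its bytes
decoded = {name: getattr(sub, name) for name, *_ in substruct_t._fields_}

print(raw.hex(), decoded)
```

On a little-endian machine the decoded fields should match the values in the dict representation shown earlier (`app` 2, `showScale` 0, `showIdleTemp` 1, `type` 2).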

If I use a hardcoded dict with exactly the same contents that get_as_dict returns, everything works.

Apparently, the dump functions don't receive the same data that gets printed from get_as_dict. Why is that, and how can I fix it?

What I tried:

My first idea was to implement a recursive function that returns a dict for inner structures (similar to what I did for arrays), but I was not sure where to start, since substruct is already reported as a dict and the hardcoded (string) representation works.

"How to export a Pydantic model instance as YAML with URL type as string" seemed like a good approach, but combining Structure and YAMLObject resulted in a metaclass conflict that I was unable to resolve.

I tried dumping to JSON and using ruamel.yaml; both throw an exception complaining about substruct_t.
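The JSON failure is itself a useful hint. The sketch below uses a cut-down, hypothetical `Inner`/`Outer` pair (not the real structures) whose `as_dict` copies field values without recursing, which is assumed to mimic the behavior in the question. `json.dumps` then raises `TypeError` for the nested structure, and the standard `default=` hook is one possible way to convert it on the fly.

```python
import json
from ctypes import Structure, c_uint8


class DictStructure(Structure):
    """Hypothetical stand-in for My_Structure: as_dict() does not recurse."""

    def as_dict(self):
        return {name: {"value": getattr(self, name), "type": ctype.__name__}
                for name, ctype, *_ in self._fields_}


class Inner(DictStructure):
    _fields_ = [("app", c_uint8, 4), ("type", c_uint8)]


class Outer(DictStructure):
    _fields_ = [("tempOffset", c_uint8), ("substruct", Inner)]


d = Outer().as_dict()

try:
    json.dumps(d)
    failed = False
except TypeError as exc:  # the nested Inner instance is not JSON serializable
    failed = True
    print("json.dumps failed:", exc)


def to_plain(obj):
    # json calls this only for objects it cannot serialize by itself.
    if isinstance(obj, DictStructure):
        return obj.as_dict()
    raise TypeError(f"unsupported type: {type(obj).__name__}")


text = json.dumps(d, default=to_plain)
plain = json.loads(text)  # made of dicts, ints and strings only
print(plain["substruct"]["value"])
```

The dict obtained from `json.loads` contains only plain Python types, so it could be handed to a YAML dumper without any custom representer.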

"Combining Dumper class with string representer to get exact required YAML output" could be the right approach, but it looks quite complicated, and I am hoping there is a simpler solution that I have overlooked.

I just found a dirty fix, with the following steps:

  • convert the dict from get_as_dict() to a string
  • replace all ' characters with "
  • use json.loads() on the string to create a new dict, and use that instead

It works, but it only underlines my question: why do the two dicts look different to the dumpers?
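The workaround steps above can be sketched as follows, using a hypothetical one-field structure pair in place of the real ones. It relies on `__repr__` returning dict-like text and would break as soon as a string value contained a quote character, so it is probably better treated as a diagnostic than as a fix.

```python
import json
from ctypes import Structure, c_uint8


class ReprStructure(Structure):
    """Hypothetical stand-in for My_Structure (non-recursive as_dict)."""

    def as_dict(self):
        return {name: {"value": getattr(self, name), "type": ctype.__name__}
                for name, ctype, *_ in self._fields_}

    def __repr__(self):
        return str(self.as_dict())


class Inner(ReprStructure):
    _fields_ = [("app", c_uint8, 4)]


class Outer(ReprStructure):
    _fields_ = [("substruct", Inner)]


d = Outer().as_dict()
as_text = str(d)                     # step 1: dict -> string (goes through __repr__)
as_json = as_text.replace("'", '"')  # step 2: single quotes -> double quotes
rebuilt = json.loads(as_json)        # step 3: parse the text into a fresh dict

print(json.dumps(rebuilt, indent=2))
```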

Answer 1

Score: 0

Listing [\[Python.Docs\]: ctypes - A foreign function library for Python](https://docs.python.org/library/ctypes.html#module-ctypes).

The problem is that *\_\_recursive\_carray\_get* does what its name suggests (it handles arrays, which is consistent).

**But** when it comes to (sub)structures, it doesn't handle them (or rather, it handles them like any basic type):

1. So, when the *ee\_struct* instance is serialized to a dictionary (by calling its *as\_dict* method), **the value corresponding to the *substruct* key is actually a *substruct\_t* instance**

2. Because *\_\_repr\_\_* is overridden (to also use *as\_dict*), when the dictionary is printed, **the *substruct\_t* instance is also displayed as a dictionary, masking the previous error**
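The masking effect described in the two points above can be reproduced with a small, hypothetical structure pair (the class names below are illustrative and do not come from the question). Printing the value looks fine because `__repr__` is overridden, while an explicit `isinstance` check on the nested value tells a different story.

```python
from ctypes import Structure, c_uint8


class MaskingStructure(Structure):
    """Illustrative base class: as_dict() does not recurse into structures."""

    def as_dict(self):
        return {name: {"value": getattr(self, name), "type": ctype.__name__}
                for name, ctype, *_ in self._fields_}

    def __repr__(self):
        # Makes an instance *print* like a dict without being one.
        return str(self.as_dict())


class Inner(MaskingStructure):
    _fields_ = [("app", c_uint8, 4)]


class Outer(MaskingStructure):
    _fields_ = [("substruct", Inner)]


value = Outer().as_dict()["substruct"]["value"]
looks_like = repr(value)                  # text that resembles a dict
is_dict = isinstance(value, dict)         # actual type check
is_struct = isinstance(value, Structure)

print(looks_like)
print("dict?", is_dict, "| ctypes Structure?", is_struct)
```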
I fixed the errors in your code and added some other improvements (with minimal changes).

*code00.py*:

```python
#!/usr/bin/env python

import ctypes as cts
import sys
from pprint import pprint as pp

import yaml


class SerializableStructure(cts.Structure):
    @classmethod
    def _as_dict(cls, value):
        # Recurse into arrays and nested structures; return anything else unchanged
        if isinstance(value, cts.Array):
            ret = [cls._as_dict(e) for e in value]
        elif hasattr(value, "as_dict"):
            ret = value.as_dict()
        else:
            ret = value
        return ret

    def __repr__(self) -> str:
        return str(self.as_dict())

    def __str__(self) -> str:
        values = ",\n".join(f"{name}={value['value']}" for name, value in self.as_dict().items())
        return f"<{self.__class__.__name__}: {values}>"

    def as_dict(self) -> dict:
        return {f[0]: {"value": self._as_dict(getattr(self, f[0])), "type": f[1].__name__}
                for f in self._fields_}


class Substruct_t(SerializableStructure):
    _pack_ = 1
    _fields_ = (
        ("app", cts.c_uint8, 4),
        ("showScale", cts.c_uint8, 2),
        ("showIdleTemp", cts.c_uint8, 2),
        ("type", cts.c_uint8),
    )


class EEStruct(SerializableStructure):
    _pack_ = 1
    _fields_ = (
        ("txStatusDelay", cts.c_uint8, 5),
        ("overrideHours", cts.c_uint8, 3),
        ("manualSet", cts.c_int16),
        ("tempOffset", cts.c_int8),
        ("substruct", Substruct_t),
        ("LogArr", (cts.c_uint8 * 6) * 3),
        ("frostTemp", cts.c_int8),
        ("fileVer", cts.c_uint8 * 4),
    )


def main(*argv):
    ees = EEStruct()
    marker = "----------"
    print("{:s} Original object {:s}".format(marker, marker))
    print(ees)
    d = ees.as_dict()
    print("\n{:s} Type: {:} {:s}".format(marker, type(d["substruct"]["value"]), marker))  # @TODO - cfati: Check for the old implementation
    print("\n{:s} Dictionary representation {:s}".format(marker, marker))
    pp(d)
    print("\n{:s} YAML representation {:s}".format(marker, marker))
    print(yaml.dump(d))


if __name__ == "__main__":
    print("Python {:s} {:03d}bit on {:s}\n".format(" ".join(elem.strip() for elem in sys.version.split("\n")),
                                                   64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    rc = main(*sys.argv[1:])
    print("\nDone.\n")
    sys.exit(rc)
```

Output:

    [cfati@CFATI-5510-0:e:\Work\Dev\StackExchange\StackOverflow\q076458298]> "e:\Work\Dev\VEnvs\py_pc064_03.10_test0\Scripts\python.exe" ./code00.py
    Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)] 064bit on win32

    ---------- Original object ----------
    <EEStruct: txStatusDelay=0,
    overrideHours=0,
    manualSet=0,
    tempOffset=0,
    substruct={'app': {'value': 0, 'type': 'c_ubyte'}, 'showScale': {'value': 0, 'type': 'c_ubyte'}, 'showIdleTemp': {'value': 0, 'type': 'c_ubyte'}, 'type': {'value': 0, 'type': 'c_ubyte'}},
    LogArr=[[0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]],
    frostTemp=0,
    fileVer=[0, 0, 0, 0]>

    ---------- Type: <class 'dict'> ----------

    ---------- Dictionary representation ----------
    {'LogArr': {'type': 'c_ubyte_Array_6_Array_3',
                'value': [[0, 0, 0, 0, 0, 0],
                          [0, 0, 0, 0, 0, 0],
                          [0, 0, 0, 0, 0, 0]]},
     'fileVer': {'type': 'c_ubyte_Array_4', 'value': [0, 0, 0, 0]},
     'frostTemp': {'type': 'c_byte', 'value': 0},
     'manualSet': {'type': 'c_short', 'value': 0},
     'overrideHours': {'type': 'c_ubyte', 'value': 0},
     'substruct': {'type': 'Substruct_t',
                   'value': {'app': {'type': 'c_ubyte', 'value': 0},
                             'showIdleTemp': {'type': 'c_ubyte', 'value': 0},
                             'showScale': {'type': 'c_ubyte', 'value': 0},
                             'type': {'type': 'c_ubyte', 'value': 0}}},
     'tempOffset': {'type': 'c_byte', 'value': 0},
     'txStatusDelay': {'type': 'c_ubyte', 'value': 0}}

    ---------- YAML representation ----------
    LogArr:
      type: c_ubyte_Array_6_Array_3
      value:
      - - 0
        - 0
        - 0
        - 0
        - 0
        - 0
      - - 0
        - 0
        - 0
        - 0
        - 0
        - 0
      - - 0
        - 0
        - 0
        - 0
        - 0
        - 0
    fileVer:
      type: c_ubyte_Array_4
      value:
      - 0
      - 0
      - 0
      - 0
    frostTemp:
      type: c_byte
      value: 0
    manualSet:
      type: c_short
      value: 0
    overrideHours:
      type: c_ubyte
      value: 0
    substruct:
      type: Substruct_t
      value:
        app:
          type: c_ubyte
          value: 0
        showIdleTemp:
          type: c_ubyte
          value: 0
        showScale:
          type: c_ubyte
          value: 0
        type:
          type: c_ubyte
          value: 0
    tempOffset:
      type: c_byte
      value: 0
    txStatusDelay:
      type: c_ubyte
      value: 0


    Done.

For more details on the same (or similar) topic, check:

huangapple
  • Published on June 12, 2023 at 23:48:35
  • Please keep this link when reposting: https://go.coder-hub.com/76458298.html