YAML representation of a dictionary from multi-level Ctype Structure gets a strange object
Question
TL;DR: I have a ctypes Structure with another Structure inside, which I convert (seemingly correctly) to a Python dictionary and then attempt to dump to YAML. However, the value of the inner Structure comes out wrong.
Python version: 3.10
PyYAML version: 6.0
Background: I am trying to make our internal configuration handling more user-friendly. Our configuration files are a serialized copy of the data in a C structure, and until now they were changed manually with a hex editor. My plan is to make this process more human-readable, in the short term by reading and writing YAML files.
I am using the ctypes `Structure` to serialize and deserialize the data, mainly because I have to support bitfields. The ctypes `Structure` can also be used as a template, without actual values in it.
I have simplified the code by removing irrelevant functions and shortening the structures.

    class substruct_t(My_Structure):
        _pack_ = 1
        _fields_ = [
            ("app", c_uint8, 4),
            ("showScale", c_uint8, 2),
            ("showIdleTemp", c_uint8, 2),
            ("type", c_uint8),
        ]

    class ee_struct(My_Structure):
        _pack_ = 1
        _fields_ = [
            ("txStatusDelay", c_uint8, 5),
            ("overrideHours", c_uint8, 3),
            ("manualSet", c_int16),
            ("tempOffset", c_int8),
            ("substruct", substruct_t),
            ("LogArr", (c_uint8*6)*3),
            ("frostTemp", c_int8),
            ("fileVer", c_uint8*4),
        ]

    class eeprom_t(Union):
        _fields_ = [("as_struct", ee_struct), ("as_bytes", c_uint8*29)]

        def __str__(self) -> str:
            return str(self.as_struct)

        def get_as_dict(self):
            return self.as_struct.as_dict()

        def get_as_bytes(self):
            return np.ndarray((29, ), 'b', self.as_bytes, order='C')

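The union above exposes the same memory both as fields and as raw bytes. As a minimal sketch of that serialize/deserialize round trip using only ctypes: `Flags` is a hypothetical, shortened stand-in for `substruct_t`, and `bytes()` plus `from_buffer_copy()` are used here instead of the NumPy view in `get_as_bytes`.

```python
import ctypes as cts


class Flags(cts.Structure):  # hypothetical stand-in for substruct_t
    _fields_ = [
        ("app", cts.c_uint8, 4),
        ("showScale", cts.c_uint8, 2),
        ("showIdleTemp", cts.c_uint8, 2),
        ("type", cts.c_uint8),
    ]


flags = Flags(app=2, showScale=0, showIdleTemp=1, type=2)

# Serialize: bytes() copies the raw memory of the structure.
raw = bytes(flags)

# Deserialize: rebuild an independent instance from those bytes.
restored = Flags.from_buffer_copy(raw)

print(len(raw), cts.sizeof(Flags))
print(restored.app, restored.showScale, restored.showIdleTemp, restored.type)
```

The exact bit layout inside the first byte depends on platform and compiler conventions, so the sketch only checks that the round trip preserves the field values.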
I have created a `My_Structure` class, which inherits from the ctypes `Structure` and allows different representations of the data, including dict. This seems to work well.

    # Child class of Structure with string and dictionary representation functions, unwrapping arrays
    class My_Structure(Structure):
        def __recursive_carray_get(self, value):
            # Necessary recursive function, if value is ctype array
            if hasattr(value, '__len__'):
                rtn = list()
                for i in range(value.__len__()):
                    rtn.append(self.__recursive_carray_get(value.__getitem__(i)))
            else:
                rtn = value
            return rtn

        def __handle_array_type__(self, type):
            # example unformatted type: <class '__main__.c_ubyte_Array_6_Array_3'>
            return StringBetween("'", "'", str(type)).split(".")[1]

        def __repr__(self) -> str:
            return str(self.as_dict())

        def __str__(self) -> str:
            values = ",\n".join(f"{name}={value['value']}" for name, value in self.as_dict().items())
            return f"<{self.__class__.__name__}: {values}>"

        def as_dict(self) -> dict:
            return {field[0]: {'value': self.__recursive_carray_get(getattr(self, field[0])), 'type': self.__handle_array_type__(field[1])}
                    for field in self._fields_}

However, when I dump this dict into a YAML file, the value of `substruct` within `ee_struct` is dumped badly, as a Python object. I do not understand why, since a type check shows it is still a dict.

    ### Dict representation of substructure:
    {'value': {'app': {'value': 2, 'type': 'c_ubyte'}, 'showScale': {'value': 0, 'type': 'c_ubyte'}, 'showIdleTemp': {'value': 1, 'type': 'c_ubyte'}, 'type': {'value': 2, 'type': 'c_ubyte'}}, 'type': 'substruct_t'}

    ### pyyaml dump of entire structure:
    txStatusDelay: {value: 8, type: c_ubyte}
    overrideHours: {value: 1, type: c_ubyte}
    manualSet: {value: 100, type: c_short}
    tempOffset: {value: 0, type: c_byte}
    substruct:
      value: !!python/object/apply:_ctypes._unpickle
      - !!python/name:__main__.substruct_t ''
      - !!python/tuple
        - {}
        - !!binary |
          QgI=
      type: substruct_t
    LogArr:
      value:
      - [1, 0, 0, 0, 0, 0]
      - [2, 0, 0, 0, 0, 0]
      - [3, 0, 0, 0, 0, 0]
      type: c_ubyte_Array_6_Array_3
    frostTemp: {value: 16, type: c_byte}
    fileVer:
      value: [65, 66, 104, 0]
      type: c_ubyte_Array_4

If I use a hardcoded dict with the exact same contents I get from `get_as_dict`, everything works.
Apparently, the dump functions don't get the same data as what gets printed from `get_as_dict`. Why is that, and how can I fix it?
What I tried:
My first idea was to implement a recursive function that returns a `dict` for internal structures (similar to what I did for arrays), but I was not sure where to start, as `substruct` is already reported as a `dict`, and using the string (hardcoded) representation works.
"How to export a Pydantic model instance as YAML with URL type as string" seemed like a good approach, but combining `Structure` and `YAMLObject` resulted in a metaclass conflict, which I was unable to resolve.
I tried to dump to JSON and to use ruamel.yaml; both throw an exception complaining about `substruct_t`.
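For the JSON attempt, the exception comes from the encoder meeting an object it cannot encode. One possible way around that, sketched under the assumption that plain field values are enough (no `type` entries), is the `default=` hook of `json.dumps`. The names `ctypes_default`, `Inner` and `Outer` are hypothetical and only serve the illustration.

```python
import ctypes as cts
import json


def ctypes_default(obj):
    """Fallback for json.dumps: called only for objects json cannot encode itself."""
    if isinstance(obj, cts.Structure):
        # Each _fields_ entry is (name, type) or (name, type, bit_width).
        return {field[0]: getattr(obj, field[0]) for field in obj._fields_}
    if isinstance(obj, cts.Array):
        return list(obj)
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


class Inner(cts.Structure):  # hypothetical stand-in for substruct_t
    _fields_ = [("app", cts.c_uint8, 4), ("type", cts.c_uint8)]


class Outer(cts.Structure):  # hypothetical stand-in for ee_struct
    _fields_ = [
        ("tempOffset", cts.c_int8),
        ("substruct", Inner),
        ("fileVer", cts.c_uint8 * 4),
    ]


outer = Outer()
outer.substruct.app = 2
outer.fileVer[0] = 65

# The hook is applied again to whatever it returns, so nesting is handled by json itself.
text = json.dumps(outer, default=ctypes_default)
print(text)
```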
"Combining Dumper class with string representer to get exact required YAML output" could be the right approach; however, it looks quite complicated, and I am hoping there is a simpler solution that I just overlooked.
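A custom representer does not have to be large. The following is a minimal sketch, not the approach from the linked post: it assumes PyYAML is installed and that emitting bare field values (without the `type` entries) is acceptable. `CtypesDumper`, `represent_structure`, `represent_array`, `Inner` and `Outer` are hypothetical names.

```python
import ctypes as cts

import yaml


class CtypesDumper(yaml.SafeDumper):
    """Hypothetical dumper subclass, so the global PyYAML dumpers stay untouched."""


def represent_structure(dumper, data):
    # Emit a structure as a plain mapping of field name -> value.
    return dumper.represent_dict({f[0]: getattr(data, f[0]) for f in data._fields_})


def represent_array(dumper, data):
    # Emit a ctypes array as a plain sequence; nested arrays pass through here again.
    return dumper.represent_list(list(data))


# Multi-representers also match subclasses, which every concrete struct/array type is.
CtypesDumper.add_multi_representer(cts.Structure, represent_structure)
CtypesDumper.add_multi_representer(cts.Array, represent_array)


class Inner(cts.Structure):  # hypothetical stand-in for substruct_t
    _fields_ = [("app", cts.c_uint8, 4), ("type", cts.c_uint8)]


class Outer(cts.Structure):  # hypothetical stand-in for ee_struct
    _fields_ = [
        ("tempOffset", cts.c_int8),
        ("substruct", Inner),
        ("LogArr", (cts.c_uint8 * 2) * 2),
    ]


outer = Outer()
outer.substruct.app = 2
text = yaml.dump(outer, Dumper=CtypesDumper, sort_keys=False)
print(text)
```

Because the dumper is based on `SafeDumper`, anything it does not know how to represent raises an error instead of silently producing a `!!python/object` tag.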
I just found a dirty fix, following these steps:
- convert the dict from `get_as_dict()` to a string
- replace all `'` characters with `"`
- use `json.loads()` on the string to create a new dict, and use that instead

It works, but it just underlines my question: why do the two dicts look different to the dumpers?
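A minimal sketch of why the string round trip changes the data, assuming a structure whose `__repr__` prints its dict form as `My_Structure` does. `Inner` is a hypothetical stand-in, and `ast.literal_eval` is used here in place of the quote replacement plus `json.loads()`, since it parses the Python literal directly.

```python
import ast
import ctypes as cts


class Inner(cts.Structure):  # hypothetical stand-in for substruct_t
    _fields_ = [("app", cts.c_uint8, 4), ("type", cts.c_uint8)]

    def as_dict(self) -> dict:
        return {f[0]: getattr(self, f[0]) for f in self._fields_}

    def __repr__(self) -> str:
        # Same trick as My_Structure: the instance prints exactly like its dict.
        return str(self.as_dict())


data = {"substruct": {"value": Inner(app=2, type=2), "type": "Inner"}}

# str() of a dict calls repr() on every value, so the instance is flattened to text here.
text = str(data)

# Parsing that text back can only yield built-in types.
rebuilt = ast.literal_eval(text)

print(type(data["substruct"]["value"]).__name__)
print(type(rebuilt["substruct"]["value"]).__name__)
```

The two dicts print identically, but only the rebuilt one consists purely of built-in types, which is what the dumpers care about.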
Answer 1
Score: 0
<details>
<summary>English:</summary>
Listing [\[Python.Docs\]: ctypes - A foreign function library for Python](https://docs.python.org/library/ctypes.html#module-ctypes).
The problem is that *\_\_recursive\_carray\_get* does exactly what its name suggests (it handles arrays, which is consistent).
**But**, when it comes to (sub) structures, it doesn't handle them (or it handles them as any basic type):
1. So, when the *ee\_struct* instance is serialized to a dictionary (by calling its *as\_dict* method), **the value corresponding to the *substruct* key is actually a *substruct\_t* instance**
2. Due to the fact that *\_\_repr\_\_* is overridden (to also use *as\_dict*), when printing the dictionary, **the *substruct\_t* instance is also displayed as a dictionary, masking the previous error**
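Since printing cannot reveal such an object, a type-based walk over the dictionary can. The following is only a diagnostic sketch; `find_non_plain` and `Inner` are hypothetical names, and the set of "plain" types is an assumption about what a safe YAML or JSON dump accepts.

```python
import ctypes as cts

PLAIN_SCALARS = (str, int, float, bool, type(None))


def find_non_plain(obj, path="root"):
    """Yield (path, type name) for every value that is not a built-in container or scalar."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from find_non_plain(value, f"{path}[{key!r}]")
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            yield from find_non_plain(value, f"{path}[{index}]")
    elif not isinstance(obj, PLAIN_SCALARS):
        yield path, type(obj).__name__


class Inner(cts.Structure):  # hypothetical stand-in for substruct_t
    _fields_ = [("app", cts.c_uint8, 4)]

    def __repr__(self) -> str:
        return str({"app": self.app})  # looks like a dict when printed


data = {"tempOffset": {"value": 0}, "substruct": {"value": Inner(app=2)}}
print(data)                          # the Inner instance is indistinguishable from a dict
print(list(find_non_plain(data)))    # the walk reports where it hides
```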
I fixed the errors in your code, and added some other improvements (with minimum changes).
*code00.py*:
```python
#!/usr/bin/env python

import ctypes as cts
import sys
from pprint import pprint as pp

import yaml


class SerializableStructure(cts.Structure):
    @classmethod
    def _as_dict(cls, value):
        if isinstance(value, cts.Array):
            ret = [cls._as_dict(e) for e in value]
        elif hasattr(value, "as_dict"):
            ret = value.as_dict()
        else:
            ret = value
        return ret

    def __repr__(self) -> str:
        return str(self.as_dict())

    def __str__(self) -> str:
        values = ",\n".join(f"{name}={value['value']}" for name, value in self.as_dict().items())
        return f"<{self.__class__.__name__}: {values}>"

    def as_dict(self) -> dict:
        return {f[0]: {"value": self._as_dict(getattr(self, f[0])), "type": f[1].__name__}
                for f in self._fields_}


class Substruct_t(SerializableStructure):
    _pack_ = 1
    _fields_ = (
        ("app", cts.c_uint8, 4),
        ("showScale", cts.c_uint8, 2),
        ("showIdleTemp", cts.c_uint8, 2),
        ("type", cts.c_uint8),
    )


class EEStruct(SerializableStructure):
    _pack_ = 1
    _fields_ = (
        ("txStatusDelay", cts.c_uint8, 5),
        ("overrideHours", cts.c_uint8, 3),
        ("manualSet", cts.c_int16),
        ("tempOffset", cts.c_int8),
        ("substruct", Substruct_t),
        ("LogArr", (cts.c_uint8 * 6) * 3),
        ("frostTemp", cts.c_int8),
        ("fileVer", cts.c_uint8 * 4),
    )


def main(*argv):
    ees = EEStruct()
    marker = "----------"
    print("{:s} Original object {:s}".format(marker, marker))
    print(ees)
    d = ees.as_dict()
    print("\n{:s} Type: {:} {:s}".format(marker, type(d["substruct"]["value"]), marker))  # @TODO - cfati: Check for the old implementation
    print("\n{:s} Dictionary representation {:s}".format(marker, marker))
    pp(d)
    print("\n{:s} YAML representation {:s}".format(marker, marker))
    print(yaml.dump(d))


if __name__ == "__main__":
    print("Python {:s} {:03d}bit on {:s}\n".format(" ".join(elem.strip() for elem in sys.version.split("\n")),
                                                   64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    rc = main(*sys.argv[1:])
    print("\nDone.\n")
    sys.exit(rc)
```
Output:
>     [cfati@CFATI-5510-0:e:\Work\Dev\StackExchange\StackOverflow\q076458298]> "e:\Work\Dev\VEnvs\py_pc064_03.10_test0\Scripts\python.exe" ./code00.py
>     Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)] 064bit on win32
>
>     ---------- Original object ----------
>     <EEStruct: txStatusDelay=0,
>     overrideHours=0,
>     manualSet=0,
>     tempOffset=0,
>     substruct={'app': {'value': 0, 'type': 'c_ubyte'}, 'showScale': {'value': 0, 'type': 'c_ubyte'}, 'showIdleTemp': {'value': 0, 'type': 'c_ubyte'}, 'type': {'value': 0, 'type': 'c_ubyte'}},
>     LogArr=[[0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]],
>     frostTemp=0,
>     fileVer=[0, 0, 0, 0]>
>
>     ---------- Type: <class 'dict'> ----------
>
>     ---------- Dictionary representation ----------
>     {'LogArr': {'type': 'c_ubyte_Array_6_Array_3',
>                 'value': [[0, 0, 0, 0, 0, 0],
>                           [0, 0, 0, 0, 0, 0],
>                           [0, 0, 0, 0, 0, 0]]},
>      'fileVer': {'type': 'c_ubyte_Array_4', 'value': [0, 0, 0, 0]},
>      'frostTemp': {'type': 'c_byte', 'value': 0},
>      'manualSet': {'type': 'c_short', 'value': 0},
>      'overrideHours': {'type': 'c_ubyte', 'value': 0},
>      'substruct': {'type': 'Substruct_t',
>                    'value': {'app': {'type': 'c_ubyte', 'value': 0},
>                              'showIdleTemp': {'type': 'c_ubyte', 'value': 0},
>                              'showScale': {'type': 'c_ubyte', 'value': 0},
>                              'type': {'type': 'c_ubyte', 'value': 0}}},
>      'tempOffset': {'type': 'c_byte', 'value': 0},
>      'txStatusDelay': {'type': 'c_ubyte', 'value': 0}}
>
>     ---------- YAML representation ----------
>     LogArr:
>       type: c_ubyte_Array_6_Array_3
>       value:
>       - - 0
>         - 0
>         - 0
>         - 0
>         - 0
>         - 0
>       - - 0
>         - 0
>         - 0
>         - 0
>         - 0
>         - 0
>       - - 0
>         - 0
>         - 0
>         - 0
>         - 0
>         - 0
>     fileVer:
>       type: c_ubyte_Array_4
>       value:
>       - 0
>       - 0
>       - 0
>       - 0
>     frostTemp:
>       type: c_byte
>       value: 0
>     manualSet:
>       type: c_short
>       value: 0
>     overrideHours:
>       type: c_ubyte
>       value: 0
>     substruct:
>       type: Substruct_t
>       value:
>         app:
>           type: c_ubyte
>           value: 0
>         showIdleTemp:
>           type: c_ubyte
>           value: 0
>         showScale:
>           type: c_ubyte
>           value: 0
>         type:
>           type: c_ubyte
>           value: 0
>     tempOffset:
>       type: c_byte
>       value: 0
>     txStatusDelay:
>       type: c_ubyte
>       value: 0
>
>
>     Done.
>
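
The code above covers the structure to dictionary to YAML direction. Since the question's background also mentions reading YAML files, here is a minimal sketch of the reverse step, assuming the same `{'value': ..., 'type': ...}` layout that `as_dict` produces. `load_dict` and `fill_array` are hypothetical helpers, not part of ctypes or PyYAML, and the `type` entries are ignored rather than validated.

```python
import ctypes as cts


def fill_array(array, values):
    """Copy a (possibly nested) list into a ctypes array of matching shape."""
    for index, value in enumerate(values):
        if isinstance(array[index], cts.Array):
            fill_array(array[index], value)  # inner arrays share memory with the parent
        else:
            array[index] = value


def load_dict(struct, data):
    """Copy values from the {'value': ..., 'type': ...} layout back into a structure."""
    for field in struct._fields_:
        name = field[0]
        value = data[name]["value"]
        target = getattr(struct, name)
        if isinstance(target, cts.Structure):
            load_dict(target, value)      # nested structure: recurse
        elif isinstance(target, cts.Array):
            fill_array(target, value)
        else:
            setattr(struct, name, value)  # plain integer or bitfield


class Inner(cts.Structure):  # hypothetical stand-in for Substruct_t
    _fields_ = [("app", cts.c_uint8, 4), ("type", cts.c_uint8)]


class Outer(cts.Structure):  # hypothetical stand-in for EEStruct
    _fields_ = [("substruct", Inner), ("LogArr", (cts.c_uint8 * 2) * 2)]


data = {
    "substruct": {"type": "Inner", "value": {"app": {"type": "c_ubyte", "value": 2},
                                             "type": {"type": "c_ubyte", "value": 7}}},
    "LogArr": {"type": "c_ubyte_Array_2_Array_2", "value": [[1, 2], [3, 4]]},
}

outer = Outer()
load_dict(outer, data)
print(outer.substruct.app, outer.substruct.type, [list(row) for row in outer.LogArr])
```

In practice `data` would come from `yaml.safe_load()` on the dumped file; it is written out literally here to keep the sketch free of third-party imports.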
For more details on the same (or similar) topic, check: