Condense a Pydantic model dict with custom encoder

Question

I have two `pydantic` models:

```python
import pydantic


class RoleBaseClass(pydantic.BaseModel):
    name: str = pydantic.Field(regex=r"^\w+$")


class SubRole(RoleBaseClass):
    ...


class Role(RoleBaseClass):
    subroles: list[SubRole] = pydantic.Field(default=[])
```

If I create an instance of the `Role` model like

```python
role1 = Role(name="role1", subroles=[SubRole(name="sub1")])
```

and run `role1.dict()`, the result is

```python
{"name": "role1", "subroles": [{"name": "sub1"}]}
```

However, I would like to get rid of the `name` field, so that calling `role1.dict()` returns the condensed form `{"role1": ["sub1"]}`. Is there a way to achieve this? I did not find an encoder for `dict` like the one available for the `json` method (`json_encoders`).

Could someone let me know the changes I need to make?

Answer 1

Score: 1

Pydantic models behave like ordinary Python classes, so you can simply declare your own `dict` method on the model to achieve this result:

```python
class Role(RoleBaseClass):
    """
    Role model.
    """
    subroles: list[SubRole] = pydantic.Field(default=[])

    def dict(self):
        # The original expression was cut off in the source; this list
        # comprehension is reconstructed from the output shown below.
        return {self.name: [subrole.name for subrole in self.subroles]}
```

With your example:

```python
# input
role1 = Role(name="role1", subroles=[SubRole(name="sub1")])
role1.dict()

# output
# {'role1': ['sub1']}

# empty subroles
role1 = Role(name="role1")
role1.dict()

# output
# {'role1': []}
```

Answer 2

Score: 1

I would suggest writing a separate model for this because you are describing a totally different schema.

Overriding the dict method or abusing the JSON encoder mechanisms to modify the schema that much seems like a bad idea. It makes the model's behavior confusing. The definition of the Role schema can be understood by looking at the class and its attributes/fields. But your dumped representation would look completely different.

In contrast, a separate model conveys very clearly that you will be dealing with a different schema. You can define its custom root type to be `dict[str, list[str]]` and set up a `pre=True` validator that will allow you to parse regular `Role` objects (or dictionaries thereof). If you want, you can additionally enhance that model's interface with methods like `__iter__` and `__getitem__` to make it behave more like a dictionary itself.

Here is an example:

```python
from collections.abc import Iterator

from pydantic import BaseModel, Field, validator


...  # RoleBaseClass and SubRole

class Role(RoleBaseClass):
    subroles: list[SubRole] = Field(default_factory=list)


class CondensedRole(BaseModel):
    __root__: dict[str, list[str]]

    def __iter__(self) -> Iterator[tuple[str, list[str]]]:  # type: ignore[override]
        return iter(self.__root__.items())

    def __getitem__(self, item: str) -> list[str]:
        return self.__root__[item]

    def __str__(self) -> str:
        return str(self.__root__)

    @validator("__root__", pre=True)
    def parse_regular_role(cls, v: object) -> object:
        # The two list comprehensions below were cut off in the source; they are
        # reconstructed from the surviving fragments and the demo output.
        if isinstance(v, Role):
            return {v.name: [s.name for s in v.subroles]}
        if not isinstance(v, dict):
            return v  # invalid
        name, subroles = v.get("name"), v.get("subroles", [])
        if name is None:
            return v  # invalid
        return {name: [s["name"] for s in subroles if "name" in s]}
```

Usage demo:

```python
role = Role(
    name="role",
    subroles=[SubRole(name="sub1"), SubRole(name="sub2")]
)
condensed1 = CondensedRole.parse_obj(role)
condensed2 = CondensedRole.parse_obj(role.dict())
assert condensed1 == condensed2
print(condensed1)          # {'role': ['sub1', 'sub2']}
print(condensed1["role"])  # ['sub1', 'sub2']
print(dict(condensed1))    # {'role': ['sub1', 'sub2']}
```
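
The dictionary branch of `parse_regular_role` is plain Python, so it can be tried in isolation. The following is a minimal sketch of that step as a standalone function. `condense_role_dict` is a hypothetical name used only for illustration and is not part of pydantic; it expects the kind of dictionary that `role.dict()` returns.

```python
from typing import Any


def condense_role_dict(data: dict[str, Any]) -> dict[str, list[str]]:
    """Turn {"name": ..., "subroles": [{"name": ...}, ...]} into {name: [subrole names]}.

    Hypothetical helper for illustration; it mirrors the dictionary
    branch of the validator, without any pydantic machinery.
    """
    name = data["name"]  # raises KeyError if the role has no name
    subroles = data.get("subroles", [])
    # Keep only the name of each subrole, skipping entries without one.
    return {name: [s["name"] for s in subroles if "name" in s]}


dumped = {"name": "role", "subroles": [{"name": "sub1"}, {"name": "sub2"}]}
print(condense_role_dict(dumped))  # {'role': ['sub1', 'sub2']}
```

Unlike the validator, this sketch raises on a missing `name` instead of passing the value through for pydantic to reject.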

Notice that defining `CondensedRole.__iter__` the way we did here allows us to simply pass a `CondensedRole` instance to the regular `dict` constructor and receive exactly the dictionary you want.
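
The reason is that the `dict` constructor accepts any iterable of key/value pairs, provided the object has no `keys` method (which would make it be treated as a mapping instead). A minimal sketch with a plain class in place of the pydantic model; `PairBox` is a hypothetical name used only for illustration:

```python
from collections.abc import Iterator


class PairBox:
    """Hypothetical stand-in for CondensedRole: wraps a dict and iterates over its items."""

    def __init__(self, data: dict[str, list[str]]) -> None:
        self._data = data

    def __iter__(self) -> Iterator[tuple[str, list[str]]]:
        # Yield (key, value) pairs rather than bare keys.
        return iter(self._data.items())


box = PairBox({"role": ["sub1", "sub2"]})
print(dict(box))  # {'role': ['sub1', 'sub2']}
```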

If you want, you can even add a convenience method to `Role` so that you can simply call that to receive a `CondensedRole` object from it. Something like this:

```python
from __future__ import annotations

...

class Role(RoleBaseClass):
    subroles: list[SubRole] = Field(default_factory=list)

    def condense(self) -> CondensedRole:
        return CondensedRole.parse_obj(self)
```

Extending the demo from above, you could then call `role.condense()` to get the same object as you would from `CondensedRole.parse_obj(role)`.

huangapple
  • Posted on June 15, 2023, 10:48:14
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76478760.html