June 1, 2023 16:16:34

Create dataclass instance from union type based on string literal

Question

I'm trying to strongly type our code base. A big part of the code handles events that come from external devices and forwards them to different handlers. These events all have a value attribute, but this value can have different types. The value type is determined by the event name: a temperature event always has an int value, and a register event always has a RegisterInfo as its value.

So I would like to map the event name to the value type. But we are struggling with implementation.

This setup comes the closest to what we want:

from dataclasses import dataclass
from typing import Any, Literal, TypeAlias

# RegisterInfo is defined elsewhere in our code base.

@dataclass
class EventBase:
    name: str
    value: Any
    value_type: str

@dataclass
class RegisterEvent(EventBase):
    value: RegisterInfo
    name: Literal["register"]
    value_type: Literal["RegisterInfo"] = "RegisterInfo"


@dataclass
class NumberEvent(EventBase):
    value: float | int
    name: Literal["temperature", "line_number"]
    value_type: Literal["number"] = "number"

@dataclass
class StringEvent(EventBase):
    value: str
    name: Literal["warning", "status"]
    value_type: Literal["string"] = "string"


Events: TypeAlias = RegisterEvent | NumberEvent | StringEvent

With this setup mypy will flag incorrect code like:

def handle_event(event: Events):
    if event.name == "temperature":
        event.value.upper()

(It sees that a temperature event should have value type int, and that doesn't have an upper() method)

But creating the events becomes ugly this way. I don't want a big if statement that maps each event name to a specific event class. We have lots of different event types, and this mapping info is already inside these classes.

Ideally I would like it to look like this:

def handle_device_message(message_info):
    event_name = message_info["event_name"]
    event_value = message_info["event_value"]

    event = Events(event_name, event_value)

Is a "one-liner" like this possible?

I feel like we are running into a wall here. Could it be that the code is architecturally wrong?

Answer 1

Score: 3

UPDATE: Using Pydantic `v2`

If you are willing to switch to Pydantic instead of dataclasses, you can define a discriminated union via typing.Annotated and use the TypeAdapter as a "universal" constructor that is able to discriminate between distinct Event subtypes based on the provided name string.

Here is what I would suggest:

from typing import Annotated, Any, Literal

from pydantic import BaseModel, Field, TypeAdapter


class EventBase(BaseModel):
    name: str
    value: Any


class NumberEvent(EventBase):
    name: Literal["temperature", "line_number"]
    value: float


class StringEvent(EventBase):
    name: Literal["warning", "status"]
    value: str


Event = TypeAdapter(Annotated[
    NumberEvent | StringEvent,
    Field(discriminator="name"),
])


event_temp = Event.validate_python({"name": "temperature", "value": 3.14})
event_status = Event.validate_python({"name": "status", "value": "spam"})

print(repr(event_temp))    # NumberEvent(name='temperature', value=3.14)
print(repr(event_status))  # StringEvent(name='status', value='spam')

An invalid name would of course cause a validation error, just like a completely wrong type for value (one that cannot be coerced). Example:

from pydantic import ValidationError

try:
    Event.validate_python({"name": "temperature", "value": "foo"})
except ValidationError as err:
    print(err.json(indent=4))

try:
    Event.validate_python({"name": "foo", "value": "bar"})
except ValidationError as err:
    print(err.json(indent=4))

Output:

[
    {
        "type": "float_parsing",
        "loc": [
            "temperature",
            "value"
        ],
        "msg": "Input should be a valid number, unable to parse string as a number",
        "input": "foo",
        "url": "https://errors.pydantic.dev/2.1/v/float_parsing"
    }
]

[
    {
        "type": "union_tag_invalid",
        "loc": [],
        "msg": "Input tag 'foo' found using 'name' does not match any of the expected tags: 'temperature', 'line_number', 'warning', 'status'",
        "input": {
            "name": "foo",
            "value": "bar"
        },
        "ctx": {
            "discriminator": "'name'",
            "tag": "foo",
            "expected_tags": "'temperature', 'line_number', 'warning', 'status'"
        },
        "url": "https://errors.pydantic.dev/2.1/v/union_tag_invalid"
    }
]
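
For comparison, if adding Pydantic is not an option, the name-based dispatch can be roughly approximated with plain dataclasses. This is only a sketch and not part of the Pydantic approach: EVENT_CLASSES and make_event are hypothetical names, the registry is derived from the Literal annotations with typing.get_type_hints and typing.get_args, and nothing validates or coerces value.

```python
from dataclasses import dataclass
from typing import Any, Literal, get_args, get_type_hints


@dataclass
class EventBase:
    name: str
    value: Any


@dataclass
class NumberEvent(EventBase):
    name: Literal["temperature", "line_number"]
    value: float


@dataclass
class StringEvent(EventBase):
    name: Literal["warning", "status"]
    value: str


# Hypothetical registry: each event name maps to the class whose
# `name` annotation lists it as an allowed literal.
EVENT_CLASSES: dict[str, type[EventBase]] = {
    event_name: cls
    for cls in (NumberEvent, StringEvent)
    for event_name in get_args(get_type_hints(cls)["name"])
}


def make_event(name: str, value: Any) -> EventBase:
    """Instantiate the class registered for `name`; `value` is not validated."""
    try:
        cls = EVENT_CLASSES[name]
    except KeyError:
        raise ValueError(f"unknown event name: {name!r}") from None
    return cls(name=name, value=value)
```

With this sketch, make_event("temperature", 21.5) returns a NumberEvent and an unknown name raises ValueError. The trade-off is that the declared return type is only EventBase, so a type checker cannot narrow value from the name at the call site, and a wrong value type passes silently.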

Original Answer: Using Pydantic `v1`

If you are willing to switch to Pydantic instead of dataclasses, you can define a discriminated union via typing.Annotated and use the parse_obj_as function as a "universal" constructor that is able to discriminate between distinct Event subtypes based on the provided name string.

Here is what I would suggest:

from typing import Annotated, Any, Literal

from pydantic import BaseModel, Field, parse_obj_as


class EventBase(BaseModel):
    name: str
    value: Any


class NumberEvent(EventBase):
    name: Literal["temperature", "line_number"]
    value: float


class StringEvent(EventBase):
    name: Literal["warning", "status"]
    value: str


Event = Annotated[
    NumberEvent | StringEvent,
    Field(discriminator="name"),
]


event_temp = parse_obj_as(Event, {"name": "temperature", "value": "3.14"})
event_status = parse_obj_as(Event, {"name": "status", "value": -10})

print(repr(event_temp))    # NumberEvent(name='temperature', value=3.14)
print(repr(event_status))  # StringEvent(name='status', value='-10')

In this usage demo I purposefully used the "wrong" types for the respective value fields to show that Pydantic will automatically try to coerce them to the right types, once it determines the correct model based on the provided name.

An invalid name would of course cause a validation error, just like a completely wrong type for value (one that cannot be coerced). Example:

from pydantic import ValidationError

try:
    parse_obj_as(Event, {"name": "temperature", "value": "foo"})
except ValidationError as err:
    print(err.json(indent=4))

try:
    parse_obj_as(Event, {"name": "foo", "value": "bar"})
except ValidationError as err:
    print(err.json(indent=4))

Output:

[
    {
        "loc": [
            "__root__",
            "NumberEvent",
            "value"
        ],
        "msg": "value is not a valid float",
        "type": "type_error.float"
    }
]

[
    {
        "loc": [
            "__root__"
        ],
        "msg": "No match for discriminator 'name' and value 'foo' (allowed values: 'temperature', 'line_number', 'warning', 'status')",
        "type": "value_error.discriminated_union.invalid_discriminator",
        "ctx": {
            "discriminator_key": "name",
            "discriminator_value": "foo",
            "allowed_values": "'temperature', 'line_number', 'warning', 'status'"
        }
    }
]

Side notes

An alias for a union of types like NumberEvent | StringEvent should still have a singular name, i.e. Event rather than Events, because semantically the annotation e: Event indicates that e should be an instance of one of those types, whereas e: Events would suggest that e is a collection of instances of those types.

Also, the union float | int is almost always equivalent to float, because type checkers by convention (PEP 484) accept an int wherever a float is expected.
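
A small illustration of why "almost always" is the right qualifier: the equivalence holds for static type checking, but not for runtime checks. The function name halve is made up for this sketch.

```python
def halve(x: float) -> float:
    """Annotated with float only; type checkers also accept int arguments."""
    return x / 2


# Static view: passing an int where float is annotated is accepted (PEP 484).
result = halve(3)

# Runtime view: int is still not a subclass of float.
int_is_float_at_runtime = isinstance(3, float)

print(result, int_is_float_at_runtime)  # 1.5 False
```

So the annotation can be shortened to float, but code that branches on isinstance(value, float) still needs to handle int separately.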
