How to infer type (for mypy & IDE) from a marshmallow schema?

Question

I have not asked a Python question in years! Exciting. This is largely an ecosystem question. Consider this snippet:

try:
    result = schema.load(data)
except marshmallow.ValidationError as exc:
    sys.exit(f'schema validation failed: {exc}')

with schema being a marshmallow.Schema.

From mypy's point of view, result is of type Any.

What options do I have so that result gets typed based on schema?

I have searched the web quite a bit, and my conclusion for now is that there is no such magic in the marshmallow ecosystem. Seemingly, there are solutions for the inverse (derive a marshmallow schema from e.g. a dataclass). But somehow I also suspect I am just missing something really obvious. But if there really is no solution for marshmallow: is the answer maybe to use e.g. pydantic to define the schema (and then get some beautiful magic for automatically inferred types that mypy can use)?

For inspiration: in the JavaScript/TypeScript ecosystem there is yup, whose quickstart docs show how to derive a type from a schema: type User = InferType<typeof userSchema>;


Answer 1

Score: 1

Initially I was not sure what exactly you meant in your question, but I think we cleared that up, so allow me to restate it for people who did not follow this issue [1].

Question

Say we have a model class that looks like this (taken from the marshmallow docs [2]):

class User:
    def __init__(self, name: str, email: str) -> None:
        self.name = name
        self.email = email

    def __repr__(self) -> str:
        return "<User(name={self.name!r})>".format(self=self)

The documentation section Deserializing to Objects [3] suggests the following setup for the schema:

from marshmallow import Schema, fields, post_load
# ... import User

class UserSchema(Schema):
    name = fields.Str()
    email = fields.Email()

    @post_load
    def make_user(self, data, **kwargs):
        return User(**data)

user_data = {"name": "Ronnie", "email": "ronnie@stones.com"}
schema = UserSchema()
result = schema.load(user_data)

If you were to add reveal_type(result) and run mypy over the code, it would give you Any as the type.

What can we do to have a static type checker infer result to be an instance of User instead?

Answer

As explained in this issue [1], there is no built-in typing functionality for this in the marshmallow core. Since deserialization can be hooked into via @post_load methods at any point, the output type cannot be determined statically. In its purest form (without such a hook) it would simply give us a dictionary. So the Any return makes sense.
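One blunt option follows directly from that: because the return type is Any, the caller can narrow it by hand. The sketch below uses json.loads as a stand-in for any loader annotated to return Any; the names UserDict and load_untyped are hypothetical and exist only for illustration. Note that typing.cast only informs the type checker and performs no runtime check, so it is only as trustworthy as the validation that ran before it.

```python
import json
from typing import Any, TypedDict, cast


class UserDict(TypedDict):
    """Hand-written shape of the validated data (hypothetical name)."""

    name: str
    email: str


def load_untyped(raw: str) -> Any:
    """Stand-in for a loader typed as returning Any, such as schema.load()."""
    return json.loads(raw)


raw = '{"name": "Ronnie", "email": "ronnie@stones.com"}'

# cast() changes only what the type checker sees; nothing is verified at runtime.
result = cast(UserDict, load_untyped(raw))
print(result["name"])
```

The obvious drawback is that the type and the schema are maintained separately and can drift apart.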

However, if we want to restrict ourselves to only ever deserializing to instances of a specific class, we can define a subclass of marshmallow.Schema and make it generic in terms of the model it deals with, which is what I suggested as a feature for marshmallow.
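To make the idea concrete, here is a simplified sketch of such a generic subclass. It is not the marshmallow-generic source: PlainSchema is a hypothetical stand-in for marshmallow.Schema (it does no validation), and ModelSchema is an illustrative name. The point is only that parameterizing the schema with the model lets load be annotated as returning that model.

```python
from typing import Any, Generic, TypeVar, get_args

T = TypeVar("T")


class PlainSchema:
    """Hypothetical stand-in for marshmallow.Schema (no validation here)."""

    def load(self, data: dict[str, Any]) -> Any:
        return dict(data)


class ModelSchema(PlainSchema, Generic[T]):
    """Schema tied to one model class; load() is annotated to return T."""

    model: type[T]

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        # Recover the type argument from a base such as ModelSchema[User].
        for base in getattr(cls, "__orig_bases__", ()):
            args = get_args(base)
            if args and not isinstance(args[0], TypeVar):
                cls.model = args[0]

    def load(self, data: dict[str, Any]) -> T:
        # Build the model instance from the loaded dictionary.
        return self.model(**super().load(data))


class User:
    def __init__(self, name: str, email: str) -> None:
        self.name = name
        self.email = email


class UserSchema(ModelSchema[User]):
    pass


user = UserSchema().load({"name": "Monty", "email": "monty@python.org"})
print(type(user).__name__, user.name)
```

With this shape, a type checker sees the result of UserSchema().load(...) as User rather than Any, because T is bound to User in the subclass declaration.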

Since my proposal to include that in the core was politely rejected, I wrote that extension to the marshmallow ecosystem myself: marshmallow-generic [4] (issues welcome!)

With the GenericSchema you can do the following:

from marshmallow_generic import GenericSchema, fields
# ... import User

class UserSchema(GenericSchema[User]):
    name = fields.Str()
    email = fields.Email()

user_data = {"name": "Monty", "email": "monty@python.org"}
schema = UserSchema()
single_user = schema.load(user_data)
print(single_user)  # <User(name='Monty')>

json_data = '''[
    {"name": "Monty", "email": "monty@python.org"},
    {"name": "Ronnie", "email": "ronnie@stones.com"}
]'''
multiple_users = schema.loads(json_data, many=True)
print(multiple_users)  # [<User(name='Monty')>, <User(name='Ronnie')>]

Adding reveal_type(single_user) and reveal_type(multiple_users) at the bottom and running that code through [mypy](https://mypy.readthedocs.io/en/stable/) would yield the following output:

```
note: Revealed type is "User"
note: Revealed type is "builtins.list[User]"
```

That way we get the typing support we wanted and there is not even any need to write an explicit `@post_load` hook. All we need is to pass our `User` model as the type argument to `GenericSchema` and it will auto-magically convert to `User`.
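For readers wondering how one method can be revealed as `User` in one call and `list[User]` in another, one general technique is to overload the method on a literal `many` flag. The sketch below illustrates that technique only and is not taken from the marshmallow-generic source; `TypedLoader` and its `factory` parameter are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Callable, Generic, Literal, TypeVar, overload

T = TypeVar("T")


@dataclass
class User:
    name: str
    email: str


class TypedLoader(Generic[T]):
    """Hypothetical loader whose return type depends on the `many` flag."""

    def __init__(self, factory: Callable[..., T]) -> None:
        self._factory = factory

    @overload
    def load(self, data: Any, *, many: Literal[False] = ...) -> T: ...

    @overload
    def load(self, data: Any, *, many: Literal[True]) -> list[T]: ...

    def load(self, data: Any, *, many: bool = False) -> Any:
        # One runtime implementation serves both overloads.
        if many:
            return [self._factory(**item) for item in data]
        return self._factory(**data)


loader = TypedLoader(User)
single_user = loader.load({"name": "Monty", "email": "monty@python.org"})
multiple_users = loader.load(
    [
        {"name": "Monty", "email": "monty@python.org"},
        {"name": "Ronnie", "email": "ronnie@stones.com"},
    ],
    many=True,
)
print(single_user, multiple_users)
```

The overloads only match when `many` is omitted or passed as a literal `True` or `False`; passing an arbitrary `bool` variable would need an additional overload.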

Something as general as the TypeScript way you mentioned is not (currently or in the foreseeable future) possible in Python because the Python type system (unlike that of TypeScript) is fundamentally _nominal_, not _structural_. But we can work with what we have.

[1]: https://github.com/marshmallow-code/marshmallow/issues/2108
[2]: https://marshmallow.readthedocs.io/en/stable/quickstart.html#declaring-schemas
[3]: https://marshmallow.readthedocs.io/en/stable/quickstart.html#deserializing-to-objects
[4]: https://github.com/daniil-berg/marshmallow-generic


huangapple
  • Posted on March 7, 2023 at 21:37:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75662696.html