How to infer type (for mypy & IDE) from a marshmallow schema?
Question
I have not asked a Python question in years! Exciting. This is largely an ecosystem question. Consider this snippet:
```python
try:
    result = schema.load(data)
except marshmallow.ValidationError as exc:
    sys.exit(f'schema validation failed: {exc}')
```
with `schema` being a `marshmallow.Schema`.

From `mypy`'s point of view, `result` is of type `Any`.

What options do I have so that `result` gets typed based on `schema`?
I have searched the web quite a bit, and my conclusion for now is that there is no such magic in the marshmallow ecosystem. Seemingly, there are solutions for the inverse (derive a marshmallow schema from e.g. a dataclass). But somehow I also suspect I am just missing something really obvious. But if there really is no solution for marshmallow: is the answer maybe to use e.g. pydantic to define the schema (and then get some beautiful magic for automatically inferred types that mypy can use)?
For inspiration: in the JavaScript/TypeScript ecosystem there is yup, whose quickstart docs show how to derive a type from a schema: `type User = InferType<typeof userSchema>;`
Answer 1
Score: 1
Initially I was not sure what exactly you meant in your question, but I think we cleared that up, so allow me to specify it a bit for the people that did not follow [this issue][1].
## Question

Say we have a model class that looks like this (taken from the [`marshmallow` docs][2]):
```python
class User:
    def __init__(self, name: str, email: str) -> None:
        self.name = name
        self.email = email

    def __repr__(self) -> str:
        return "<User(name={self.name!r})>".format(self=self)
```
The documentation section [Deserializing to Objects][3] suggests the following setup for the schema:
```python
from marshmallow import Schema, fields, post_load
# ... import User


class UserSchema(Schema):
    name = fields.Str()
    email = fields.Email()

    @post_load
    def make_user(self, data, **kwargs):
        return User(**data)


user_data = {"name": "Ronnie", "email": "ronnie@stones.com"}
schema = UserSchema()
result = schema.load(user_data)
```
If you were to add `reveal_type(result)` and run `mypy` over the code, it would give you `Any` as the type.

What can we do to have a static type checker infer `result` to be an instance of `User` instead?
## Answer

As explained in [this issue][1], there is no built-in typing functionality for this in the `marshmallow` core. Since deserialization can be hooked into via `@post_load` methods at any point, the output type is undetermined. In its purest form (without such a hook) it would simply give us a dictionary. So the `Any` return makes sense.
However, if we want to restrict ourselves to only ever deserialize to specific class instances, we can define a subclass of `marshmallow.Schema` and make it generic in terms of the model it deals with, which is what I suggested as a feature for `marshmallow`.
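
To make the "generic in terms of the model" idea concrete, here is a minimal standard-library sketch of the pattern. It deliberately does not use `marshmallow` at all: `TypedLoader` and `UserLoader` are hypothetical names invented for this illustration, there is no validation, and this is not necessarily how any real library implements it. It only shows how a base class can pick up its type argument at class-creation time and use it both for instantiation and for the static return type of `load`.

```python
from typing import Any, Generic, TypeVar, get_args

M = TypeVar("M")


class TypedLoader(Generic[M]):
    """Hypothetical stand-in for a schema base class; not part of marshmallow."""

    model: type

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        # Look for `TypedLoader[SomeModel]` among the bases the subclass declared.
        for base in getattr(cls, "__orig_bases__", ()):
            args = get_args(base)
            if args and isinstance(args[0], type):
                cls.model = args[0]

    def load(self, data: dict[str, Any]) -> M:
        # A real schema would validate `data` first; this sketch only instantiates.
        return self.model(**data)


class User:
    def __init__(self, name: str, email: str) -> None:
        self.name = name
        self.email = email


class UserLoader(TypedLoader[User]):
    pass


# Because `load` is annotated as returning `M`, a type checker such as mypy
# should infer `User` for `user` here rather than `Any`.
user = UserLoader().load({"name": "Monty", "email": "monty@python.org"})
```

The design choice is that the model class is stated exactly once, as the type argument, so the runtime behaviour and the static type cannot drift apart.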
Since my proposal to include that in the core was politely rejected, I wrote that extension to the `marshmallow` ecosystem myself: [`marshmallow-generic`][4] (issues welcome!)

With the `GenericSchema` you can do the following:
```python
from marshmallow_generic import GenericSchema, fields
# ... import User


class UserSchema(GenericSchema[User]):
    name = fields.Str()
    email = fields.Email()


user_data = {"name": "Monty", "email": "monty@python.org"}
schema = UserSchema()
single_user = schema.load(user_data)
print(single_user)  # <User(name='Monty')>

json_data = '''[
    {"name": "Monty", "email": "monty@python.org"},
    {"name": "Ronnie", "email": "ronnie@stones.com"}
]'''
multiple_users = schema.loads(json_data, many=True)
print(multiple_users)  # [<User(name='Monty')>, <User(name='Ronnie')>]
```
Adding `reveal_type(single_user)` and `reveal_type(multiple_users)` at the bottom and running that code through [mypy](https://mypy.readthedocs.io/en/stable/) would yield the following output:

```
note: Revealed type is "User"
note: Revealed type is "builtins.list[User]"
```
That way we get the typing support we wanted and there is not even any need to write an explicit `@post_load` hook. All we need is to pass our `User` model as the type argument to `GenericSchema` and it will auto-magically convert to `User`.
Something as general as the TypeScript way you mentioned is not (currently or in the foreseeable future) possible in Python because the Python type system (unlike that of TypeScript) is fundamentally _nominal_, not _structural_. But we can work with what we have.
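
One way of working with what we have, for the case where a schema has no `@post_load` hook and simply returns a dictionary, is to describe the expected shape by hand and tell the type checker to assume it. The sketch below is only an illustration under stated assumptions: `load_untyped` is a hypothetical stand-in for `schema.load` (so the snippet runs without `marshmallow` installed), and `UserDict` is a hand-written mirror of the schema fields that nothing keeps in sync automatically.

```python
from typing import Any, TypedDict, cast


class UserDict(TypedDict):
    # Hand-written mirror of the schema's fields; must be updated manually.
    name: str
    email: str


def load_untyped(data: dict[str, Any]) -> Any:
    """Hypothetical stand-in for `schema.load`, which is likewise typed as `Any`."""
    return dict(data)


raw = load_untyped({"name": "Monty", "email": "monty@python.org"})
# `cast` performs no runtime check; it only tells the type checker what to assume.
user = cast(UserDict, raw)
print(user["name"])
```

Unlike the generic schema approach, the type here is declared separately from the schema, so the two can silently diverge.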
[1]: https://github.com/marshmallow-code/marshmallow/issues/2108
[2]: https://marshmallow.readthedocs.io/en/stable/quickstart.html#declaring-schemas
[3]: https://marshmallow.readthedocs.io/en/stable/quickstart.html#deserializing-to-objects
[4]: https://github.com/daniil-berg/marshmallow-generic