isinstance-based custom validator in pydantic for custom classes from third-party libraries
Question
In my custom package for work, I want to validate inputs using `pydantic`. However, most of my functions take inputs that are not of native types, but instances of classes from other libraries, e.g. `pandas.DataFrame` or `sqlalchemy.Engine`. Using these as type hints and adding the `pydantic.validate_arguments` decorator fails.

Let's suppose the input of my function should be of type `CustomClass` from the library `CustomPackage`.
```python
import CustomPackage
import pydantic


@pydantic.validate_arguments
def custom_function(custom_argument: CustomPackage.CustomClass) -> None:
    pass
```
This will lead to the following error:
> RuntimeError: no validator found for <class 'CustomPackage.CustomClass'>, see arbitrary_types_allowed in Config

To solve this, I can use `@pydantic.validate_arguments(config={"arbitrary_types_allowed": True})` instead, which will allow anything. This is not my intention, so I followed the custom type section in the documentation and created this:
```python
import collections.abc
import typing

import CustomPackage


class CustomClassWithValidator(CustomPackage.CustomClass):
    @classmethod
    def __get_validators__(cls: typing.Type["CustomClassWithValidator"]) -> collections.abc.Iterable[collections.abc.Callable]:
        yield cls.validate_custom_class

    @classmethod
    def validate_custom_class(cls: typing.Type["CustomClassWithValidator"], passed_value: typing.Any) -> CustomPackage.CustomClass:
        if isinstance(passed_value, CustomPackage.CustomClass):
            return passed_value
        raise ValueError
```
After this, the following works fine:
```python
@pydantic.validate_arguments
def custom_function(custom_argument: CustomClassWithValidator) -> None:
    pass
```
But I have quite a few third-party dependencies, and each has lots of custom classes that I am using. Creating an almost identical class for every single one of them, as above, will work, but that does not seem optimal. Is there any functionality in `pydantic` to make this less repetitive?
Answer 1
Score: 1
I am not aware of any built-in Pydantic way in version 1 to do this other than adding the `__get_validators__` generator function to your types, as you already found.

I think one of the main reasons for this is that you usually want field types to be not just validated as being of the correct type, but also parsed or serialized/deserialized. In most areas where Pydantic is applied, you instantiate Pydantic models from Python primitives, from objects of built-in types, or even from JSON text.

So Pydantic not only provides default validators for the most common types, but also sensible and somewhat flexible initialization and (de-)serialization functions for those types.

The implication then is that if you want to use some custom/foreign type, you are responsible for telling Pydantic how to instantiate and validate it.
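To illustrate the point about parsing, here is a minimal sketch of a pydantic v1-style custom type whose validator coerces input instead of only checking its type. `Temperature` and its `parse` method are hypothetical names invented for this example, and the last lines call the validator by hand, which only approximates what pydantic v1 does with the callables yielded by `__get_validators__`.

```python
from collections.abc import Callable, Iterator
from typing import Any


class Temperature:
    """Hypothetical class, used only for illustration."""

    def __init__(self, kelvin: float) -> None:
        self.kelvin = kelvin

    @classmethod
    def __get_validators__(cls) -> Iterator[Callable[..., Any]]:
        # pydantic v1 looks for this hook and runs each yielded callable in order.
        yield cls.parse

    @classmethod
    def parse(cls, v: Any) -> "Temperature":
        # Accept an existing instance unchanged ...
        if isinstance(v, cls):
            return v
        # ... but also build one from a plain number, e.g. one read from JSON.
        if isinstance(v, (int, float)) and not isinstance(v, bool):
            return cls(float(v))
        raise TypeError(f"Cannot interpret {v!r} as {cls.__name__}")


# Drive the hook manually (assumption: no pydantic import, for demonstration only).
validator = next(iter(Temperature.__get_validators__()))
parsed = validator(293.15)
print(type(parsed).__name__, parsed.kelvin)
```

A pure `isinstance` validator is the degenerate case of this pattern: it keeps the first branch and drops the coercion.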
## Workarounds
### Option A: Simple mix-in
If all you want is to have a blanket "dumb" validator simply checking `isinstance` for any given type, you can just re-write what you already have as an agnostic mix-in class to minimize repetition:
```python
from collections.abc import Callable, Iterator
from typing import Any, Self  # `Self` requires Python 3.11+


class ValidatorMixin:
    @classmethod
    def __get_validators__(cls) -> Iterator[Callable[..., Any]]:
        yield cls.__validate__

    @classmethod
    def __validate__(cls, v: object) -> Self:
        if isinstance(v, cls):
            return v
        raise TypeError(f"Not an instance of {cls.__name__}: {v}")
```
Then you can simply inherit from that together with any specific class you want:
```python
from pydantic import validate_arguments

# ... import ValidatorMixin


class Bar:  # from foo
    ...


class Eggs:  # from spam
    ...


class MyBar(ValidatorMixin, Bar):
    pass


class MyEggs(ValidatorMixin, Eggs):
    pass


@validate_arguments
def f(bar: MyBar, eggs: MyEggs) -> None:
    print(bar, eggs)
```
The problem with this is that your arguments must now be of the types `MyBar` and `MyEggs`; plain `Bar` and `Eggs` instances are rejected.
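The limitation can be seen without pydantic at all. The following sketch repeats the mix-in (annotated with `Any` instead of `Self` so that it also runs before Python 3.11) and calls the validator directly; `is_accepted` is a hypothetical helper added only for the demonstration.

```python
from collections.abc import Callable, Iterator
from typing import Any


class ValidatorMixin:
    @classmethod
    def __get_validators__(cls) -> Iterator[Callable[..., Any]]:
        yield cls.__validate__

    @classmethod
    def __validate__(cls, v: object) -> Any:
        if isinstance(v, cls):
            return v
        raise TypeError(f"Not an instance of {cls.__name__}: {v}")


class Bar:  # stand-in for a third-party class
    ...


class MyBar(ValidatorMixin, Bar):
    pass


# Fetch the single validator that the mix-in yields for MyBar.
validate = next(iter(MyBar.__get_validators__()))


def is_accepted(value: object) -> bool:
    """Hypothetical helper: report whether the validator lets `value` through."""
    try:
        validate(value)
    except TypeError:
        return False
    return True


print(is_accepted(MyBar()))  # True: an instance of the subclass passes
print(is_accepted(Bar()))    # False: a plain Bar is not a MyBar
```

Since third-party code typically hands you plain `Bar` objects, this check is stricter than what the question asks for.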
### Option B: Generic mix-in
To allow specifying which class exactly to validate against, while keeping repetition to a minimum, we need to get creative. An approach I would suggest is making the `ValidatorMixin` generic in terms of what class to check against. There is a bit of magic you can do to then automatically extract the type argument passed to it and validate against that: (see here for details)
```python
from collections.abc import Callable, Iterator
from typing import Any, Generic, TypeVar, get_args, get_origin

T = TypeVar("T")


class ValidatorMixin(Generic[T]):
    _type_arg: type[T] | None = None

    @classmethod
    def _get_type_arg(cls) -> type[T]:
        if cls._type_arg is not None:
            return cls._type_arg
        raise AttributeError(f"{cls.__name__} is generic; type arg unspecified")

    @classmethod
    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        for base in cls.__orig_bases__:  # type: ignore[attr-defined]
            origin = get_origin(base)
            if origin is None or not issubclass(origin, ValidatorMixin):
                continue
            type_arg = get_args(base)[0]
            if not isinstance(type_arg, TypeVar):
                cls._type_arg = type_arg
                return

    @classmethod
    def __get_validators__(cls) -> Iterator[Callable[..., Any]]:
        yield cls.__validate__

    @classmethod
    def __validate__(cls, v: object) -> T:
        if isinstance(v, cls._get_type_arg()):
            return v
        raise TypeError(f"Not an instance of {cls.__name__}: {v}")
```
Now you can use it like this:
```python
from pydantic import validate_arguments

# ... import ValidatorMixin


class Bar:  # from foo
    ...


class Eggs:  # from spam
    ...


class MyClass(ValidatorMixin[Eggs], Bar, Eggs):
    pass


@validate_arguments
def f(obj: MyClass) -> None:
    print(obj)


f(Eggs())
```
As you can see, this now also allows you to use multiple inheritance, yet still specify exactly what to validate against. That `f` call in the last line will therefore pass validation.

There is however still one flaw with this. Static type checkers will complain about that call because `Eggs` is not a subtype of `MyClass` (it is the other way around), so that call would be seen as an error.
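The type-argument extraction that the generic mix-in in Option B relies on can be looked at in isolation. This is a small stdlib-only sketch; `Base` and `Child` are hypothetical names, and `Eggs` is a stand-in class as before.

```python
from typing import Generic, TypeVar, get_args, get_origin

T = TypeVar("T")


class Base(Generic[T]):
    pass


class Eggs:  # stand-in for a third-party class
    ...


class Child(Base[Eggs]):
    pass


# __orig_bases__ keeps the parameterized form `Base[Eggs]`,
# whereas __bases__ only holds the plain class `Base`.
orig_base = Child.__orig_bases__[0]

print(get_origin(orig_base) is Base)   # True
print(get_args(orig_base) == (Eggs,))  # True
print(Child.__bases__ == (Base,))      # True
```

This is why the mix-in iterates over `__orig_bases__` in `__init_subclass__`: by the time the class exists, `__bases__` no longer carries the type argument.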
### Option C: Monkey-patch the class itself
If that bothers you, the only other reasonable alternative in my opinion would be to just monkey-patch the classes you want to use instead of subclassing and mixing. Something like this:
```python
from collections.abc import Callable, Iterable, Iterator
from typing import Any, TypeVar

T = TypeVar("T")


def add_validation(
    cls: type[T],
    validators: Iterable[Callable[..., Any]] = (),
    force_patch: bool = False,
) -> type[T]:
    method_name = "__get_validators__"
    if not force_patch and hasattr(cls, method_name):
        raise AttributeError(f"{cls.__name__} already has `{method_name}`")
    if not validators:
        def __validate__(v: object) -> T:
            if isinstance(v, cls):
                return v
            raise TypeError(f"Not an instance of {cls.__name__}: {v}")
        validators = (__validate__, )

    def __get_validators__(_cls: type) -> Iterator[Callable[..., Any]]:
        yield from validators

    setattr(cls, method_name, classmethod(__get_validators__))
    return cls
```
Now this can be used both as a simple function for monkey-patching some third-party class and as a decorator for your own classes. (You are also free to add other validation functions via the optional `validators` argument.)

Usage:
```python
from pydantic import validate_arguments

# ... import add_validation


class Bar:  # from foo
    ...


add_validation(Bar)


@add_validation
class SomeCustomClass:
    ...


@validate_arguments
def f(bar: Bar, obj: SomeCustomClass) -> None:
    print(bar, obj)


f(Bar(), SomeCustomClass())
```
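Because the question involves many third-party classes, the patching can also be done in a loop. The sketch below uses a trimmed-down, hypothetical helper named `add_isinstance_validation` (an `isinstance` check only, without the `validators` and `force_patch` options) and stand-in classes, so it runs without pydantic. It also shows one limitation I would expect to matter in practice: built-in types such as `int` do not accept new attributes.

```python
from collections.abc import Callable, Iterator
from typing import Any


def add_isinstance_validation(cls: type) -> type:
    """Hypothetical, trimmed-down variant of `add_validation`."""

    def __validate__(v: object) -> Any:
        if isinstance(v, cls):
            return v
        raise TypeError(f"Not an instance of {cls.__name__}: {v}")

    def __get_validators__(_cls: type) -> Iterator[Callable[..., Any]]:
        yield __validate__

    setattr(cls, "__get_validators__", classmethod(__get_validators__))
    return cls


class Bar:  # stand-in for a class from one third-party package
    ...


class Eggs:  # stand-in for a class from another third-party package
    ...


# One loop replaces one hand-written wrapper class per third-party type.
for third_party_cls in (Bar, Eggs):
    add_isinstance_validation(third_party_cls)

print(hasattr(Bar, "__get_validators__"), hasattr(Eggs, "__get_validators__"))

# Built-in types such as `int` are immutable, so they cannot be patched this way.
try:
    add_isinstance_validation(int)
except TypeError as exc:
    print("cannot patch int:", exc)
```

Unlike Options A and B, the annotations in your functions stay `Bar` and `Eggs`, so static type checkers see the real classes.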
If you want, you can make a more sophisticated version of that decorator allowing usage with or without additional arguments:
```python
from collections.abc import Callable, Iterable, Iterator
from typing import Any, TypeVar, overload

T = TypeVar("T")


@overload
def add_validation(
    cls: None = None,
    *,
    validators: Iterable[Callable[..., Any]] = (),
    force_patch: bool = False,
) -> Callable[[type[T]], type[T]]: ...


@overload
def add_validation(cls: type[T]) -> type[T]: ...


def add_validation(
    cls: type[T] | None = None,
    *,
    validators: Iterable[Callable[..., Any]] = (),
    force_patch: bool = False,
) -> type[T] | Callable[[type[T]], type[T]]:
    def decorator(cls_: type[T]) -> type[T]:
        nonlocal validators
        method_name = "__get_validators__"
        if not force_patch and hasattr(cls_, method_name):
            raise AttributeError(f"{cls_.__name__} already has `{method_name}`")
        if not validators:
            def __validate__(v: object) -> T:
                if isinstance(v, cls_):
                    return v
                raise TypeError(f"Not an instance of {cls_.__name__}: {v}")
            validators = (__validate__, )

        def __get_validators__(_cls: type) -> Iterator[Callable[..., Any]]:
            yield from validators

        setattr(cls_, method_name, classmethod(__get_validators__))
        return cls_

    return decorator if cls is None else decorator(cls)
```
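A possible usage of both call forms is sketched below. To keep the block self-contained, it restates the decorator in condensed form (overloads omitted, loose annotations); `Amount`, `Plain`, `is_amount`, `must_be_positive` and `run_validators` are hypothetical names. `run_validators` only approximates pydantic v1, which applies the yielded validators in order. Note that passing `validators` replaces the default `isinstance` check rather than extending it, so the sketch repeats that check explicitly.

```python
from collections.abc import Callable, Iterable, Iterator
from typing import Any, Optional


def add_validation(
    cls: Optional[type] = None,
    *,
    validators: Iterable[Callable[..., Any]] = (),
    force_patch: bool = False,
) -> Any:
    """Condensed restatement of the decorator (overloads omitted)."""

    def decorator(cls_: type) -> type:
        method_name = "__get_validators__"
        if not force_patch and hasattr(cls_, method_name):
            raise AttributeError(f"{cls_.__name__} already has `{method_name}`")
        funcs = tuple(validators)
        if not funcs:
            def __validate__(v: object) -> Any:
                if isinstance(v, cls_):
                    return v
                raise TypeError(f"Not an instance of {cls_.__name__}: {v}")
            funcs = (__validate__,)

        def __get_validators__(_cls: type) -> Iterator[Callable[..., Any]]:
            yield from funcs

        setattr(cls_, method_name, classmethod(__get_validators__))
        return cls_

    return decorator if cls is None else decorator(cls)


def is_amount(v: object) -> "Amount":
    # Explicit validators replace the default isinstance check, so repeat it here.
    if isinstance(v, Amount):
        return v
    raise TypeError(f"Not an instance of Amount: {v}")


def must_be_positive(v: "Amount") -> "Amount":
    if v.value <= 0:
        raise ValueError("value must be positive")
    return v


@add_validation(validators=[is_amount, must_be_positive])
class Amount:  # hypothetical class of your own
    def __init__(self, value: float) -> None:
        self.value = value


@add_validation  # bare form: falls back to the isinstance check
class Plain:
    ...


def run_validators(cls: type, value: object) -> object:
    # Rough stand-in for pydantic v1: apply the yielded validators in order.
    for validator in cls.__get_validators__():
        value = validator(value)
    return value


print(run_validators(Amount, Amount(5)).value)
```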