isinstance-based custom validator in Pydantic for custom classes from third-party libraries

Question

In my custom package for work, I want to validate inputs using pydantic. However, most of my functions take inputs that are not of native types, but instances of classes from other libraries, e.g. pandas.DataFrame or sqlalchemy.Engine. Using these as type hints and adding the pydantic.validate_arguments decorator fails.

Let's suppose the input of my function should be of type CustomClass from library CustomPackage.

import CustomPackage
import pydantic


@pydantic.validate_arguments
def custom_function(custom_argument: CustomPackage.CustomClass) -> None:
    pass

This will lead to the following error:

> RuntimeError: no validator found for <class 'CustomPackage.CustomClass'>, see arbitrary_types_allowed in Config

To solve this, I can use @pydantic.validate_arguments(config={"arbitrary_types_allowed": True}) instead, which will allow anything. This is not my intention, so I followed the custom type section in documentation, and created this:

import collections.abc
import typing

import CustomPackage


class CustomClassWithValidator(CustomPackage.CustomClass):
    @classmethod
    def __get_validators__(cls: typing.Type["CustomClassWithValidator"]) -> collections.abc.Iterable[collections.abc.Callable]:
        yield cls.validate_custom_class

    @classmethod
    def validate_custom_class(cls: typing.Type["CustomClassWithValidator"], passed_value: typing.Any) -> CustomPackage.CustomClass:
        if isinstance(passed_value, CustomPackage.CustomClass):
            return passed_value

        raise ValueError

After this, the following works fine:

@pydantic.validate_arguments
def custom_function(custom_argument: CustomClassWithValidator) -> None:
    pass

But I have quite a few third-party dependencies, and each has lots of custom classes that I am using. Creating an almost identical class for every single one of them, as above, will work, but that does not seem optimal. Is there any functionality in pydantic to make this less repetitive?

Answer 1

Score: 1

I am not aware of any built-in Pydantic way in version 1 to do this, other than adding the __get_validators__ generator function to your types, which you already found.

I think one of the main reasons for this is that you usually want field types not just to be validated as being of the correct type, but also to be parsed or (de-)serialized. In most areas where Pydantic is applied, you instantiate Pydantic models from Python primitives, from objects of built-in types, or even from JSON text.

So Pydantic not only provides default validators for the most common types, but also sensible and somewhat flexible initialization and (de-)serialization functions for those types.

The implication then is that if you want to use some custom/foreign type, you are responsible for telling Pydantic how to instantiate and validate it.
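To make the point about parsing concrete, here is a minimal sketch that runs without Pydantic installed. `Point` and `run_validators` are hypothetical names introduced only for illustration. `run_validators` merely imitates the library, under the assumption that Pydantic v1 calls each callable yielded by `__get_validators__` in order and hands the result of one to the next.

```python
from collections.abc import Callable, Iterator
from typing import Any


class Point:
    """Hypothetical class that can parse raw input into an instance of itself."""

    def __init__(self, x: float, y: float) -> None:
        self.x = x
        self.y = y

    @classmethod
    def __get_validators__(cls) -> Iterator[Callable[..., Any]]:
        # First turn raw input into an instance, then check the result.
        yield cls.parse
        yield cls.check_type

    @classmethod
    def parse(cls, v: Any) -> Any:
        if isinstance(v, dict):
            return cls(**v)
        if isinstance(v, (tuple, list)) and len(v) == 2:
            return cls(*v)
        return v  # leave anything else for the type check to reject

    @classmethod
    def check_type(cls, v: Any) -> "Point":
        if isinstance(v, cls):
            return v
        raise TypeError(f"Not an instance of {cls.__name__}: {v!r}")


def run_validators(tp: Any, value: Any) -> Any:
    """Stand-in for Pydantic (assumption): apply the yielded validators in order."""
    for validator in tp.__get_validators__():
        value = validator(value)
    return value


point = run_validators(Point, {"x": 1.0, "y": 2.0})
print(point.x, point.y)
```

A validator that only checks isinstance is the special case in which the parsing step is left out.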

Workarounds

Option A: Simple mix-in

If all you want is to have a blanket "dumb" validator simply checking isinstance for any given type, you can just re-write what you already have as an agnostic mix-in class to minimize repetition:

from collections.abc import Callable, Iterator
from typing import Any, Self


class ValidatorMixin:
    @classmethod
    def __get_validators__(cls) -> Iterator[Callable[..., Any]]:
        yield cls.__validate__

    @classmethod
    def __validate__(cls, v: object) -> Self:
        if isinstance(v, cls):
            return v
        raise TypeError(f"Not an instance of {cls.__name__}: {v}")

Then you can simply inherit from that together with any specific class you want:

from pydantic import validate_arguments

# ... import ValidatorMixin


class Bar:  # from foo
    ...

class Eggs:  # from spam
    ...


class MyBar(ValidatorMixin, Bar):
    pass

class MyEggs(ValidatorMixin, Eggs):
    pass


@validate_arguments
def f(bar: MyBar, eggs: MyEggs) -> None:
    print(bar, eggs)

The problem with that is that your arguments must now be of the types MyBar and MyEggs and cannot simply be of type Bar and Eggs.
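This limitation can be reproduced without Pydantic by calling the validator directly. The sketch below repeats the mix-in (annotated with `Any` instead of `Self`, so it also runs on Python versions before 3.11) and uses `Bar` as a stand-in for a third-party class.

```python
from collections.abc import Callable, Iterator
from typing import Any


class ValidatorMixin:
    @classmethod
    def __get_validators__(cls) -> Iterator[Callable[..., Any]]:
        yield cls.__validate__

    @classmethod
    def __validate__(cls, v: object) -> Any:
        if isinstance(v, cls):
            return v
        raise TypeError(f"Not an instance of {cls.__name__}: {v}")


class Bar:  # stands in for a third-party class
    ...


class MyBar(ValidatorMixin, Bar):
    pass


# An instance of the subclass passes the check ...
my_bar = MyBar()
assert MyBar.__validate__(my_bar) is my_bar

# ... but a plain Bar, as a third-party API would return it, does not.
try:
    MyBar.__validate__(Bar())
except TypeError as exc:
    rejected = str(exc)
    print("rejected:", rejected)
```

Since objects created by the third-party library itself are plain Bar instances, they would all have to be converted before being passed in.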

Option B: Generic mix-in

To allow specifying which class exactly to validate against, while keeping repetition to a minimum, we need to get creative. An approach I would suggest is making the ValidatorMixin generic in terms of what class to check against. There is a bit of magic you can do to then automatically extract the type argument passed to it and validate against that: (see here for details)

from collections.abc import Callable, Iterator
from typing import Any, Generic, TypeVar, get_args, get_origin

T = TypeVar("T")


class ValidatorMixin(Generic[T]):
    _type_arg: type[T] | None = None

    @classmethod
    def _get_type_arg(cls) -> type[T]:
        if cls._type_arg is not None:
            return cls._type_arg
        raise AttributeError(f"{cls.__name__} is generic; type arg unspecified")

    @classmethod
    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        for base in cls.__orig_bases__:  # type: ignore[attr-defined]
            origin = get_origin(base)
            if origin is None or not issubclass(origin, ValidatorMixin):
                continue
            type_arg = get_args(base)[0]
            if not isinstance(type_arg, TypeVar):
                cls._type_arg = type_arg
                return

    @classmethod
    def __get_validators__(cls) -> Iterator[Callable[..., Any]]:
        yield cls.__validate__

    @classmethod
    def __validate__(cls, v: object) -> T:
        if isinstance(v, cls._get_type_arg()):
            return v
        raise TypeError(f"Not an instance of {cls.__name__}: {v}")
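The extraction step inside `__init_subclass__` may be easier to follow in isolation. The following sketch uses only the standard library. `Box` and `EggsBox` are hypothetical names that play the roles of the generic mix-in and its subclass.

```python
from typing import Generic, TypeVar, get_args, get_origin

T = TypeVar("T")


class Box(Generic[T]):  # hypothetical generic base, for illustration only
    pass


class Eggs:
    ...


class EggsBox(Box[Eggs]):
    pass


# __bases__ only holds the plain class, whereas __orig_bases__ keeps the
# subscripted form Box[Eggs] exactly as it was written in the class statement.
base = EggsBox.__orig_bases__[0]
origin = get_origin(base)  # the unsubscripted generic class
args = get_args(base)      # the tuple of type arguments

print(EggsBox.__bases__)
print(origin, args)
```

The mix-in applies the same two calls to every entry of `__orig_bases__` and stores the first concrete argument it finds.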

Now you can use it like this:

from pydantic import validate_arguments

# ... import ValidatorMixin


class Bar:  # from foo
    ...


class Eggs:  # from spam
    ...


class MyClass(ValidatorMixin[Eggs], Bar, Eggs):
    pass


@validate_arguments
def f(obj: MyClass) -> None:
    print(obj)


f(Eggs())

As you can see, this now also allows you to use multiple inheritance, yet still specify exactly what to validate against. That f call in the last line will therefore pass validation.

There is however still one flaw with this. Static type checkers will complain about that call because Eggs is not a subtype of MyClass (it is the other way around), so that call would be seen as an error.

Option C: Monkey-patch the class itself

If that bothers you, the only other reasonable alternative in my opinion would be to just monkey-patch the classes you want to use instead of subclassing and mixing. Something like this:

from collections.abc import Callable, Iterable, Iterator
from typing import Any, TypeVar

T = TypeVar("T")


def add_validation(
    cls: type[T],
    validators: Iterable[Callable[..., Any]] = (),
    force_patch: bool = False,
) -> type[T]:
    method_name = "__get_validators__"
    if not force_patch and hasattr(cls, method_name):
        raise AttributeError(f"{cls.__name__} already has `{method_name}`")
    if not validators:
        def __validate__(v: object) -> T:
            if isinstance(v, cls):
                return v
            raise TypeError(f"Not an instance of {cls.__name__}: {v}")
        validators = (__validate__, )

    def __get_validators__(_cls: type) -> Iterator[Callable[..., Any]]:
        yield from validators
    setattr(cls, method_name, classmethod(__get_validators__))
    return cls

Now this can be used both as a simple function for monkey-patching some third-party class and as a decorator for your own classes.

(You are also free to add other validation functions via the optional validators argument.)

Usage:

from pydantic import validate_arguments

# ... import add_validation


class Bar:  # from foo
    ...


add_validation(Bar)


@add_validation
class SomeCustomClass:
    ...


@validate_arguments
def f(bar: Bar, obj: SomeCustomClass) -> None:
    print(bar, obj)


f(Bar(), SomeCustomClass())
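The optional validators argument mentioned above can be tried out as follows. Note that, as written, add_validation installs its default isinstance check only when validators is empty, so custom validators replace that check rather than extend it. The sketch repeats the function so that it runs on its own. `Interval`, `is_interval` and `is_ordered` are hypothetical names, and the final loop assumes that validators are applied in order, each receiving the previous result.

```python
from collections.abc import Callable, Iterable, Iterator
from typing import Any, TypeVar

T = TypeVar("T")


def add_validation(
    cls: type[T],
    validators: Iterable[Callable[..., Any]] = (),
    force_patch: bool = False,
) -> type[T]:
    # Copy of the function above, repeated to keep the sketch self-contained.
    method_name = "__get_validators__"
    if not force_patch and hasattr(cls, method_name):
        raise AttributeError(f"{cls.__name__} already has `{method_name}`")
    if not validators:
        def __validate__(v: object) -> T:
            if isinstance(v, cls):
                return v
            raise TypeError(f"Not an instance of {cls.__name__}: {v}")
        validators = (__validate__, )

    def __get_validators__(_cls: type) -> Iterator[Callable[..., Any]]:
        yield from validators
    setattr(cls, method_name, classmethod(__get_validators__))
    return cls


class Interval:  # hypothetical third-party class
    def __init__(self, low: float, high: float) -> None:
        self.low, self.high = low, high


def is_interval(v: object) -> Interval:
    # The type check has to be supplied explicitly once validators are passed.
    if isinstance(v, Interval):
        return v
    raise TypeError(f"Not an instance of Interval: {v}")


def is_ordered(v: Interval) -> Interval:
    if v.low <= v.high:
        return v
    raise ValueError("low must not exceed high")


add_validation(Interval, validators=(is_interval, is_ordered))

checks = list(Interval.__get_validators__())
print([check.__name__ for check in checks])

# Apply the checks in order, as a validation run is assumed to do.
value: Any = Interval(1, 2)
for check in checks:
    value = check(value)
```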

If you want, you can make a more sophisticated version of that decorator allowing usage with or without additional arguments:

from collections.abc import Callable, Iterable, Iterator
from typing import Any, TypeVar, overload

T = TypeVar("T")


@overload
def add_validation(
    cls: None = None,
    *,
    validators: Iterable[Callable[..., Any]] = (),
    force_patch: bool = False,
) -> Callable[[type[T]], type[T]]: ...


@overload
def add_validation(cls: type[T]) -> type[T]: ...


def add_validation(
    cls: type[T] | None = None,
    *,
    validators: Iterable[Callable[..., Any]] = (),
    force_patch: bool = False,
) -> type[T] | Callable[[type[T]], type[T]]:
    def decorator(cls_: type[T]) -> type[T]:
        method_name = "__get_validators__"
        if not force_patch and hasattr(cls_, method_name):
            raise AttributeError(f"{cls_.__name__} already has `{method_name}`")
        # Keep a per-class tuple instead of rebinding the outer `validators`,
        # so that reusing one decorator on several classes does not leak the
        # first class's default check into the others.
        cls_validators = tuple(validators)
        if not cls_validators:
            def __validate__(v: object) -> T:
                if isinstance(v, cls_):
                    return v
                raise TypeError(f"Not an instance of {cls_.__name__}: {v}")
            cls_validators = (__validate__, )

        def __get_validators__(_cls: type) -> Iterator[Callable[..., Any]]:
            yield from cls_validators
        setattr(cls_, method_name, classmethod(__get_validators__))
        return cls_
    return decorator if cls is None else decorator(cls)
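Both call forms of such a decorator can be exercised with the sketch below. It uses a condensed variant of the function (no overloads and no validators argument, which is a simplification made here for brevity) and the hypothetical classes `Spam` and `Ham`. Because `Ham` inherits `__get_validators__` from `Spam`, the hasattr guard would reject it unless force_patch is set.

```python
from collections.abc import Callable, Iterator
from typing import Any, Optional


def add_validation(cls: Optional[type] = None, *, force_patch: bool = False) -> Any:
    # Condensed variant of the decorator above so that the sketch runs on its own.
    def decorator(cls_: type) -> type:
        if not force_patch and hasattr(cls_, "__get_validators__"):
            raise AttributeError(f"{cls_.__name__} already has `__get_validators__`")

        def __validate__(v: object) -> object:
            if isinstance(v, cls_):
                return v
            raise TypeError(f"Not an instance of {cls_.__name__}: {v}")

        def __get_validators__(_cls: type) -> Iterator[Callable[..., Any]]:
            yield __validate__

        setattr(cls_, "__get_validators__", classmethod(__get_validators__))
        return cls_

    return decorator if cls is None else decorator(cls)


@add_validation  # bare form: the class is passed in directly
class Spam:
    ...


@add_validation(force_patch=True)  # called form: returns the actual decorator
class Ham(Spam):  # inherits the patched method, so patching must be forced
    ...


(spam_check,) = Spam.__get_validators__()
(ham_check,) = Ham.__get_validators__()

spam_check(Ham())  # accepted: a Ham is also a Spam
try:
    ham_check(Spam())  # rejected: a plain Spam is not a Ham
except TypeError as exc:
    print(exc)
```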

huangapple
  • Published on May 28, 2023, 13:05:57
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76350008.html