January 9, 2023 04:49:35

Is there a way to convert a file path field to a parsed model in-place?

Question

I have two models. The second has a file path field that references a file whose contents are described by the first model. Is it possible to expand the file contents in place (replace the file path with the parsed model)?

Sample models:

from pydantic import BaseModel, FilePath


class FirstModel(BaseModel):
    str_data: str
    num_list: list[int | float]


class SecondModel(BaseModel):
    some_other_field: str
    first_model: FilePath

Sample data:

{
  "str_data": "Some string data up in here",
  "num_list": [1, 2, 3.14]
}

Desired result:

>>> SecondModel(some_other_field="Other field data", first_model="path/to/data.json")
SecondModel(some_other_field="Other field data", first_model=FirstModel(str_data="Some string data up in here", num_list=[1, 2, 3.14]))

So initially I would like the first model field to be expressed as a file path, but then parsed and the field set to type FirstModel. Is this possible?

I've tried different approaches using validators, subclassing the first model, and custom root types.

Answer 1

Score: 0

First of all, the field type should reflect what you actually want to end up with after you parse data with your model. So the annotation for first_model should not be FilePath, but FirstModel.

You can then still initialize SecondModel "normally" by passing first_model either a dictionary with the correct key-value pairs or an actual instance of FirstModel. But you can also write a custom field validator with pre=True that handles the case where someone provides a file path instead of "valid" data.

There are a few ways to achieve this. The simplest approach I can think of is to first assume that the value is a valid file path that can be opened and read. If that succeeds, we can assume the contents can be parsed directly via FirstModel. If it fails, we just return the value unchanged and let the default validators take care of the rest.

Assume we have the following data in a file called test.json in our current working directory:

{
  "str_data": "foo",
  "num_list": [1, 2, 3.14]
}

Here is a working implementation:

from pathlib import Path

from pydantic import BaseModel, validator


class FirstModel(BaseModel):
    str_data: str
    num_list: list[float]


class SecondModel(BaseModel):
    some_other_field: str
    first_model: FirstModel

    @validator("first_model", pre=True)
    def load_json_to_first_model(cls, v: object) -> object:
        try:
            contents = Path(str(v)).read_text()
        except (TypeError, OSError):
            return v
        return FirstModel.parse_raw(contents)


if __name__ == "__main__":
    obj = SecondModel.parse_obj({
        "some_other_field": "bar",
        "first_model": "test.json",
    })
    print(obj)

Output:

some_other_field='bar' first_model=FirstModel(str_data='foo', num_list=[1.0, 2.0, 3.14])

If we provide an invalid path or the file cannot be opened, the error we get will simply come from the default validator telling us that first_model is not a valid dictionary. You can customize this further in your custom validator if you want, for example by differentiating how you handle PermissionError and FileNotFoundError instead of catching the base OSError.
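As a rough sketch of that kind of differentiation, the file-reading step could be moved into a small helper. The name read_text_if_file and the choice to raise on PermissionError are assumptions made for illustration; neither comes from Pydantic or from the code above.

```python
from pathlib import Path
from typing import Optional


def read_text_if_file(value: object) -> Optional[str]:
    """Return the file's text if `value` names a readable file, else None.

    Hypothetical helper for illustration; it is not part of Pydantic.
    """
    try:
        return Path(str(value)).read_text()
    except FileNotFoundError:
        # No such file: the value is probably a dict or a model instance,
        # so signal "not a file" and let the default validation judge it.
        return None
    except PermissionError as exc:
        # The path was found but could not be read: report that explicitly
        # instead of falling through to a confusing "not a valid dict" error.
        raise ValueError(f"cannot read {value!r}: permission denied") from exc
    except (TypeError, ValueError, OSError):
        # Any other failure (for example a directory on POSIX, or an
        # embedded null byte): treat the value as "not a file".
        return None
```

Inside the validator, the result could then be used as `contents = read_text_if_file(v)`, returning `v` unchanged when `contents` is None and `FirstModel.parse_raw(contents)` otherwise. A ValueError raised inside a Pydantic validator is reported as a validation error for that field.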

Side note: in Python's type system, an annotation of float implicitly accepts int as well, even though int is technically not a subclass of float, so the union float | int is equivalent to float. This means you can omit the int. All values will then be cast to float. (See the Pydantic documentation on that matter.)
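The implementation above is written against the Pydantic v1-style API (validator, parse_raw, parse_obj). As a minimal sketch, assuming Pydantic v2 is installed, the same idea might look like the following. The v2 names used here (field_validator, model_validate_json, model_validate) are not part of the original answer.

```python
from pathlib import Path

from pydantic import BaseModel, field_validator


class FirstModel(BaseModel):
    str_data: str
    num_list: list[float]  # ints in the input are accepted and converted


class SecondModel(BaseModel):
    some_other_field: str
    first_model: FirstModel

    # mode="before" plays the role of pre=True in the v1-style validator:
    # it runs on the raw input, before the FirstModel validation.
    @field_validator("first_model", mode="before")
    @classmethod
    def load_json_to_first_model(cls, v: object) -> object:
        try:
            contents = Path(str(v)).read_text()
        except (TypeError, OSError):
            # Not a readable file: leave the value for default validation.
            return v
        return FirstModel.model_validate_json(contents)
```

With the test.json file described earlier, calling SecondModel.model_validate({"some_other_field": "bar", "first_model": "test.json"}) should yield a SecondModel whose first_model is a parsed FirstModel, with the integers in num_list converted to floats.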
