Is there a way to convert a file path field to a parsed model in-place?
Question
Suppose I have two models, the second of which has a file path field referencing a file whose contents are described by the first model. Is it possible to expand the file contents in place (that is, replace the file path with the parsed model)?
Sample models:
from pydantic import BaseModel, FilePath

class FirstModel(BaseModel):
    str_data: str
    num_list: list[int | float]

class SecondModel(BaseModel):
    some_other_field: str
    first_model: FilePath
Sample data:
{
    "str_data": "Some string data up in here",
    "num_list": [1, 2, 3.14]
}
Desired result:
>>> SecondModel(some_other_field="Other field data", first_model="path/to/data.json")
SecondModel(some_other_field="Other field data", first_model=FirstModel(str_data="Some string data up in here", num_list=[1, 2, 3.14]))
So initially I would like the first model field to be expressed as a file path, but then have it parsed and the field set to type FirstModel. Is this possible?
I've tried different approaches using validators, subclassing the first model, and custom root types.
Answer 1
Score: 0
First of all, the field type should reflect what you actually want to end up with after you parse data with your model. So the annotation for first_model should not be FilePath, but FirstModel.
Then it is still possible to "normally" initialize SecondModel by providing either a dictionary with the correct key-value pairs to first_model or an actual instance of FirstModel. But you can also write a custom field validator with pre=True that takes care of the case where someone provides a file path instead of "valid" data.
There are a few ways to achieve this. The simplest approach that I can think of is to first assume that the value is a valid file path that can be opened and read. If that succeeds, we can assume the contents can be directly parsed via FirstModel. If it fails, we just return the value unchanged and let the default validators take care of the rest.
Assume we have the following data in a file called test.json in our current working directory:
{
    "str_data": "foo",
    "num_list": [1, 2, 3.14]
}
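Here is a working implementation:
from pathlib import Path

# Pydantic v1 API; v2 renames these to field_validator(..., mode="before")
# and model_validate_json / model_validate.
from pydantic import BaseModel, validator

class FirstModel(BaseModel):
    str_data: str
    num_list: list[float]

class SecondModel(BaseModel):
    some_other_field: str
    first_model: FirstModel

    # pre=True runs this before the default FirstModel validation.
    @validator("first_model", pre=True)
    def load_json_to_first_model(cls, v: object) -> object:
        try:
            # Optimistically treat the value as a path to a readable file.
            contents = Path(str(v)).read_text()
        except (TypeError, OSError):
            # Not a readable file: hand the value to the default validators.
            return v
        return FirstModel.parse_raw(contents)

if __name__ == "__main__":
    obj = SecondModel.parse_obj({
        "some_other_field": "bar",
        "first_model": "test.json",
    })
    print(obj)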
Output:
some_other_field='bar' first_model=FirstModel(str_data='foo', num_list=[1.0, 2.0, 3.14])
If we provide an invalid path or the file cannot be opened, the error we get will simply come from the default validator, telling us that first_model is not a valid dictionary. You can customize this further in your custom validator if you want, for example by handling PermissionError and FileNotFoundError differently instead of catching the base OSError.
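One way to separate those two error cases is sketched below using only the standard library. The helper name load_text_or_none and the choice to turn a permission problem into a ValueError are assumptions made for illustration; they are not part of the original answer.

```python
from pathlib import Path
from typing import Optional


def load_text_or_none(value: object) -> Optional[str]:
    """Return the text of the file named by `value`, or None if it is not a file.

    Hypothetical helper, not part of the original answer.
    """
    if not isinstance(value, (str, Path)):
        # Dicts and model instances are not paths; nothing to load.
        return None
    try:
        return Path(value).read_text()
    except FileNotFoundError:
        # No such file: let the default validators judge the raw value.
        return None
    except PermissionError as exc:
        # The file exists but is unreadable: surface that explicitly.
        raise ValueError(f"cannot read {value}: permission denied") from exc
    except OSError:
        # Any other I/O problem: fall through, as the original validator does.
        return None
```

Inside the validator, one could call this helper, return FirstModel.parse_raw(text) when it yields text, and return v unchanged when it yields None. A ValueError raised inside a Pydantic validator is reported as a validation error for that field, so the permission problem would show up with its own message rather than as "not a valid dictionary".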
Side note: a type union of float | int reduces to float for type-checking purposes in Python, even though there is technically no subclass relationship between the two. This means you can omit the int. All values will then be cast to float. (See the Pydantic documentation on that matter.)
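A quick standard-library check of both halves of that note. The plain float() call here is only a stand-in for the coercion Pydantic applies to a list[float] field; it is not Pydantic itself.

```python
# int is not a subclass of float at runtime...
int_is_float_subclass = issubclass(int, float)
print(int_is_float_subclass)

# ...but every int converts cleanly to float, which mirrors what a
# list[float] field does with the sample data.
values = [1, 2, 3.14]
as_floats = [float(x) for x in values]
print(as_floats)
```

The second print shows the same 1.0, 2.0, 3.14 values that appear in the output of the implementation.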