Programmatically measure database query complexity in Python SQLAlchemy
Question
Is this possible in Python/SQLAlchemy?
When I write an endpoint that retrieves a list of records, I might accidentally make my query very inefficient without realizing it.
Is there a way to measure the complexity of database queries in a method or unit test and raise an error if too many queries are executed?
In my example, I am using strawberry to provide a GraphQL router. On more than one occasion, I've made the following mistake, which causes an additional database query for each `ParentModel` in the list to retrieve its `ChildModel` records. To get around this, I can load `ChildModel` eagerly in the initial query. I would like it to be very obvious to me when a method will result in a large number of database queries.

import strawberry

# ChildModel, ParentModel, and session are assumed to be defined elsewhere
# (SQLAlchemy models and a Session).


@strawberry.type
class ChildGQLSchema:
    id: int

    @classmethod
    def from_model(cls, model: ChildModel):
        return cls(
            id=model.id,
        )


@strawberry.type
class ParentGQLSchema:
    id: int

    @classmethod
    def from_model(cls, model: ParentModel):
        return cls(id=model.id)

    @strawberry.field
    def children(
        self, info, page: int = 1, limit: int = 20
    ) -> list[ChildGQLSchema]:
        # Unless explicitly loading the children, this will result in a
        # query to the database for each parent.
        models = (
            session.query(ChildModel)
            .filter(ChildModel.parent_id == self.id)
            .all()
        )
        return [ChildGQLSchema.from_model(model) for model in models]


@strawberry.type
class Query:
    @strawberry.field
    def parent(self, info, id: int) -> ParentGQLSchema | None:
        model = session.query(ParentModel).filter(ParentModel.id == id).first()
        if not model:
            return None
        return ParentGQLSchema.from_model(model)

</details>
# Answer 1
**Score**: 2
You're describing an example of the N+1 query problem. You'll find plenty of resources and examples online using that as a search term.
Libraries like https://github.com/jmcarp/nplusone can help detect these for you.
It's still a good idea to learn more about N+1 queries, because libraries like that can usually only catch the easy or obvious cases.
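As a rough sketch of what such a check could look like in a unit test, SQLAlchemy's `before_cursor_execute` engine event can be used to count the statements sent to the database. This is not how nplusone works internally; it is only one possible approach. The helper `assert_max_queries`, the `TooManyQueries` exception, the two loader functions, and the minimal `ParentModel`/`ChildModel` definitions are hypothetical names made up for illustration, and an in-memory SQLite database is assumed.

```python
from contextlib import contextmanager

from sqlalchemy import Column, ForeignKey, Integer, create_engine, event
from sqlalchemy.orm import Session, declarative_base, relationship, selectinload

Base = declarative_base()


# Minimal stand-ins for the models in the question (assumed, not the real ones).
class ParentModel(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("ChildModel")


class ChildModel(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))


class TooManyQueries(AssertionError):
    """Raised when a block issues more statements than allowed."""


@contextmanager
def assert_max_queries(engine, limit):
    """Record every statement executed on `engine` inside the block and
    raise TooManyQueries afterwards if there were more than `limit`."""
    statements = []

    def record(conn, cursor, statement, parameters, context, executemany):
        statements.append(statement)

    event.listen(engine, "before_cursor_execute", record)
    try:
        yield statements
    finally:
        event.remove(engine, "before_cursor_execute", record)
    if len(statements) > limit:
        raise TooManyQueries(
            f"{len(statements)} statements issued, limit is {limit}"
        )


def child_ids_lazy(session):
    # N+1 pattern: one query for the parents, then one per parent.
    parents = session.query(ParentModel).all()
    return [[child.id for child in parent.children] for parent in parents]


def child_ids_eager(session):
    # Children are fetched up front with a single extra SELECT ... IN query.
    parents = (
        session.query(ParentModel)
        .options(selectinload(ParentModel.children))
        .all()
    )
    return [[child.id for child in parent.children] for parent in parents]


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
with Session(engine) as setup_session:
    setup_session.add_all(
        [ParentModel(children=[ChildModel(), ChildModel()]) for _ in range(5)]
    )
    setup_session.commit()

# The eager version stays within the budget.
with assert_max_queries(engine, limit=2):
    with Session(engine) as session:
        child_ids_eager(session)

# The lazy version exceeds it, so the helper raises.
try:
    with assert_max_queries(engine, limit=2):
        with Session(engine) as session:
            child_ids_lazy(session)
except TooManyQueries as exc:
    print(f"caught: {exc}")
```

In a test suite, the same context manager could wrap a call to the resolver or endpoint under test, with the limit chosen per test. The session is opened inside the counting block so that its connection is created after the listener is registered.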