Programmatically measure database query complexity in Python SQLAlchemy
Question
Is this possible in Python/SQLAlchemy?

When I write an endpoint that retrieves a list of records, I might accidentally make my query very inefficient without realizing it.

Is there a way to measure the complexity of database queries in a method or unit test and raise an error if too many queries are issued?

In my example, I am using strawberry to provide a GraphQL router. On more than one occasion, I've made the following mistake, in which an additional database query is made for each ParentModel in the list to retrieve its ChildModel records. To get around this, I can load the ChildModel records eagerly in the initial query. I would like it to be very obvious to me when a method will result in a large number of database queries.

```python
import strawberry

# ChildModel, ParentModel and session come from the application's SQLAlchemy setup.


@strawberry.type
class ChildGQLSchema:
    id: int

    @classmethod
    def from_model(cls, model: ChildModel):
        return cls(
            id=model.id,
        )


@strawberry.type
class ParentGQLSchema:
    id: int

    @classmethod
    def from_model(cls, model: ParentModel):
        return cls(id=model.id)

    @strawberry.field
    def children(
        self, info, page: int = 1, limit: int = 20
    ) -> list[ChildGQLSchema]:
        # Unless explicitly loading the children, this will result in a
        # query to the database for each parent.
        models = (
            session.query(ChildModel)
            .filter(ChildModel.parent_id == self.id)
            .order_by(ChildModel.id)
            .offset((page - 1) * limit)
            .limit(limit)
            .all()
        )
        return [ChildGQLSchema.from_model(model) for model in models]


@strawberry.type
class Query:
    @strawberry.field
    def parent(self, info, id: int) -> ParentGQLSchema | None:
        model = session.query(ParentModel).filter(ParentModel.id == id).first()
        if not model:
            return None
        return ParentGQLSchema.from_model(model)
```
# Answer 1

**Score**: 2

You're describing the N+1 query problem. Searching for that term turns up plenty of resources and examples.

Libraries such as https://github.com/jmcarp/nplusone can help detect it for you.

It's still worth learning more about N+1 queries, because libraries like that can usually catch only the easy, obvious cases.
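If a detection library is not an option, one way to get the "fail the test when too many queries run" behaviour asked about is to count statements with SQLAlchemy's `before_cursor_execute` engine event. The following is only a minimal sketch under stated assumptions: SQLAlchemy 1.4 or later, an in-memory SQLite database, and simplified stand-ins for `ParentModel` and `ChildModel` with a `children` relationship. The names `max_queries`, `TooManyQueries`, and `load_children` are hypothetical helpers made up for this illustration, not part of SQLAlchemy or nplusone.

```python
from contextlib import contextmanager

from sqlalchemy import Column, ForeignKey, Integer, create_engine, event
from sqlalchemy.orm import Session, declarative_base, relationship, selectinload

Base = declarative_base()


# Simplified stand-ins for the models in the question (assumed shape).
class ParentModel(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("ChildModel")  # lazy-loaded by default


class ChildModel(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))


class TooManyQueries(AssertionError):
    """Raised when a block issues more SQL statements than allowed."""


@contextmanager
def max_queries(engine, limit):
    """Record statements sent through `engine`; raise if more than `limit` ran."""
    statements = []

    def record(conn, cursor, statement, parameters, context, executemany):
        statements.append(statement)

    event.listen(engine, "before_cursor_execute", record)
    try:
        yield statements
    finally:
        event.remove(engine, "before_cursor_execute", record)
    # Only reached when the wrapped block finished without its own exception.
    if len(statements) > limit:
        raise TooManyQueries(f"{len(statements)} statements issued, limit is {limit}")


# Demo data: three parents with two children each.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
with Session(engine) as session:
    for _ in range(3):
        parent = ParentModel()
        parent.children = [ChildModel(), ChildModel()]
        session.add(parent)
    session.commit()


def load_children(eager, limit):
    """Load every parent, touch its children, and return the statement count."""
    with Session(engine) as session, max_queries(engine, limit) as seen:
        query = session.query(ParentModel)
        if eager:
            # One extra SELECT fetches the children of all parents at once.
            query = query.options(selectinload(ParentModel.children))
        for parent in query.all():
            len(parent.children)  # lazy mode: one SELECT per parent (N + 1)
    return len(seen)
```

In a test, the endpoint call would go inside `with max_queries(engine, limit=...)`, so an accidental N+1 pattern fails loudly instead of passing silently. The limit is a budget chosen per endpoint; counting statements says nothing about how expensive each individual query is.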