Programmatically measure database query complexity in Python SQLAlchemy.

huangapple go评论59阅读模式
英文:

Programmatically measure database query complexity in Python SQLAlchemy

问题

Is this possible in python/sqlalchemy?

当我编写一个检索记录列表的端点时,我可能会不小心使我的查询变得非常低效,而不自觉。

是否有一种方法可以在方法/单元测试中测量数据库查询的复杂性,并在发生太多事务时引发错误?

在我的示例中,我正在使用strawberry提供graphql路由。不止一次,我犯了以下错误,每次在列表中检索ParentModel以检索ChildModel时都会进行额外的数据库查询。为了避免这种情况,我可以在初始查询中急切加载ChildModel。我希望能够很明显地知道如果我的方法将导致大量的数据库查询。

import strawberry

@strawberry.type
class ChildGQLSchema:
    id: int

    @classmethod
    def from_model(cls, model: ChildModel):
        return cls(
            id=model.id,
        )

@strawberry.type
class ParentGQLSchema:
    id: int

    @strawberry.field
    def children(
        self, info, page: int = 1, limit: int = 20
    ) -> list[ChildGQLSchema]:
        # 除非显式加载子项,否则每个父项都会导致数据库查询。
        models = (
            session.query(ChildModel)
            .filter(ChildModel.parent_id == self.id)
            .all()
        )

@strawberry.type
class Query:
    @strawberry.field
    def parent(self, info, id: int) -> ParentGQLSchema | None:
        model = session.query(ParentModel).filter(ParentModel.id == id).first()
        if not model:
            return None
        return ParentGQLSchema.from_model(model)
英文:

Is this possible in python/sqlalchemy?

When I write an endpoint which retrieves a list of records, I might accidentally make my query very inefficient without realizing.

Is there a way to measure the complexity of database queries in a method/unit test and throw an error if too many transactions take place?

In my example, I am using strawberry for providing a graphql router. On more than one occasion, I've made the following mistake, which involves an additional database query being made for each ParentModel in the list to retrieve the ChildModel. To get around this, I can make the ChildModel be loaded eagerly in the initial query. I would like to be able to make it very obvious to myself if my method will result in a large number of database queries.

import strawberry


@strawberry.type
class ChildGQLSchema:
    id: int

    @classmethod
    def from_model(cls, model: ChildModel):
        return cls(
            id=model.id,
        )


@strawberry.type
class ParentGQLSchema:
    id: int

    @strawberry.field
    def children(
        self, info, page: int = 1, limit: int = 20
    ) -> list[ChildGQLSchema]:
        # Unless explicitly loading the children, this will result in a
        # query to the database for each parent.
        models = (
            session.query(ChildModel)
            .filter(ChildModel.parent_id == self.id)
            .all()
        )


@strawberry.type
class Query:
    @strawberry.field
    def parent(self, info, id: int) -> ParentGQLSchema | None:
        model = session.query(ParentModel).filter(ParentModel.id == id).first()
        if not model:
            return None
        return ParentGQLSchema.from_model(model)



</details>


# 答案1
**得分**: 2

你正在描述一个N+1查询问题的示例。你可以在网上找到很多关于这个问题的资源和示例,只需使用它作为搜索词。

像https://github.com/jmcarp/nplusone 这样的库可以帮助你检测它们。

了解更多关于N+1的知识仍然是一个好主意,因为像那样的库通常只能捕捉到容易/明显的情况。

<details>
<summary>英文:</summary>

You&#39;re describing an example of a N+1 query problem. You&#39;ll find a lot of resources/examples online using that as a search term.

Libraries like https://github.com/jmcarp/nplusone can help detect them for you.

It&#39;s still a good idea to learn more about N+1&#39;s b/c libraries like that usually can only catch the easy/obvious instances.

</details>



huangapple
  • 本文由 发表于 2023年5月18日 03:21:58
  • 转载请务必保留本文链接:https://go.coder-hub.com/76275555.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定