Limit and order_by in subquery sqlalchemy


Question

I am trying to return, from a FastAPI endpoint, a list of Pydantic models. Each model consists of some attributes of a `Subcategory` SQLAlchemy model plus an attribute that is itself a list of `ShowCommerceToSubcategory` Pydantic models, each matching the corresponding `Commerce` SQLAlchemy model. I want to be able to limit and order (`order_by`) the results of the inner `Commerce` query.

So far I have a class method on my `Category` model with a very simple query that takes a category id and retrieves all the associated subcategories:

```python
@classmethod
def get_subcategory_commerce(cls, id: int, db: Session, number_of_commerces: int = 5):
    subcategories = db.query(Subcategory)\
        .filter(Subcategory.main_category_id == id)\
        .all()
```

My response model looks like this:

```python
class ShowSubcategoryCommerce(SQLModel):
    id: int
    main_category_id: int
    name: str
    description: str | None
    icon: str | None
    commerces: list[ShowCommerceToSubcategory]

    class Config:
        orm_mode = True
```

When I use the class method described above, I get roughly the expected result. However, I want to limit the `ShowCommerceToSubcategory` results to 5 and show them in descending order of an attribute of my `Commerce` SQLAlchemy model called `importance`.

I have tried the following query to achieve this, but it fails: it does not show all the commerces within their respective subcategories.

```python
subq = db.query(Commerce)\
    .join(Commerce.subcategories)\
    .order_by(Subcategory.id, Commerce.importance.desc())\
    .limit(number_of_commerces)\
    .subquery()

subcategories = db.query(Subcategory)\
    .filter(Subcategory.main_category_id == id)\
    .outerjoin(subq, Subcategory.commerces)\
    .options(contains_eager(Subcategory.commerces, alias=subq))\
    .order_by(Subcategory.id, subq.c.importance.desc())\
    .all()
```

Note that there is a many-to-many relationship between the `Commerce` and `Subcategory` tables/models.


# Answer 1
**Score**: 1

The issue you're experiencing is likely due to the way SQL handles `LIMIT` in subqueries. When you specify `LIMIT` in your subquery, it's limiting the total number of `Commerce` objects returned, not the number of `Commerce` objects per `Subcategory`. Thus, if you have multiple `Subcategory` objects, your current query could return all the `Commerce` objects for one `Subcategory` and none for the others.
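
To make the effect of a global `LIMIT` concrete, here is a minimal sketch using Python's built-in `sqlite3` module rather than SQLAlchemy. The `commerce` table, its columns, and the sample rows are hypothetical, and the schema is simplified to one-to-many purely for illustration:

```python
import sqlite3

# Hypothetical, simplified schema: each commerce row points at one subcategory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE commerce (
        id INTEGER PRIMARY KEY,
        subcategory_id INTEGER,
        importance INTEGER
    );
    INSERT INTO commerce (subcategory_id, importance) VALUES
        (1, 10), (1, 9), (1, 8),
        (2, 7), (2, 6), (2, 5);
""")

# LIMIT is applied once to the whole ordered result, not once per subcategory.
rows = conn.execute("""
    SELECT subcategory_id, importance
    FROM commerce
    ORDER BY subcategory_id, importance DESC
    LIMIT 3
""").fetchall()

subcategories_seen = {subcategory_id for subcategory_id, _ in rows}
print(rows)                # every returned row belongs to subcategory 1
print(subcategories_seen)  # subcategory 2 never appears
```

With this sample data, the three rows allowed by `LIMIT 3` are used up by the first subcategory, which matches the symptom described in the question.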

To achieve the result you want, you will need to use a window function to rank the `Commerce` objects within each `Subcategory`, and then filter based on this rank. In SQLAlchemy, you can use `func.row_number()` to do this.
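
As a sketch of the ranking idea in plain SQL, again through `sqlite3` and with hypothetical table names (`commerce`, `commerce_subcategory`) and sample data, the query below partitions by the subcategory id taken from the association table. Keeping that id in the ranked result is an assumption made here so that a commerce belonging to several subcategories is ranked separately in each. Window functions need SQLite 3.25 or newer, which is assumed to be what your Python build bundles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE commerce (id INTEGER PRIMARY KEY, name TEXT, importance INTEGER);
    CREATE TABLE commerce_subcategory (commerce_id INTEGER, subcategory_id INTEGER);
    INSERT INTO commerce VALUES (1, 'a', 10), (2, 'b', 9), (3, 'c', 8), (4, 'd', 7);
    -- Commerce 'a' belongs to both subcategories (many-to-many).
    INSERT INTO commerce_subcategory VALUES (1, 1), (2, 1), (3, 1), (1, 2), (4, 2);
""")

top_n = 2  # hypothetical stand-in for number_of_commerces

# Rank commerces inside each subcategory, then keep only the first top_n ranks.
rows = conn.execute("""
    SELECT subcategory_id, name, rn
    FROM (
        SELECT cs.subcategory_id, c.name, c.importance,
               ROW_NUMBER() OVER (
                   PARTITION BY cs.subcategory_id
                   ORDER BY c.importance DESC
               ) AS rn
        FROM commerce_subcategory AS cs
        JOIN commerce AS c ON c.id = cs.commerce_id
    ) AS ranked
    WHERE rn <= ?
    ORDER BY subcategory_id, rn
""", (top_n,)).fetchall()

print(rows)
```

Each subcategory keeps its own top two commerces, and commerce `a` appears once under each subcategory it belongs to.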

Here is a rough example of how you could adjust your query:

```python
from sqlalchemy import func
from sqlalchemy.orm import contains_eager

# Window function that ranks Commerce objects within each Subcategory
window = func.row_number().over(
    partition_by=Subcategory.id, order_by=Commerce.importance.desc()
).label("rn")

# Subquery that includes the rank alongside the Commerce columns
subq = (
    db.query(Commerce, window)
    .join(Commerce.subcategories)
    .filter(Subcategory.main_category_id == id)
    .subquery()
)

# Query Subcategory objects, join with the subquery, and filter on the rank
subcategories = (
    db.query(Subcategory)
    .outerjoin(subq, Subcategory.commerces)
    .options(contains_eager(Subcategory.commerces, alias=subq))
    .filter(subq.c.rn <= number_of_commerces)
    .order_by(Subcategory.id, subq.c.importance.desc())
    .all()
)
```

This code ranks the `Commerce` objects within each `Subcategory` based on their `importance`, then filters out the `Commerce` objects that don't fall within the top `number_of_commerces`. This way, you should get the top `number_of_commerces` `Commerce` objects for each `Subcategory`.
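
One detail that may be worth checking when filtering on the rank after an outer join: a condition in `WHERE` also removes subcategories that have no commerces at all, because their rank is `NULL`. The sketch below illustrates this in plain SQL via `sqlite3`. The `subcategory` and `ranked` tables are hypothetical, with `ranked` standing in for the ranked subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subcategory (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ranked (subcategory_id INTEGER, name TEXT, rn INTEGER);
    INSERT INTO subcategory VALUES (1, 'food'), (2, 'empty');
    -- Stand-in for the ranked subquery; subcategory 2 has no commerces.
    INSERT INTO ranked VALUES (1, 'a', 1), (1, 'b', 2), (1, 'c', 3);
""")

query = """
    SELECT s.name, r.name
    FROM subcategory AS s
    LEFT JOIN ranked AS r ON r.subcategory_id = s.id {on_extra}
    {where}
    ORDER BY s.id, r.rn
"""

# Rank test in WHERE: the NULL rank of the unmatched subcategory fails the test.
in_where = conn.execute(
    query.format(on_extra="", where="WHERE r.rn <= 2")
).fetchall()

# Rank test in ON: the unmatched subcategory survives with a NULL commerce.
in_on = conn.execute(
    query.format(on_extra="AND r.rn <= 2", where="")
).fetchall()

print(in_where)
print(in_on)
```

If empty subcategories should still be returned, the equivalent change in SQLAlchemy would presumably be to move the rank condition out of `.filter()` and into the join condition. How to express that depends on your models and SQLAlchemy version.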

You might need to adjust the code to fit your model definition/relationships, since I don't have the full context of your models.
