Trying to optimize/improve my query, which returns the number of orders with one column for each filter I apply

Question

I'm trying to get the number of orders for different businesses.

So, I have two tables:

Table1:

Order_Number  Order_Date  Store_ID  Customer_ID  Is_Valid
1             01-01-2020  55        25           Valid
2             24-05-2020  90        34           Invalid
3             03-09-2021  56        25           Valid
4             04-10-2021  210       100          Invalid

Table2:

Order_Number  Order_Date  Store_ID  Customer_ID  Is_Valid  Product     IIM
1             01-01-2020  55        25           Valid     Coca-Cola   False
1             01-01-2020  55        25           Valid     some drink  False
1             01-01-2020  55        25           Valid     some drink  True
1             01-01-2020  55        25           Valid     some drink  True
2             24-05-2020  90        34           Invalid   some drink  False
3             03-09-2021  56        25           Valid     some drink  True
3             03-09-2021  56        25           Valid     some drink  True
4             04-10-2021  210       100          Invalid   some drink  True
4             04-10-2021  210       100          Invalid   some drink  False

Table 2 is very similar to table 1; the difference is that table 2 has one row for each product in an order.
It also has the IIM column: if at least one product in an order has IIM = True, the entire order is considered IIM = True.

Here comes my question/problem:

I want, in a single query, to get the number of orders where:

  • Store_ID = 25
  • Store_ID = 25 and IIM = True
  • Store_ID is not 1, 2, or 3
  • Store_ID is not 1, 2, or 3 and IIM = True

(Only taking into account valid orders)

To achieve this, I did this:

SELECT
    yearmonth,
    date_,
    SUM(cant_store_25) AS cant_store_25,
    SUM(cant_store_25_iim) AS cant_store_25_iim,
    SUM(cant_store_plus) AS cant_store_plus,
    SUM(cant_store_plus_iim) AS cant_store_plus_iim
FROM
    (SELECT
        yearmonth,
        date_,
        order_,
        MAX(cant_store_25) AS cant_store_25,
        MAX(cant_store_25_iim) AS cant_store_25_iim,
        MAX(cant_store_plus) AS cant_store_plus,
        MAX(cant_store_plus_iim) AS cant_store_plus_iim
    FROM
        (SELECT DISTINCT
            t1.Order_Number AS order_,
            t1.Order_Date::DATE AS date_,
            YEAR(t1.Order_Date::DATE) * 100 + MONTH(t1.Order_Date::DATE) AS yearmonth,
            CASE WHEN t1.Store_ID = 25 THEN 1 ELSE 0 END AS cant_store_25,
            CASE WHEN t1.Store_ID = 25 AND IIM = TRUE THEN 1 ELSE 0 END AS cant_store_25_iim,
            CASE WHEN t1.Store_ID NOT IN (1, 2, 3) THEN 1 ELSE 0 END AS cant_store_plus,
            CASE WHEN t1.Store_ID NOT IN (1, 2, 3) AND IIM THEN 1 ELSE 0 END AS cant_store_plus_iim,
            MAX(IIM) AS IIM_
        FROM
            Table1 t1
        LEFT JOIN
            Table2 t2 ON t1.Customer_ID = t2.Customer_ID AND t1.Order_Number = t2.Order_Number
        WHERE
            t1.Is_Valid = 'Valid'
        GROUP BY
            t1.Order_Number, date_, yearmonth, t1.Store_ID, IIM)
    GROUP BY
        1, 2, 3)
WHERE
    yearmonth = 202208
GROUP BY
    1, 2;

So, is there a better way to achieve this? My database is small, so this query works for me, but maybe I can improve several things to make it more efficient. Honestly, I'm new and I want to learn, in case I have to work with larger databases in the future.

I don't know if it's relevant, but I use Snowflake

Thanks!

Answer 1

Score: 0

It is difficult to check, but the two innermost queries can be replaced with something simpler, like the following:

-- One row per valid order in August 2022; each flag is 1 if any joined row matches
SELECT t1.Order_Number AS order_,
       t1.Order_Date::DATE AS date_,
       MAX(CASE WHEN t1.Store_ID = 25 THEN 1 ELSE 0 END) AS cant_store_25,
       MAX(CASE WHEN t1.Store_ID = 25 AND t2.IIM = TRUE THEN 1 ELSE 0 END) AS cant_store_25_iim,
       MAX(CASE WHEN t1.Store_ID NOT IN (1, 2, 3) THEN 1 ELSE 0 END) AS cant_store_plus,
       MAX(CASE WHEN t1.Store_ID NOT IN (1, 2, 3) AND t2.IIM = TRUE THEN 1 ELSE 0 END) AS cant_store_plus_iim
FROM Table1 t1
LEFT JOIN Table2 t2
    ON t1.Customer_ID = t2.Customer_ID
    AND t1.Order_Number = t2.Order_Number
WHERE t1.Is_Valid = 'Valid'
    AND YEAR(t1.Order_Date::DATE) = 2022
    AND MONTH(t1.Order_Date::DATE) = 8
GROUP BY t1.Order_Number,
         date_

The outer query then just SUMs these columns. The main idea is that you can evaluate the conditions for each order earlier, by applying MAX directly to the CASE expressions. The good part is that we do not need DISTINCT, because we are already grouping.
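To make the "outer query just sums" step concrete, here is a minimal runnable sketch of the complete two-level query. It is an illustration under assumptions, not the asker's actual setup: it runs on SQLite through Python's sqlite3 module rather than on Snowflake, so strftime stands in for YEAR() and MONTH(), IIM is stored as 0/1, and every row of data is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (Order_Number INTEGER, Order_Date TEXT, Store_ID INTEGER,
                     Customer_ID INTEGER, Is_Valid TEXT);
CREATE TABLE Table2 (Order_Number INTEGER, Customer_ID INTEGER,
                     Product TEXT, IIM INTEGER);

-- Hypothetical rows, invented for illustration only
INSERT INTO Table1 VALUES
  (1, '2022-08-01', 25, 10, 'Valid'),
  (2, '2022-08-01', 25, 11, 'Valid'),
  (3, '2022-08-01', 40, 12, 'Valid'),
  (4, '2022-08-01',  2, 13, 'Valid'),
  (5, '2022-08-02', 25, 14, 'Invalid'),
  (6, '2022-07-31', 25, 15, 'Valid'),
  (7, '2022-08-02', 40, 16, 'Valid');
INSERT INTO Table2 VALUES
  (1, 10, 'Coca-Cola', 0), (1, 10, 'some drink', 1),
  (2, 11, 'some drink', 0),
  (3, 12, 'some drink', 1), (3, 12, 'some drink', 1),
  (4, 13, 'some drink', 1),
  (5, 14, 'some drink', 1),
  (6, 15, 'some drink', 1),
  (7, 16, 'some drink', 0);
""")

QUERY = """
SELECT date_,
       SUM(cant_store_25)       AS cant_store_25,
       SUM(cant_store_25_iim)   AS cant_store_25_iim,
       SUM(cant_store_plus)     AS cant_store_plus,
       SUM(cant_store_plus_iim) AS cant_store_plus_iim
FROM (
    -- Inner level: one row per order; MAX turns "any product row matches" into 1
    SELECT t1.Order_Number AS order_,
           t1.Order_Date   AS date_,
           MAX(CASE WHEN t1.Store_ID = 25 THEN 1 ELSE 0 END) AS cant_store_25,
           MAX(CASE WHEN t1.Store_ID = 25 AND t2.IIM = 1 THEN 1 ELSE 0 END) AS cant_store_25_iim,
           MAX(CASE WHEN t1.Store_ID NOT IN (1, 2, 3) THEN 1 ELSE 0 END) AS cant_store_plus,
           MAX(CASE WHEN t1.Store_ID NOT IN (1, 2, 3) AND t2.IIM = 1 THEN 1 ELSE 0 END) AS cant_store_plus_iim
    FROM Table1 t1
    LEFT JOIN Table2 t2
           ON t1.Customer_ID = t2.Customer_ID
          AND t1.Order_Number = t2.Order_Number
    WHERE t1.Is_Valid = 'Valid'
      -- SQLite stand-ins for YEAR(...) = 2022 AND MONTH(...) = 8
      AND strftime('%Y', t1.Order_Date) = '2022'
      AND strftime('%m', t1.Order_Date) = '08'
    GROUP BY t1.Order_Number, t1.Order_Date
) AS per_order
GROUP BY date_
ORDER BY date_
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Because the inner level already yields exactly one row per order, the outer SUM counts orders rather than product rows, which is why no DISTINCT is needed.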

Also, I have removed MAX(IIM) AS IIM_ because it is not used, and moved the year-month filter earlier. Since you are filtering for a particular year-month, there is no point in calculating everything and then keeping only that month.
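The effect of moving the filter can be checked on a toy example. The sketch below is an assumption-laden illustration (SQLite via Python's sqlite3, invented rows, a simplified single-table count) that compares filtering after aggregation with filtering before it. The early version uses a half-open date range on the raw column, which is one common way to write such a filter; whether it is faster than YEAR()/MONTH() on Snowflake would have to be measured on real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (Order_Number INTEGER, Order_Date TEXT,
                     Store_ID INTEGER, Is_Valid TEXT);
-- Hypothetical rows, invented for illustration only
INSERT INTO Table1 VALUES
  (1, '2022-07-31', 25, 'Valid'),
  (2, '2022-08-01', 25, 'Valid'),
  (3, '2022-08-01', 40, 'Valid'),
  (4, '2022-08-15', 25, 'Valid'),
  (5, '2022-09-01', 25, 'Valid');
""")

# Late filter: aggregate every day first, then keep only August 2022
# (the shape of the query in the question).
late = conn.execute("""
    SELECT date_, cant_store_25
    FROM (SELECT Order_Date AS date_,
                 CAST(strftime('%Y%m', Order_Date) AS INTEGER) AS yearmonth,
                 SUM(CASE WHEN Store_ID = 25 THEN 1 ELSE 0 END) AS cant_store_25
          FROM Table1
          WHERE Is_Valid = 'Valid'
          GROUP BY Order_Date) AS per_day
    WHERE yearmonth = 202208
    ORDER BY date_
""").fetchall()

# Early filter: discard other months before any grouping happens.
early = conn.execute("""
    SELECT Order_Date AS date_,
           SUM(CASE WHEN Store_ID = 25 THEN 1 ELSE 0 END) AS cant_store_25
    FROM Table1
    WHERE Is_Valid = 'Valid'
      AND Order_Date >= '2022-08-01'
      AND Order_Date <  '2022-09-01'
    GROUP BY Order_Date
    ORDER BY date_
""").fetchall()

print(late)
print(early)
```

Both versions return the same rows; the early filter simply gives the aggregation less data to process.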

huangapple
  • Posted on May 30, 2023 at 11:29:11
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76361452.html