避免在SQL中使用相同条件的多个子查询。

huangapple go评论78阅读模式
英文:

Avoid Multiple subqueries in SQL with the same conditions

问题

我有一个类似这样的调查数据数据库:

```sql
create table Products(
    Id int primary key,
    Name nvarchar(max)
)

create table Opinions(
    Id int primary key,
    ProductId int foreign key references Products(Id),
    IsEasyToUse bit default((0)),
    IsGoodPrice bit default((0)),
    IsDurable bit default((0)),
-- 还有其他列...
)

我想创建一个报告,显示每个产品以及来自Opinions表中所有意见列的百分比。

到目前为止,我一直以天真的方式进行操作:

select 
    p.Name,
    1.0 *
        (select count(*) from Opinions o where o.ProductId = p.Id and o.IsEasyToUse = 1) / 
        (select count(*) from Opinions o where o.ProductId = p.Id) as EasyToUse,
    1.0 *
        (select count(*) from Opinions o where o.ProductId = p.Id and o.IsGoodPrice = 1) /
        (select count(*) from Opinions o where o.ProductId = p.Id) as GoodPrice,
    1.0 *
        (select count(*) from Opinions o where o.ProductId = p.Id and o.IsDurable = 1) /
        (select count(*) from Opinions o where o.ProductId = p.Id) as Durable
from
(select Id, Name from Products) p

这在技术上可以工作,但我不喜欢条件的重复。而且Opinions表的大小正在增长,性能问题开始变得明显(现在在100万行时需要大约1分钟)。

是否有一种更快速(以及更容易修改以便将来应用更多过滤条件)的重写方法?


<details>
<summary>英文:</summary>

I have a DB of survey data that looks like this:

create table Products(
Id int primary key,
Name nvarchar(max)
)

create table Opinions(
Id int primary key,
ProductId int foreign key references Products(Id),
IsEasyToUse bit default((0)),
IsGoodPrice bit default((0)),
IsDurable bit default((0)),
-- and so on...
)


I want to create a report that shows every product and a percentage of all the opinion columns from the `Opinions` table.

So far, I have been doing it the naive way:


select
p.Name,
1.0 *
(select count() from Opinions o where o.ProductId = p.Id and o.IsEasyToUse = 1) /
(select count(
) from Opinions o where o.ProductId = p.Id) as EasyToUse,
1.0 *
(select count() from Opinions o where o.ProductId = p.Id and o.IsGoodPrice = 1) /
(select count(
) from Opinions o where o.ProductId = p.Id) as GoodPrice,
1.0 *
(select count() from Opinions o where o.ProductId = p.Id and o.IsDurable = 1) /
(select count(
) from Opinions o where o.ProductId = p.Id) as Durable
from
(select Id, Name from Products) p


This technically works, but I am not happy with the repetition of the conditions. Also the Opinions table is growing in size so the performance issues are starting to become noticeable (it now takes around a minute with 1m rows). 

Is there a way to rewrite it in a way that&#39;s faster (and easier to modify in the future in case I want to apply more filters)?

</details>


# 答案1
**得分**: 1

The problem isn't just the repetition of conditions. Each subquery will be executed separately.

A better option would be to join the two tables, group by product then calculate the counts. It's not possible to count based on a condition. The same thing can be done using `SUM(IIF(condition,1,0))` though:

```sql
SELECT p.ID, 
       MAX(p.Name) as Name,
       SUM(IIF(IsEasyToUse,1.0,0))/COUNT(*) as EasyToUse,
       SUM(IIF(IsGoodPrice,1.0,0))/COUNT(*) as GoodPrice,
       SUM(IIF(IsDurable,1.0,0))/COUNT(*) as Durable
FROM Products p 
INNER JOIN Opinions o on p.ID=o.ProductID
GROUP BY p.ID
英文:

The problem isn't just the repetition of conditions. Each subquery will be executed separately.

A better option would be to join the two tables, group by product then calculate the counts. It's not possible to count based on a condition. The same thing can be done using SUM(IIF(condition,1,0)) though:

SELECT p.ID, 
       MAX(p.Name) as Name,
       SUM(IIF(IsEasyToUse,1.0,0))/COUNT(*) as EasyToUse,
       SUM(IIF(IsGoodPrice,1.0,0))/COUNT(*) as GoodPrice,
       SUM(IIF(IsDurable,1.0,0))/COUNT(*) as Durable
FROM Products p 
INNER JOIN Opinions o on p.ID=o.ProductID
GROUP BY p.ID

答案2

得分: 1

以下是翻译好的部分:

作为入门:我们可以连接和聚合,而不是执行多个相关子查询。

然后:我们不需要计算总数;相反,我们可以只使用 avg();减少聚合函数的调用应该提高查询的效率:

从产品 p
内部连接意见 o
按 p.id,p.name 分组

根据你的数据如何分布在两个表之间,apply 可能提供了一个很好的替代方法,因为它避免了外部聚合:

从产品 p
交叉应用 (
选择
avg(case when IsEasyToUse = 1 then 1.0 else 0 end) as EasyToUse,
avg(case when IsGoodPrice = 1 then 1.0 else 0 end) as GoodPrice,
avg(case when IsDurable = 1 then 1.0 else 0 end) as Durable
从意见 o
在 o.productID = p.id
) x

英文:

As a starter: we can join and aggregate instead of executing multiples correlated subqueries.

Then: we don't need to compute the total count; instead, we can just juse avg(); less calls to aggregate functions should improve the efficiency of the query:

select p.id, p.name
    avg(case when IsEasyToUse = 1 then 1.0 else 0 end) as EasyToUse,
    avg(case when IsGoodPrice = 1 then 1.0 else 0 end) as GoodPrice,
    avg(case when IsDurable   = 1 then 1.0 else 0 end) as Durable 
from products p
inner join opinions o on o.productID = p.id
group by p.id, p.name

Depending on how your data is spread across the two tables, apply might offer a good alternative, since it avoids outer aggregation:

select p.id, p.name, x.*
from products p
cross apply (
    select 
        avg(case when IsEasyToUse = 1 then 1.0 else 0 end) as EasyToUse,
        avg(case when IsGoodPrice = 1 then 1.0 else 0 end) as GoodPrice,
        avg(case when IsDurable   = 1 then 1.0 else 0 end) as Durable 
    from opinions o 
    where o.productID = p.id
) x

答案3

得分: 0

你可以尝试这个查询 避免在SQL中使用相同条件的多个子查询。

SELECT 
    ProductId = p.Id
,   ProductName = p.Name
,   op.GoodPrice
,   op.EasyToUse
,   op.Durable
FROM    dbo.Products p
INNER JOIN 
(
    SELECT  o.ProductId
        ,   EasyToUse   =   1.0 *  SUM(CASE o.IsEasyToUse WHEN 0 THEN 0 ELSE 1 END ) / COUNT(o.IsEasyToUse)
        ,   GoodPrice   =   1.0 *  SUM(CASE o.IsGoodPrice WHEN 0 THEN 0 ELSE 1 END ) / COUNT(o.IsGoodPrice)
        ,   Durable     =   1.0 *  SUM(CASE o.IsDurable WHEN 0 THEN 0 ELSE 1 END ) / COUNT(o.IsDurable)
    FROM    dbo.Opinions o
    GROUP BY o.ProductId
) op
ON p.Id = op.ProductId
英文:

you could try this query 避免在SQL中使用相同条件的多个子查询。

	SELECT 
		ProductId = p.Id
	,	ProductName = p.Name
	,	op.GoodPrice
	,	op.EasyToUse
	,	op.Durable
	FROM	dbo.Products p
	INNER JOIN 
	(
		SELECT	o.ProductId
			,	EasyToUse	=	1.0 *  SUM(CASE o.IsEasyToUse WHEN 0 THEN 0 ELSE 1 END ) / COUNT(o.IsEasyToUse)
			,	GoodPrice	=	1.0 *  SUM(CASE o.IsGoodPrice WHEN 0 THEN 0 ELSE 1 END ) / COUNT(o.IsGoodPrice)
			,	Durable		=	1.0 *  SUM(CASE o.IsDurable WHEN 0 THEN 0 ELSE 1 END )	/ COUNT(o.IsDurable)
		FROM	dbo.Opinions o
		GROUP BY o.ProductId
	) op
	ON p.Id = op.ProductId

答案4

得分: 0

另一种计算百分比的方法,其中总数只能计算一次:

SELECT Name, 1.0 * EasyToUse/total AS EasyToUse,
             1.0 * GoodPrice/total AS GoodPrice,
             1.0 * Durable/total AS Durable
FROM (
  SELECT p.Name,
    COUNT(CASE WHEN IsEasyToUse = 1 THEN 1 END) AS EasyToUse,
    COUNT(CASE WHEN IsGoodPrice = 1 THEN 1 END) AS GoodPrice,
    COUNT(CASE WHEN IsDurable = 1 THEN 1 END) AS Durable,
    COUNT(p.Id) AS total
  FROM Products p
  INNER JOIN Opinions o on o.ProductID = p.Id
  GROUP BY p.Id, p.Name
) AS S
英文:

Another method for calculating percentages in which the total count can only be calculated once:

SELECT Name, 1.0 * EasyToUse/total AS EasyToUse,
             1.0 * GoodPrice/total AS GoodPrice,
             1.0 * Durable/total AS Durable
FROM (
  SELECT p.Name,
    COUNT(CASE WHEN IsEasyToUse = 1 THEN 1 END) AS EasyToUse,
    COUNT(CASE WHEN IsGoodPrice = 1 THEN 1 END) AS GoodPrice,
    COUNT(CASE WHEN IsDurable = 1 THEN 1 END) AS Durable,
    COUNT(p.Id) AS total
  FROM Products p
  INNER JOIN Opinions o on o.ProductID = p.Id
  GROUP BY p.Id, p.Name
) AS S

huangapple
  • 本文由 发表于 2023年6月26日 15:08:07
  • 转载请务必保留本文链接:https://go.coder-hub.com/76554280.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定