在两个具有一对多关系的表之间筛选数据。

huangapple go评论75阅读模式
英文:

Filter data between two tables having one to many relationship

问题

以下是您的翻译请求:

`Client`表:

[CID]|[Name]|[Age]
-----|------|------
 1   |  ABC |  12
 2   |  ACC |  15
 3   |  BBB |  12

`Status`表:

[SID]|[CID] |[Status]
-----|------|---------
 1   | 1    |Active
 2   | 1    |Not Active
 3   | 1    |Unknown
 4   | 2    |Active
 5   | 3    |Active
 6   | 3    |Unknown

客户表与状态表通过\[CID\]连接。我想要没有任何状态设置为“Not Active”的客户的\[CID\]

这是我的尝试:

SELECT
A.[CID], B.[Status]
FROM
[Client表] AS A
INNER JOIN
[Status表] B ON A.[CID] = B.[CID]
WHERE
B.[Status] = 'Not Active'


在这种情况下,它应该显示CID为2和3的客户,而不显示CID为1的客户。然而,当我添加条件Status = Not Active时,我得到了所有三个CID。 

我应该如何改进我的查询以仅显示CID为2和3,我在这里做错了什么?
英文:

Client table:

[CID] [Name] [Age]
1 ABC 12
2 ACC 15
3 BBB 12

Status table:

[SID] [CID] [Status]
1 1 Active
2 1 Not Active
3 1 Unknown
4 2 Active
5 3 Active
6 3 Unknown

The client table is connected to status table on [CID]. I want the [CID] of the clients who doesn't have any status set to "Not Active".

Here's my attempt:

SELECT 
    A.[CID], B.[Status] 
FROM
    [Client table] AS A
INNER JOIN 
    [Status table] B ON A.[CID] = B.[CID]
WHERE 
    B.[Status] = 'Not Active' 

In this case it should show client with CID 2 & 3 and not show client with CID 1. However, I'm getting all CID 1, 2 and 3 when in putting the condition that Status = Not Active. With the above query I'm getting all the three CID's.

How can I improve my query to show only CID 2 & 3 and what am I doing wrong here?

答案1

得分: 2

Your inner join will returns ALL ROWS matching your criteria so you will get all CID with status that is at least once not active.
你的内连接将返回所有符合条件的行,因此您将获得所有CID的状态至少有一次不活跃。

You should force the join and check for nulls.
您应该强制连接并检查空值。

You could also do a GROUP BY or a COUNT()
您还可以使用GROUP BY或COUNT(
)。

英文:

Your inner join will returns ALL ROWS matching your criteria so you will get all CID with status that is at least once not active.
You should force the join and check for nulls.

SELECT A.[CID], 
       B.[Status] 
FROM [Client table] as A
JOIN [Status table] B ON A.[CID] = B.[CID]
LEFT JOIN [Status table] C ON C.[CID] = B.[CID] AND C.[Status] = 'Not Active' 
WHERE C.[CID] IS NULL

You could also do a GROUP BY or a COUNT(*)

SELECT A.[CID], 
       B.[Status] 
FROM [Client table] as A 
JOIN [Status table] B ON A.[CID] = B.[CID] 
WHERE (SELECT COUNT(*) 
       FROM [Status table] C WHERE A.[CID] = C.[CID] 
                            AND C.[Status] = 'Not Active') = 0

答案2

得分: 2

NOT EXISTS 是执行此操作的最有效方法。

不幸的是,SQL Server 需要子查询中至少有一列,所以您可以使用 SELECT 1SELECT NULL。无论您做什么都会被忽略,它只是在子查询中寻找行的存在(或不存在)。请注意内部 WHERE 中与外部查询的相关性。

如果您实际上想要使用连接查看所有的 Status 行,那么窗口函数可能是最好的选择。

在这里,我们使用了一个窗口函数,它可以同时计算多行,但仍然返回所有行,而不是聚合它们。PARTITION BY 在这里的作用类似于 GROUP BY,而 CASE WHEN 使它在匹配条件的情况下仅计算行数。

英文:

NOT EXISTS is the most efficient way to do this.

Unfortunately, SQL Server requires at least one colum in the subquery, so you can just do SELECT 1 or SELECT NULL. Whatever you do gets ignored, it's just looking for the existence (or not) of a row in the subquery. Note the correlation to the outer query in the inner WHERE.

SELECT
  c.CID
FROM Client c
WHERE NOT EXISTS (SELECT 1
    FROM Status s
    WHERE s.CID = c.CID   -- correlation
      AND s.Status = 'Not Active'
);

If you actually want to see all Status rows using a join then a window function might be the best option.

Here we use a window function which can calculate over multiple rows at once, while still returning all of them rather than aggregating. The PARTITION BY works like GROUP BY here, and the CASE WHEN makes it conditional so it only counts rows which match the condition.

SELECT
  c.CID
FROM Client c
JOIN (
    SELECT s.*,
      countNotActive = COUNT(CASE WHEN s.Status = 'Not Active' THEN 1 END) OVER (PARTITION BY s.CID)
    FROM Status s
) s ON s.CID = c.CID
WHERE s.countNotActive = 0;

答案3

得分: 0

我想要没有任何状态设置为 "Not Active" 的客户的 [CID]。

对我来说,做到这一点最清晰的方法是使用 EXCEPT,这是 SQL 中用于(单边)集合差异的运算符:

SELECT CID FROM Client
EXCEPT
SELECT CID FROM Status WHERE Status = 'Not Active'

这意味着“给我所有在第一个 SELECT 中的 CID,但不在第二个 SELECT 中的 CID”。

现在,如果你需要比仅仅相关客户的ID更多的信息,你可能会想要使用 NOT EXISTS 方法。但如果你只想要ID,这就足够了。

英文:

> I want the [CID] of the clients who doesn't have any status set to "Not Active".

For me the clearest way to do this is to use EXCEPT, the SQL operator for (one-sided) set difference:

SELECT CID FROM Client
EXCEPT
SELECT CID FROM Status WHERE Status = 'Not Active'

This means "give me all the CIDs that are in the first SELECT but not in the second SELECT"

Now, if you need more than just the IDs of the relevant Clients, you'd probably want the NOT EXISTS approach. But if you just want the IDs, this will do.

huangapple
  • 本文由 发表于 2023年5月18日 01:57:43
  • 转载请务必保留本文链接:https://go.coder-hub.com/76274963.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定