2023年7月13日 23:07:55go评论72阅读模式

英文:

Count duplicates using count() and count(distinct)

问题

我不清楚为什么这个SQL查询可以用来计算特定字段的重复值数量：

select sum(country_count)
from (
select Country, count(Country) as country_count
from Customers
group by Country
having count(Country) > 1
);

返回结果为 88

而这个查询却不能：

SELECT count(Country) - count(Distinct Country) 
FROM Customers;

返回结果为 70

另外，为什么它们会产生不同的结果呢？

*FYI，这些查询在W3Schools的trySQL界面中可行。

https://www.w3schools.com/sql/trysql.asp?filename=trysql_asc#:~:text=If%20you%20switch%20to%20a,your%20browser%2C%20for%20testing%20purposes.

我尝试过的方法：

我运行了这些查询，它们给出了不同的结果（分别返回88和70）。我预期它们应该产生相同的结果。

英文:

I am unclear why this SQL query can be used to count duplicate values for a particular field:

select sum(country_count)
from (
select Country, count(Country) as country_count
from Customers
group by Country
having count(Country) &gt; 1
);

returns 88**

while this query cannot:

SELECT count(Country) - count(Distinct Country) 
FROM Customers;

returns 70**

As a side note, why do they give different results?

*FYI, these queries work in the W3Schools trySQL interface.

https://www.w3schools.com/sql/trysql.asp?filename=trysql_asc#:~:text=If%20you%20switch%20to%20a,your%20browser%2C%20for%20testing%20purposes.

What I have tried:

I have run these queries and they give different results (returning 88 vs 70). I expected them to produce the same result.

答案1

得分: 0

考虑你的Customers数据库中的这些行，它们位于Country列中：

印度
印度
印度
美国
美国
澳大利亚

现在考虑你的第一个查询：

select sum(country_count)
from (
select Country, count(Country) as country_count
from Customers
group by Country
having count(Country) > 1
);

首先让我们考虑你正在使用的内部查询：

select Country, count(Country) as country_count
from Customers
group by Country
having count(Country) > 1

这个查询的输出（Output1）将是：

Country	country_count
印度	3
美国	2

现在考虑将外部查询应用于这个输出：

select sum(country_count)
FROM Output1

答案应该是5。

但是对于你的第二个查询：

SELECT count(Country) - count(Distinct Country) 
FROM Customers;

输出应该是6-3 = 3。

所以，这两个查询做了不同的事情。这就是你得到不同结果的原因。

第一个查询：忽略只有1次计数的国家，然后将其他国家的计数相加。

第二个查询：忽略所有不同国家的第一次计数，然后将其余计数相加。

希望这有所帮助。

英文:

Consider these rows in your Customers database in Country column

India
India
India
USA
USA
Australia

Now consider your 1st query:-

select sum(country_count)
from (
select Country, count(Country) as country_count
from Customers
group by Country
having count(Country) &gt; 1
);

First let's consider the inner query that you are using:-

select Country, count(Country) as country_count
from Customers
group by Country
having count(Country) &gt; 1

The output (Output1) of this query will be:-

Country | country_count

India | 3

USA | 2

Now consider applying the outer query on this output:-

select sum(country_count)
FROM Output1

The answer should be 5 for this

But for your 2nd query:-

SELECT count(Country) - count(Distinct Country) 
FROM Customers;

The output should be 6-3 = 3

So, both are doing different things. That's the reason you are getting different outputs.

1st query : Ignore the Countries which have only 1 count and then add up all the counts for the other ones.

2nd query : Ignore the first count from all the distinct Countries and add up rest of the counts.

Hope this helps.

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

使用count()和count(distinct)来计算重复项的数量。

问题

答案1

无法在 MySQL 8.0 中创建表格。

如何在Java中迭代表格行？

如何在golang中使用.Format将time.Time对象转换为格式化字符串？

在PHP Laravel中提交数据到数据库时出现错误。

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论