2023-02-27 01:39:17

How to select records by a condition in PostgreSQL, and from that set select the records that share the same value in some field?

Question

There is a table with millions of records.

A query like this:

select c.id, c.mobile_phone, c.last_name, c.email
from "user" c
group by c.id, c.mobile_phone, c.last_name, c.email
HAVING count(*) > 1;

takes minutes.

I need to select the records with a given mobile_phone and, from that result, select the records that have the same email.

For example, I tried this:

select * from (select * from "user" where mobile_phone = '00222334422222') ou
where (select count(*) from "user" inr
       where inr.email = ou.email) > 1;

and this:

SELECT
    *,
    ROW_NUMBER() OVER (
        PARTITION BY email, mobile_phone
        ORDER BY email, mobile_phone) AS Row_Number
FROM (select * from "user" where mobile_phone = '2225776676788') as "c*";

For example:

select * from employee where
        mobile_phone = '75777302722';

The fields shown in the figure should appear in the final result set.

Here is one version:

select * from (select * from employee where
                                          mobile_phone = '75777302722') ou
where (select count(*) from employee inr
       where inr.email = ou.email) > 1;

This code does not work on some versions of PostgreSQL.

PostgreSQL 14.7: it works.

PostgreSQL 13.7: it doesn't work.

I get the rows with the same phone number, but I can't select only those rows (from this dataset) where the email is the same. Rows where email is null should not be included.

For example, 5 users have the same phone number.
Of these 5 users, 2 have the same email; those 2 should be included in the final data set.

Any ideas on how to do this?

Answer 1

Score: 1

-- "user" is a reserved word in PostgreSQL, so the table name is quoted.
SELECT id, mobile_phone, last_name, email
FROM (
    SELECT *, row_number() OVER (PARTITION BY mobile_phone, email) AS rn
    FROM "user" ) user_rn
WHERE mobile_phone = '75777302722'
  AND rn = 2;

In the inner query, the row_number() function numbers the rows within each combination of mobile_phone and email; the outer query then selects the combinations with multiple hits (rows with rn = 3 and higher need not be included, because rn = 2 already supplies the combination).

This solution does not use the count(*) function, which can be slow on large tables. If performance is an issue, add an index on (mobile_phone, email). Any table with more than a few thousand records (or, more precisely, more than a few physical data pages) will benefit from appropriate indexes tuned to its typical queries.
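
Note that the rn = 2 filter returns one row per duplicated combination, and PARTITION BY puts all NULL emails into one group. If every matching row is wanted and NULL emails must be skipped, as the question asks, one possible variant is sketched below. It is not the query from this answer: it swaps in a windowed count(*) that runs only over the rows already filtered by phone number. The table employee_demo, its sample rows, and the index name are hypothetical and exist only to make the sketch self-contained.

```sql
-- Hypothetical demo table and data (assumptions, not from the question).
CREATE TEMP TABLE employee_demo (
    id           integer PRIMARY KEY,
    mobile_phone text,
    last_name    text,
    email        text
);

INSERT INTO employee_demo (id, mobile_phone, last_name, email) VALUES
    (1, '75777302722', 'Adams',  'a@example.com'),
    (2, '75777302722', 'Baker',  'a@example.com'),
    (3, '75777302722', 'Clark',  'c@example.com'),
    (4, '75777302722', 'Davis',  NULL),
    (5, '75777302722', 'Evans',  NULL),
    (6, '70000000000', 'Foster', 'a@example.com');

-- Index suggested in the answer; the name is illustrative.
CREATE INDEX employee_demo_phone_email_idx
    ON employee_demo (mobile_phone, email);

-- Keep every row whose email occurs more than once among the rows
-- that share the given phone number; NULL emails are filtered out
-- before counting, so they never form a "duplicate" group.
SELECT id, mobile_phone, last_name, email
FROM (
    SELECT e.*,
           count(*) OVER (PARTITION BY email) AS same_email
    FROM employee_demo e
    WHERE mobile_phone = '75777302722'
      AND email IS NOT NULL
) dup
WHERE same_email > 1
ORDER BY email, id;
```

With this sample data the query returns the rows with id 1 and 2: five users share the phone number, and only those two share a non-null email. Row 6 has the same email but a different phone number, so it is neither counted nor returned.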
