2020-01-06 22:57:35

Selecting each first unique tuple of columns from a MySQL table

Question

I have a log table in MySQL (5.7.14) with the following schema:

CREATE TABLE logs
(
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  entry_date DATE NOT NULL,
  original_date DATE NOT NULL,
  ref_no VARCHAR(30) NOT NULL
) Engine=InnoDB;

INSERT INTO logs VALUES
(1,'2020-01-01','2020-01-01','XYZ'),
(2,'2020-01-01','2020-01-01','ABC'),
(3,'2020-01-02','2020-01-01','XYZ'),
(4,'2020-01-02','2020-01-01','ABC'),
(5,'2020-01-03','2020-01-02','XYZ'),
(6,'2020-01-03','2020-01-01','ABC');

I want to return the first row for each unique (original_date, ref_no) pairing, where 'first' is defined as 'lowest id'.

For example, if I had the following data:

id|entry_date|original_date|ref_no
--+----------+-------------+------
1 |2020-01-01|2020-01-01   |XYZ
2 |2020-01-01|2020-01-01   |ABC
3 |2020-01-02|2020-01-01   |XYZ
4 |2020-01-02|2020-01-01   |ABC
5 |2020-01-03|2020-01-02   |XYZ
6 |2020-01-03|2020-01-01   |ABC

I would want the query to return:

id|entry_date|original_date|ref_no
--+----------+-------------+------
1 |2020-01-01|2020-01-01   |XYZ
2 |2020-01-01|2020-01-01   |ABC
5 |2020-01-03|2020-01-02   |XYZ

In other words:

Row 1 is returned because we haven't seen 2020-01-01,XYZ before.
Row 2 is returned because we haven't seen 2020-01-01,ABC before.
Row 3 is not returned because we have seen 2020-01-01,XYZ before (row 1).
Row 4 is not returned because we have seen 2020-01-01,ABC before (row 2).
Row 5 is returned because we haven't seen 2020-01-02,XYZ before.
Row 6 is not returned because we have seen 2020-01-01,ABC before (row 2).

Is there a way to do this directly in SQL? I've considered DISTINCT but I think that only returns the distinct columns, whereas I want the full row.

Answer 1

Score: 1

You can use a correlated subquery:

select l.*
from logs l
where l.id = (select min(l2.id)
              from logs l2
              where l2.original_date = l.original_date and
                    l2.ref_no = l.ref_no
             );

For performance, you want an index on logs(original_date, ref_no, id).
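
As a rough sketch, the suggested index could be created as shown here. The index name `idx_logs_orig_ref_id` is made up for illustration. Since InnoDB secondary indexes implicitly carry the primary key, an index on just (original_date, ref_no) would likely serve the subquery in much the same way; listing id explicitly mainly makes the intent visible.

```sql
-- Hypothetical index name; the column order follows the suggestion above.
CREATE INDEX idx_logs_orig_ref_id ON logs (original_date, ref_no, id);

-- Optional: inspect the plan to see whether the subquery uses the index.
EXPLAIN
SELECT l.*
FROM logs l
WHERE l.id = (SELECT MIN(l2.id)
              FROM logs l2
              WHERE l2.original_date = l.original_date
                AND l2.ref_no = l.ref_no);
```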

Answer 2

Score: 1

To avoid a correlated subquery you can do:

select l.*
from logs l
join (
  select original_date, ref_no, min(id) as min_id
  from logs
  group by original_date, ref_no
) x on l.id = x.min_id

Answer 3

Score: 0

Try this:

select t1.*
from logs AS t1
left join logs AS t2 on 
(
  t2.original_date = t1.original_date and
  t2.ref_no = t1.ref_no and
  t2.id < t1.id
)
where
    t2.original_date is null and
    t2.ref_no is null
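
A possible simplification of this anti-join, assuming id is never NULL in stored rows (true here, since it is the primary key): testing a single column of t2 is enough to detect that no earlier row matched. The ORDER BY is an addition for readable output, not part of the original answer.

```sql
-- Anti-join: keep t1 only when no row with the same
-- (original_date, ref_no) pair and a lower id exists.
SELECT t1.*
FROM logs AS t1
LEFT JOIN logs AS t2
  ON t2.original_date = t1.original_date
 AND t2.ref_no = t1.ref_no
 AND t2.id < t1.id
WHERE t2.id IS NULL   -- no earlier match was found for t1
ORDER BY t1.id;
```

With the sample data from the question, this should return the rows with id 1, 2 and 5.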
