Selecting each first unique tuple of columns from a MySQL table

Question

I have a log table in MySQL (5.7.14) with the following schema:

CREATE TABLE logs
(
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  entry_date DATE NOT NULL,
  original_date DATE NOT NULL,
  ref_no VARCHAR(30) NOT NULL
) Engine=InnoDB;

INSERT INTO logs VALUES
(1,'2020-01-01','2020-01-01','XYZ'),
(2,'2020-01-01','2020-01-01','ABC'),
(3,'2020-01-02','2020-01-01','XYZ'),
(4,'2020-01-02','2020-01-01','ABC'),
(5,'2020-01-03','2020-01-02','XYZ'),
(6,'2020-01-03','2020-01-01','ABC');

I want to return the first row for each unique (original_date, ref_no) pairing, where 'first' is defined as 'lowest id'.

For example, if I had the following data:

id|entry_date|original_date|ref_no
--+----------+-------------+------
1 |2020-01-01|2020-01-01   |XYZ
2 |2020-01-01|2020-01-01   |ABC
3 |2020-01-02|2020-01-01   |XYZ
4 |2020-01-02|2020-01-01   |ABC
5 |2020-01-03|2020-01-02   |XYZ
6 |2020-01-03|2020-01-01   |ABC

I would want the query to return:

id|entry_date|original_date|ref_no
--+----------+-------------+------
1 |2020-01-01|2020-01-01   |XYZ
2 |2020-01-01|2020-01-01   |ABC
5 |2020-01-03|2020-01-02   |XYZ

In other words:

  • Row 1 is returned because we haven't seen 2020-01-01,XYZ before.
  • Row 2 is returned because we haven't seen 2020-01-01,ABC before.
  • Row 3 is not returned because we have seen 2020-01-01,XYZ before (row 1).
  • Row 4 is not returned because we have seen 2020-01-01,ABC before (row 2).
  • Row 5 is returned because we haven't seen 2020-01-02,XYZ before.
  • Row 6 is not returned because we have seen 2020-01-01,ABC before (row 2).

Is there a way to do this directly in SQL? I've considered DISTINCT but I think that only returns the distinct columns, whereas I want the full row.

Answer 1

Score: 1

You can use a correlated subquery:

select l.*
from logs l
where l.id = (select min(l2.id)
              from logs l2
              where l2.original_date = l.original_date and
                    l2.ref_no = l.ref_no
             );

For performance, you want an index on logs(original_date, ref_no, id).
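To try this out without a MySQL server, here is a minimal sketch that runs the same correlated subquery against an in-memory SQLite database using Python's built-in sqlite3 module. SQLite is only a stand-in for MySQL 5.7 here, the column types are simplified, the index name idx_logs_pair is made up for illustration, and the order by was added just to make the output order predictable.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; column types are simplified.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE logs (
  id INTEGER PRIMARY KEY,
  entry_date TEXT NOT NULL,
  original_date TEXT NOT NULL,
  ref_no TEXT NOT NULL
);
INSERT INTO logs VALUES
(1,'2020-01-01','2020-01-01','XYZ'),
(2,'2020-01-01','2020-01-01','ABC'),
(3,'2020-01-02','2020-01-01','XYZ'),
(4,'2020-01-02','2020-01-01','ABC'),
(5,'2020-01-03','2020-01-02','XYZ'),
(6,'2020-01-03','2020-01-01','ABC');
-- Composite index from the answer; the index name is hypothetical.
CREATE INDEX idx_logs_pair ON logs (original_date, ref_no, id);
""")

# Keep a row only if its id is the smallest id for its (original_date, ref_no) pair.
first_rows = conn.execute("""
select l.*
from logs l
where l.id = (select min(l2.id)
              from logs l2
              where l2.original_date = l.original_date and
                    l2.ref_no = l.ref_no)
order by l.id
""").fetchall()

for row in first_rows:
    print(row)
```

With the sample data from the question this should print rows 1, 2 and 5.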

Answer 2

Score: 1

To avoid a correlated subquery you can do:

select l.*
from logs l
join (
  select original_date, ref_no, min(id) as min_id
  from logs
  group by original_date, ref_no
) x on l.id = x.min_id
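As a rough check, the derived-table join can be run against the sample data and compared with the result the question asks for. This is a sketch using Python's sqlite3 module with an in-memory SQLite database as a stand-in for MySQL; the simplified column types, the expected list and the order by are assumptions added for the test, not part of the answer.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; column types are simplified.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE logs (
  id INTEGER PRIMARY KEY,
  entry_date TEXT NOT NULL,
  original_date TEXT NOT NULL,
  ref_no TEXT NOT NULL
);
INSERT INTO logs VALUES
(1,'2020-01-01','2020-01-01','XYZ'),
(2,'2020-01-01','2020-01-01','ABC'),
(3,'2020-01-02','2020-01-01','XYZ'),
(4,'2020-01-02','2020-01-01','ABC'),
(5,'2020-01-03','2020-01-02','XYZ'),
(6,'2020-01-03','2020-01-01','ABC');
""")

# The subquery finds the lowest id per pair; the join fetches the full rows.
joined_rows = conn.execute("""
select l.*
from logs l
join (
  select original_date, ref_no, min(id) as min_id
  from logs
  group by original_date, ref_no
) x on l.id = x.min_id
order by l.id
""").fetchall()

# Rows the question says should come back.
expected = [
    (1, '2020-01-01', '2020-01-01', 'XYZ'),
    (2, '2020-01-01', '2020-01-01', 'ABC'),
    (5, '2020-01-03', '2020-01-02', 'XYZ'),
]
print(joined_rows == expected)
```

The grouping runs once over the table instead of once per outer row, which is the point of avoiding the correlated subquery.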

Answer 3

Score: 0

Try this:

select t1.*
from logs AS t1
left join logs AS t2 on 
(
  t2.original_date = t1.original_date and
  t2.ref_no = t1.ref_no and
  t2.id < t1.id
)
where
    t2.original_date is null and
    t2.ref_no is null
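The anti-join works because a row with no earlier match gets NULLs for every t2 column. A small sketch of that, run through Python's sqlite3 module with in-memory SQLite standing in for MySQL (types simplified, order by added for predictable output). The second query, which tests only t2.id, is my own variation rather than part of the answer; it should be equivalent because id is the primary key and can never be NULL in a matched row.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; column types are simplified.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE logs (
  id INTEGER PRIMARY KEY,
  entry_date TEXT NOT NULL,
  original_date TEXT NOT NULL,
  ref_no TEXT NOT NULL
);
INSERT INTO logs VALUES
(1,'2020-01-01','2020-01-01','XYZ'),
(2,'2020-01-01','2020-01-01','ABC'),
(3,'2020-01-02','2020-01-01','XYZ'),
(4,'2020-01-02','2020-01-01','ABC'),
(5,'2020-01-03','2020-01-02','XYZ'),
(6,'2020-01-03','2020-01-01','ABC');
""")

# Shared join: look for an earlier row (lower id) with the same pair.
anti_join = """
select t1.*
from logs AS t1
left join logs AS t2 on
(
  t2.original_date = t1.original_date and
  t2.ref_no = t1.ref_no and
  t2.id < t1.id
)
where {condition}
order by t1.id
"""

# Condition as written in the answer.
as_posted = conn.execute(anti_join.format(
    condition="t2.original_date is null and t2.ref_no is null")).fetchall()

# Variation (assumption): checking the primary key alone.
by_id_only = conn.execute(anti_join.format(
    condition="t2.id is null")).fetchall()

print([row[0] for row in as_posted])
print(as_posted == by_id_only)
```

Both conditions keep exactly the rows that have no earlier row for the same (original_date, ref_no) pair.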

huangapple
  • Posted on January 6, 2020 at 22:57:35
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59614274.html