2023-06-22 04:27:34

How to include columns from two tables based on a common column value but include rows only from the left table?

Question

I have two tables, tb1 and tb2. tb2 is connected to tb1 via a foreign key named tk_id. Here is what my two tables look like:

tb1

     Column      |            Type             | Collation | Nullable |                     Default                      
-----------------+-----------------------------+-----------+----------+--------------------------------------------------
 id              | integer                     |           | not null | nextval('tb2_id_seq'::regclass)
 created_at      | timestamp without time zone |           | not null | 
 modified_at     | timestamp without time zone |           | not null | 
 status          | double precision            |           | not null | 
 tk_id           | uuid                        |           | not null |

tb2

     Column     |            Type             | Collation | Nullable | Default 
----------------+-----------------------------+-----------+----------+---------
 id             | uuid                        |           | not null | 
 created_at     | timestamp without time zone |           | not null | 
 modified_at    | timestamp without time zone |           | not null | 
 destination_id | uuid                        |           | not null | 
 source_id      | uuid                        |           | not null | 
 tk_id          | uuid                        |           | not null |

Now I need to get all rows from tb1, with columns from both tb1 and tb2, where the tk_id values of the two rows match.

This is what I tried:

select tb1.created_at, tb1.status, tb2.source_id, tb2.destination_id from tb1 
inner join tb2 on tb1.tk_id = tb2.tk_id where 
tb1.created_at > timezone('utc', now()) - interval '40 minutes';

But I am getting far too many rows. A 40-minute interval usually contains around 800 records, but after the join I get 100,000+ records.

EDIT: After some reading and a few tries, I changed my query and managed to get the row count down to the expected number. This is my query now:

SELECT count(*) FROM tb1 LEFT OUTER JOIN (SELECT DISTINCT tk_id FROM tb2) t2 
ON tb1.tk_id = t2.tk_id where tb1.created_at > timezone('utc', now()) - 
interval '40 minutes';

But now I can't get the columns of tb2 in my select query.

What am I doing wrong?

EDIT2: Sorry if I didn't make this clear earlier. The join should be based on the latest occurrence of tk_id in the right table. So every row of the left table (tb1) should be matched against the latest row with the same tk_id in the right table (tb2), and the right table's columns should be fetched from that row.

Answer 1

Score: 1

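Since there are obviously multiple rows in tb2 matching a single row in tb1, you need to define which one to pick. Or do you need some kind of aggregation?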
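
One way to confirm that fan-out is to count how many tb2 rows share each tk_id. The query below is only a diagnostic sketch: it assumes the extra rows come from repeated tk_id values in tb2, and the output alias matches is made up for illustration.

```sql
-- List the tk_id values that occur more than once in tb2.
-- Each such value multiplies the matching tb1 rows in a plain join.
SELECT tk_id, count(*) AS matches   -- "matches" is an illustrative alias
FROM   tb2
GROUP  BY tk_id
HAVING count(*) > 1
ORDER  BY matches DESC
LIMIT  10;
```

If this returns rows, a plain inner join emits one output row per matching tb2 row, which would explain a jump from roughly 800 to 100,000+ rows.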

The following query returns all qualifying rows from tb1 and adds columns from the one matching row in tb2 that was created last (if any):

SELECT tb1.created_at, tb1.status, tb2.*
FROM   tb1
LEFT   JOIN LATERAL (
   SELECT tb2.source_id, tb2.destination_id
   FROM   tb2
   WHERE  tb2.tk_id = tb1.tk_id
   ORDER  BY created_at DESC, id DESC
   LIMIT  1
   ) tb2 ON true
WHERE  tb1.created_at > timezone('utc', now()) - interval '40 minutes';

I added id DESC as a tiebreaker because created_at may not be unique. Adapt to your undisclosed requirements.
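
As a possible alternative, the same "latest row per tk_id" selection can be sketched with PostgreSQL's DISTINCT ON in a derived table. This is an assumption-laden variant rather than part of the answer above: the alias t2 is chosen for illustration, and "latest" is again taken to mean the greatest created_at, with id as tiebreaker.

```sql
SELECT tb1.created_at, tb1.status, t2.source_id, t2.destination_id
FROM   tb1
LEFT   JOIN (
   -- Keep exactly one row per tk_id: the first one in the sort order below,
   -- i.e. the most recently created (ties broken by id).
   SELECT DISTINCT ON (tk_id)
          tk_id, source_id, destination_id
   FROM   tb2
   ORDER  BY tk_id, created_at DESC, id DESC
   ) t2 ON t2.tk_id = tb1.tk_id
WHERE  tb1.created_at > timezone('utc', now()) - interval '40 minutes';
```

Unlike the LATERAL version, the derived table is defined over all of tb2 rather than per tb1 row, so it may do more work when only a few tb1 rows fall inside the time window. Comparing both with EXPLAIN ANALYZE on real data is the safer way to choose.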
