How to avoid duplicate rows in a join?
Question
I'm having a problem with duplicate rows that I don't want in my result.
Hi!
I have two tables, tab1 and tab2, and I want to join tab2 to tab1 like this:
SELECT t1.column_A1, t2.column_B2
FROM tab1 t1
JOIN
tab2 t2
ON t1.column_A1=t2.column_A2
tab1
| Column A1 | Column B1 | Column C1 |
| -------- | -------- | -------- |
| Z1 | Cell 2 | Cell 3 |
| Z2 | Cell 5 | Cell 6 |
tab2
| Column A2 | Column B2 | Column C2 |
| -------- | -------- | -------- |
| Z1 | PW | Cell 3 |
| Z1 | RW | Cell 6 |
For some rows in tab1 there is more than one row in tab2.
The result will be:
| Column A2 | Column B2 | Column C2 |
| -------- | -------- | -------- |
| Z1 | PW | RE |
| Z1 | RW | KS |
I want to get:
if there is a PW row, show only the one row with PW;
if there is no PW row, show only one row with RW.
The result should be:
| Column A2 | Column B2 | Column C2 |
| -------- | -------- | -------- |
| Z1 | PW | RE |
Answer 1
Score: 1
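One option is to rank the rows for each `column_a1` by the value stored in `column_b2` and return only the row that ranks first. This works here because 'PW' sorts before 'RW', so the PW row gets rank 1 whenever it exists.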
Sample data:
WITH
  tab1 (column_a1, column_b1, column_c1)
  AS
    (SELECT 'Z1', 'cell 2', 'cell 3' FROM DUAL
     UNION ALL
     SELECT 'Z2', 'cell 5', 'cell 6' FROM DUAL),
  tab2 (column_a2, column_b2, column_c2)
  AS
    (SELECT 'Z1', 'PW', 'cell 3' FROM DUAL
     UNION ALL
     SELECT 'Z1', 'RW', 'cell 6' FROM DUAL
     UNION ALL
     SELECT 'Z2', 'RW', 'cell 8' FROM DUAL),
  -- Query begins here:
  temp
  AS
    (SELECT t1.column_A1,
            t2.column_B2,
            ROW_NUMBER () OVER (PARTITION BY t1.column_a1 ORDER BY t2.column_b2) rn
       FROM tab1 t1 JOIN tab2 t2 ON t1.column_A1 = t2.column_A2)
SELECT column_a1, column_b2
  FROM temp
 WHERE rn = 1;
COLUMN_A1 COLUMN_B2
------------ ------------
Z1 PW
Z2 RW
Answer 2
Score: 0
This is a typical task on tables without a primary key, i.e. tables with duplicated keys where some rule determines which row to fetch for each key.
In your case the rule is:
> if PW - show only one row with PW; if not PW - show only one row with RW
You can implement it with the `row_number` function: `partition by` your (duplicated) key column and `order by` an expression that implements your rule (here using `decode`), so that the required row sorts first.
Example
select
COLUMN_A2, COLUMN_B2, COLUMN_C2,
row_number() over (partition by COLUMN_A2
order by decode (COLUMN_B2,'PW',1,'RW',2,3),COLUMN_B2) as rn
from tab2;
CO CO COLUMN RN
-- -- ------ ----------
Z1 PW cell 3 1
Z1 RW cell 6 2
The join is the same as the one you used, except that you join to this query (as an inline view) instead of `tab2` directly and add the `rn = 1` predicate to the `on` clause.
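A minimal sketch of that join, assuming the sample rows from the question are supplied through `WITH` clauses (the inline-view alias `t2` and the placement of `t2.rn = 1` in the `on` clause are one possible way to write it, not the only one):

```sql
WITH
  tab1 (column_a1, column_b1, column_c1) AS (
    SELECT 'Z1', 'cell 2', 'cell 3' FROM DUAL UNION ALL
    SELECT 'Z2', 'cell 5', 'cell 6' FROM DUAL
  ),
  tab2 (column_a2, column_b2, column_c2) AS (
    SELECT 'Z1', 'PW', 'cell 3' FROM DUAL UNION ALL
    SELECT 'Z1', 'RW', 'cell 6' FROM DUAL
  )
SELECT t1.column_a1, t2.column_b2
FROM tab1 t1
JOIN (
       -- Rank the tab2 rows per key: PW first, then RW, then anything else.
       SELECT column_a2,
              column_b2,
              ROW_NUMBER() OVER (
                PARTITION BY column_a2
                ORDER BY DECODE(column_b2, 'PW', 1, 'RW', 2, 3), column_b2
              ) AS rn
       FROM tab2
     ) t2
  ON t1.column_a1 = t2.column_a2
 AND t2.rn = 1;  -- keep only the preferred row for each key
```

With this data the query returns a single row, Z1 with PW. Z2 does not appear because the question's tab2 has no matching row and the join is an inner join.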
Note that I added `COLUMN_B2` as a second `order by` column; this covers the case where neither of your two strings is present, so that the lowest value is used.
You should always choose the `order by` column list so that, together with the `partition by` column(s), it forms a unique key. Then the query returns a deterministic result.
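To illustrate the point about determinism, here is a small sketch with a hypothetical extra row (`'Z1', 'PW', 'cell 9'`, which is not in the original data). Two rows now share the same `COLUMN_A2` and `COLUMN_B2`, so `COLUMN_C2` is added as a tie-breaker, on the assumption that the three columns together are unique:

```sql
WITH
  tab2 (column_a2, column_b2, column_c2) AS (
    SELECT 'Z1', 'PW', 'cell 3' FROM DUAL UNION ALL
    SELECT 'Z1', 'PW', 'cell 9' FROM DUAL UNION ALL  -- hypothetical duplicate PW row
    SELECT 'Z1', 'RW', 'cell 6' FROM DUAL
  )
SELECT column_a2,
       column_b2,
       column_c2,
       ROW_NUMBER() OVER (
         PARTITION BY column_a2
         -- Without column_c2 the two PW rows tie and either could get rn = 1.
         ORDER BY DECODE(column_b2, 'PW', 1, 'RW', 2, 3), column_b2, column_c2
       ) AS rn
FROM tab2;
```

Here the PW row with 'cell 3' always gets `rn = 1`, because the full ordering no longer contains ties.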