Join two SQL tables using multiple possible combinations
Question
I need a way to join two tables that can handle the case where the default join method fails.
Let's say I have two tables. Customers has the fields you'd expect:
CustomerId | First Name | Last Name | DOB | Address |
---|---|---|---|---|
1 | John | Smith | 01/01/2000 | 1 Main Street |
And Orders:
OrderId | CustomerId | Customer FName | Customer LName | Delivery Address | Other Order Info |
---|---|---|---|---|---|
5054 | 1 | John | Smith | 1 Main Street | blah |
The obvious way to join these two tables is on Customer Id.
However, how do I structure a query so that when there is no matching Customer Id, it falls back to a second join method (on address), and if that also fails, to a third one (First Name + Last Name)? I know that's a bad join, but roll with it for this example.
Thanks
Answer 1
Score: 1
I would probably do something like this:

    WITH
    -- 1st choice: match on CustomerId
    first_q AS (
        SELECT
            c.CustomerId AS customer_id
            ,o.OrderId AS order_id
        FROM
            Customers AS c
            JOIN Orders AS o
                ON c.CustomerId = o.CustomerId
    )
    -- 2nd choice: match on address, only for orders not matched above
    ,second_q AS (
        SELECT
            c.CustomerId AS customer_id
            ,o.OrderId AS order_id
        FROM
            Customers AS c
            JOIN Orders AS o
                ON c.[Address] = o.[Delivery Address]
        WHERE
            o.OrderId NOT IN (SELECT order_id FROM first_q)
    )
    -- 3rd choice: match on full name, only for orders still unmatched
    ,third_q AS (
        SELECT
            c.CustomerId AS customer_id
            ,o.OrderId AS order_id
        FROM
            Customers AS c
            JOIN Orders AS o
                ON concat(c.[First Name], ' ', c.[Last Name]) = concat(o.[Customer FName], ' ', o.[Customer LName])
        WHERE
            o.OrderId NOT IN (SELECT order_id FROM first_q)
            AND o.OrderId NOT IN (SELECT order_id FROM second_q)
    )
    -- Combine the three tiers into one result
    SELECT
        customer_id
        ,order_id
    FROM
        first_q
    UNION
    SELECT
        customer_id
        ,order_id
    FROM
        second_q
    UNION
    SELECT
        customer_id
        ,order_id
    FROM
        third_q
    ;

Since the latter two joins might produce multiple matches for the same order, I would add a ranking function to keep only the first result per OrderId.
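One way that ranking step could look, as a rough sketch rather than the exact query above: tag every candidate match with a priority (1 = CustomerId, 2 = address, 3 = name) and let ROW_NUMBER() keep the best candidate per order. The sample rows, the `priority` and `rn` columns, and the tie-break on the lowest CustomerId are all my own assumptions. The Python sqlite3 wrapper is only there so the SQL can be run end to end (window functions need SQLite 3.25 or later), and the name match compares the two columns directly instead of using concat().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (
        CustomerId INTEGER PRIMARY KEY,
        [First Name] TEXT, [Last Name] TEXT, [Address] TEXT
    );
    CREATE TABLE Orders (
        OrderId INTEGER PRIMARY KEY, CustomerId INTEGER,
        [Customer FName] TEXT, [Customer LName] TEXT, [Delivery Address] TEXT
    );
""")
# Hypothetical sample data, chosen so each fallback tier is exercised.
conn.executemany("INSERT INTO Customers VALUES (?, ?, ?, ?)", [
    (1, "John", "Smith", "1 Main Street"),
    (2, "Jane", "Smith", "1 Main Street"),  # shares an address with customer 1
    (3, "Ann", "Lee", "9 Oak Road"),
])
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?, ?, ?)", [
    (5054, 1, "John", "Smith", "1 Main Street"),     # matches on CustomerId
    (5055, None, "Jane", "Smith", "1 Main Street"),  # no id; address matches two customers
    (5056, 99, "Ann", "Lee", "4 Elm Avenue"),        # unknown id and address; name matches
])

query = """
WITH candidates AS (
    -- Every possible match, tagged with the priority of the rule that found it
    SELECT o.OrderId AS order_id, c.CustomerId AS customer_id, 1 AS priority
    FROM Orders AS o
    JOIN Customers AS c ON c.CustomerId = o.CustomerId
    UNION ALL
    SELECT o.OrderId, c.CustomerId, 2
    FROM Orders AS o
    JOIN Customers AS c ON c.[Address] = o.[Delivery Address]
    UNION ALL
    SELECT o.OrderId, c.CustomerId, 3
    FROM Orders AS o
    JOIN Customers AS c
      ON c.[First Name] = o.[Customer FName]
     AND c.[Last Name] = o.[Customer LName]
),
ranked AS (
    -- Best rule first; ties inside a rule go to the lowest CustomerId (assumption)
    SELECT order_id, customer_id, priority,
           ROW_NUMBER() OVER (
               PARTITION BY order_id
               ORDER BY priority, customer_id
           ) AS rn
    FROM candidates
)
SELECT order_id, customer_id, priority
FROM ranked
WHERE rn = 1
ORDER BY order_id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# (5054, 1, 1)
# (5055, 1, 2)
# (5056, 3, 3)
```

Order 5055 shows why the tie-break matters: its address matches two customers, so the ranking has to pick one, and here the lower CustomerId wins even though the name points at customer 2. The two CTEs should carry over to SQL Server largely as written, but I have only run them against SQLite.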
Answer 2
Score: 1
If you would rather not use CTEs, you can handle the fallback with chained left joins and use COALESCE to prioritize the join conditions. I would try:

    SELECT
        o.OrderId,
        o.CustomerId,
        -- take each field from the first join that produced a match
        COALESCE(c.[First Name], c2.[First Name], c3.[First Name]) AS FirstName,
        COALESCE(c.[Last Name], c2.[Last Name], c3.[Last Name]) AS LastName,
        COALESCE(c.DOB, c2.DOB, c3.DOB) AS DOB,
        COALESCE(c.Address, c2.Address, c3.Address) AS Address,
        o.[Delivery Address],
        o.[Other Order Info]
    FROM
        Orders AS o
    LEFT JOIN
        Customers AS c ON o.CustomerId = c.CustomerId
    LEFT JOIN
        Customers AS c2 ON o.[Delivery Address] = c2.Address
        AND c.CustomerId IS NULL
    LEFT JOIN
        Customers AS c3 ON o.[Customer FName] = c3.[First Name]
        AND o.[Customer LName] = c3.[Last Name]
        AND c.CustomerId IS NULL
        AND c2.Address IS NULL;

I left join the Customers table several times, each time with a different join condition, and use COALESCE to pick the first non-null value for each field.
The first left join uses the default condition o.CustomerId = c.CustomerId.
If the first join finds nothing (c.CustomerId IS NULL), the second left join tries to match on address: o.[Delivery Address] = c2.Address.
If both the first and second joins find nothing (c.CustomerId IS NULL and c2.Address IS NULL), the third left join falls back to first and last name: o.[Customer FName] = c3.[First Name] and o.[Customer LName] = c3.[Last Name].
This way the query tries the join conditions in order and returns the first non-null values from whichever row matched. Note that o.CustomerId is the id stored on the order, which may not exist in Customers when a fallback was used.
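To see the chained left joins pick a different tier per order, here is a small runnable sketch using Python's built-in sqlite3 module. The sample rows and the `MatchedCustomerId` and `MatchedOn` columns are my own additions for illustration and are not part of the answer's query. The "no earlier match" checks test `CustomerId IS NULL` on the earlier aliases, which is one reasonable way to write them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (
        CustomerId INTEGER PRIMARY KEY,
        [First Name] TEXT, [Last Name] TEXT, [Address] TEXT
    );
    CREATE TABLE Orders (
        OrderId INTEGER PRIMARY KEY, CustomerId INTEGER,
        [Customer FName] TEXT, [Customer LName] TEXT, [Delivery Address] TEXT
    );
""")
# Hypothetical sample data: one order per fallback tier, plus one with no match.
conn.executemany("INSERT INTO Customers VALUES (?, ?, ?, ?)", [
    (1, "John", "Smith", "1 Main Street"),
    (2, "Jane", "Doe", "2 High Street"),
    (3, "Ann", "Lee", "9 Oak Road"),
])
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?, ?, ?)", [
    (5054, 1, "John", "Smith", "1 Main Street"),  # matches on CustomerId
    (5055, None, "J.", "Doe", "2 High Street"),   # matches on address only
    (5056, 99, "Ann", "Lee", "4 Elm Avenue"),     # matches on name only
    (5057, None, "Bob", "Ray", "7 Pine Lane"),    # matches nothing
])

query = """
SELECT
    o.OrderId,
    COALESCE(c.CustomerId, c2.CustomerId, c3.CustomerId) AS MatchedCustomerId,
    CASE
        WHEN c.CustomerId  IS NOT NULL THEN 'id'
        WHEN c2.CustomerId IS NOT NULL THEN 'address'
        WHEN c3.CustomerId IS NOT NULL THEN 'name'
        ELSE 'none'
    END AS MatchedOn
FROM Orders AS o
LEFT JOIN Customers AS c
       ON o.CustomerId = c.CustomerId
LEFT JOIN Customers AS c2
       ON o.[Delivery Address] = c2.[Address]
      AND c.CustomerId IS NULL       -- only tried when the id join found nothing
LEFT JOIN Customers AS c3
       ON o.[Customer FName] = c3.[First Name]
      AND o.[Customer LName] = c3.[Last Name]
      AND c.CustomerId IS NULL       -- only tried when both earlier joins
      AND c2.CustomerId IS NULL      -- found nothing
ORDER BY o.OrderId
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# (5054, 1, 'id')
# (5055, 2, 'address')
# (5056, 3, 'name')
# (5057, None, 'none')
```

Because these are left joins, an order with no match on any rule still appears, with NULL for the customer. If an address or a name matched several customers, that order would come back once per matching customer, so some de-duplication would still be needed.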