英文:
LEFT JOIN generating the wrong SUM
问题
对于您提供的内容,以下是翻译好的部分:
Query 1:
查询1:
选择
a.Stock_Tracker
,SUM(a.Qty_Invoiced)作为Qty_Invoiced
,SUM(a.Qty_Received)作为Qty_Received
从 `strange-bird-366417.financial_data.stock_tracker_max` a
其中 (a.stock_tracker) ='WK06 GABONA'
按1组
Query 2:
查询2:
选择
b.Stock_Tracker
,SUM(b.Qty_Shipped)作为Qty_Shipped
从 `strange-bird-366417.financial_data.tbl_shipment_data_final` b
其中 (b.stock_tracker) ='WK06 GABONA'
按1组
Query 3:
查询3:
选择
a.Stock_Tracker
,b.Stock_Tracker
,SUM(a.Qty_Invoiced)作为Qty_Invoiced
,SUM(a.Qty_Received)作为Qty_Received
,SUM(b.Qty_Shipped)作为Qty_Shipped
从 `strange-bird-366417.financial_data.stock_tracker_max` a 左连接
(选择Stock_Tracker,Sum(Qty_Shipped)作为Qty_Shipped从 `strange-bird-366417.financial_data.tbl_shipment_data_final` 按1组) b
on a.Stock_Tracker = b.Stock_Tracker
其中 (a.stock_tracker) ='WK06 GABONA'
按1组
这是您的翻译结果,没有包含任何额外的内容。
英文:
I am trying to do a simple join of two different tables stock_tracker_max
and tbl_shipment_data_final
. However, one of the SUMs I get post join is completely wrong.
Query 1:
SELECT
a.Stock_Tracker
,SUM(a.Qty_Invoiced) as Qty_Invoiced
,SUM(a.Qty_Received) as Qty_Received
FROM `strange-bird-366417.financial_data.stock_tracker_max` a
where (a.stock_tracker) ='WK06 GABONA'
group by 1
Query 2:
SELECT
b.Stock_Tracker
,SUM(b.Qty_Shipped) as Qty_Shipped
FROM `strange-bird-366417.financial_data.tbl_shipment_data_final` b
where (b.stock_tracker) ='WK06 GABONA'
group by 1
Query 3:
SELECT
a.Stock_Tracker
,b.Stock_Tracker
,SUM(a.Qty_Invoiced) as Qty_Invoiced
,SUM(a.Qty_Received) as Qty_Received
,SUM(b.Qty_Shipped) as Qty_Shipped
FROM `strange-bird-366417.financial_data.stock_tracker_max` a LEFT JOIN (SELECT Stock_Tracker,Sum(Qty_Shipped) as Qty_Shipped from `strange-bird-366417.financial_data.tbl_shipment_data_final` group by 1) b
on a.Stock_Tracker = b.Stock_Tracker
where (a.stock_tracker) ='WK06 GABONA'
group by 1,2
I have simplified the task for the purposes of this exercise (this affects more than just one Stock_Tracker
but I just filtered to make it simpler to explain). The first two queries in the above are when I query each table separately - I get the right data in both cases.
However, in the third query (in the image above), when I try join the two tables, I get a completely incorrect value for column Qty_Shipped. The second query gives Qty_Shipped as 4338 but when I do the join this jumps up to 78084.
Both the first query and second query generate just one row (with the two having identical values for Stock_Tracker
).
This is the third query's result:
I am struggling to understand what's causing the issue here.
答案1
得分: 2
以下是您要翻译的部分:
"对于每个 stock_tracker,您正在将每个 stock_tracker_max 行与其 tbl_shipment_data_final 的 sum 进行连接。因此,对于您示例中的 stock_tracker,会获得 shipment sum 十八次。通过首先对 stock_tracker 对两个表进行汇总,然后再加入结果。
SELECT
m.stock_tracker,
m.sum_qty_invoiced,
m.sum_qty_received,
f.sum_qty_shipped
FROM
(
SELECT
stock_tracker,
SUM(qty_invoiced) AS sum_qty_invoiced,
SUM(qty_received) AS sum_qty_received
FROM strange-bird-366417.financial_data.stock_tracker_max
WHERE stock_tracker = 'WK06 GABONA'
GROUP BY stock_tracker
) m
LEFT JOIN
(
SELECT
stock_tracker,
SUM(qty_shipped) AS sum_qty_shipped
FROM strange-bird-366417.financial_data.tbl_shipment_data_final
WHERE stock_tracker = 'WK06 GABONA'
GROUP BY stock_tracker
) f ON f.stock_tracker = m.stock_tracker
ORDER BY m.stock_tracker;
详细解释您的查询
这是为什么您的查询不起作用的原因。假设您有以下数据:
stock_tracker | qty_invoiced | qty_received |
---|---|---|
WK06 GABONA | 10 | 10 |
WK06 GABONA | 20 | 20 |
和
stock_tracker | qty_shipped |
---|---|
WK06 GABONA | 5 |
WK06 GABONA | 7 |
现在,您首先对第二个表进行聚合:
stock_tracker | qty_shipped |
---|---|
WK06 GABONA | 12 |
然后,您将此结果与第一个表连接:
stock_tracker | qty_invoiced | qty_received | qty_shipped |
---|---|---|---|
WK06 GABONA | 10 | 10 | 12 |
WK06 GABONA | 20 | 20 | 12 |
现在,再次进行聚合并构建最终的求和:
stock_tracker | qty_invoiced | qty_received | qty_shipped |
---|---|---|---|
WK06 GABONA | 30 | 30 | 24 |
您已经将第二个表的总和乘以第一个表中匹配行的数量。
英文:
For each stock_tracker, you are joining every stock_tracker_max row with their tbl_shipment_data_final sum. So, with eighteen stock_tracker_max rows for the stock_tracker in your example, you'll get the shipment sum eighteen times. By aggregating again down to one row for the stock_tracker, the resulting sum is hence eighteen times as big as the original sum.
Instead aggregate both tables per stock_tracker first and then join the results.
SELECT
m.stock_tracker,
m.sum_qty_invoiced,
m.sum_qty_received,
f.sum_qty_shipped
FROM
(
SELECT
stock_tracker,
SUM(qty_invoiced) AS sum_qty_invoiced,
SUM(qty_received) AS sum_qty_received
FROM `strange-bird-366417.financial_data.stock_tracker_max`
WHERE stock_tracker = 'WK06 GABONA'
GROUP BY stock_tracker
) m
LEFT JOIN
(
SELECT
stock_tracker,
SUM(qty_shipped) AS sum_qty_shipped
FROM `strange-bird-366417.financial_data.tbl_shipment_data_final`
WHERE stock_tracker = 'WK06 GABONA'
GROUP BY stock_tracker
) f ON f.stock_tracker = m.stock_tracker
ORDER BY m.stock_tracker;
Explaining your query in detail
Here is why your query does not work. Let's say you have this data:
stock_tracker | qty_invoiced | qty_received |
---|---|---|
WK06 GABONA | 10 | 10 |
WK06 GABONA | 20 | 20 |
and
stock_tracker | qty_shipped |
---|---|
WK06 GABONA | 5 |
WK06 GABONA | 7 |
Now you aggregate your second table thus:
stock_tracker | qty_shipped |
---|---|
WK06 GABONA | 12 |
Then you join this result to the first table:
stock_tracker | qty_invoiced | qty_received | qty_shipped |
---|---|---|---|
WK06 GABONA | 10 | 10 | 12 |
WK06 GABONA | 20 | 20 | 12 |
Now you aggregate again and build the final sums:
stock_tracker | qty_invoiced | qty_received | qty_shipped |
---|---|---|---|
WK06 GABONA | 30 | 30 | 24 |
You have multiplied the second table's sum by the number of matching rows in the first table.
答案2
得分: 0
这是翻译好的部分:
"Query 3 中出现这种情况是因为 strange-bird-366417.financial_data.stock_tracker_max
中的每个字符串都与 strange-bird-366417.financial_data.tbl_shipment_data_final
中的 Qty_Shipped
总和相连接。
78084/4338=18,所以您有 18 倍的总和,但如果我理解正确,您需要 18 个值的总和。正确的查询是:
SELECT
a.Stock_Tracker
,b.Stock_Tracker
,SUM(a.Qty_Invoiced) as Qty_Invoiced
,SUM(a.Qty_Received) as Qty_Received
,SUM(b.Qty_Shipped) as Qty_Shipped
FROM `strange-bird-366417.financial_data.stock_tracker_max` a
LEFT JOIN `strange-bird-366417.financial_data.tbl_shipment_data_final` b
on a.Stock_Tracker = b.Stock_Tracker
where (a.stock_tracker) ='WK06 GABONA'
group by 1,2
```"
<details>
<summary>英文:</summary>
It happens because in Query 3 every string from `strange-bird-366417.financial_data.stock_tracker_max` joins with string with sum(Qty_Shipped) from `strange-bird-366417.financial_data.tbl_shipment_data_final`.
78084/4338=18, so you have 18 * sum, but if i understand correctly you need sum of 18 values. Correct query is
SELECT
a.Stock_Tracker
,b.Stock_Tracker
,SUM(a.Qty_Invoiced) as Qty_Invoiced
,SUM(a.Qty_Received) as Qty_Received
,SUM(b.Qty_Shipped) as Qty_Shipped
FROM strange-bird-366417.financial_data.stock_tracker_max
a
LEFT JOIN strange-bird-366417.financial_data.tbl_shipment_data_final
b
on a.Stock_Tracker = b.Stock_Tracker
where (a.stock_tracker) ='WK06 GABONA'
group by 1,2
</details>
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论