LEFT JOIN生成了错误的SUM

huangapple go评论52阅读模式
英文:

LEFT JOIN generating the wrong SUM

问题

对于您提供的内容,以下是翻译好的部分:

Query 1:

查询1:
选择
a.Stock_Tracker
,SUM(a.Qty_Invoiced)作为Qty_Invoiced
,SUM(a.Qty_Received)作为Qty_Received

从 `strange-bird-366417.financial_data.stock_tracker_max` a

其中 (a.stock_tracker) ='WK06 GABONA'
按1组

Query 2:

查询2:
选择
b.Stock_Tracker
,SUM(b.Qty_Shipped)作为Qty_Shipped

从 `strange-bird-366417.financial_data.tbl_shipment_data_final`  b 

其中 (b.stock_tracker) ='WK06 GABONA'
按1组

Query 3:

查询3:
选择
a.Stock_Tracker
,b.Stock_Tracker
,SUM(a.Qty_Invoiced)作为Qty_Invoiced
,SUM(a.Qty_Received)作为Qty_Received
,SUM(b.Qty_Shipped)作为Qty_Shipped

从 `strange-bird-366417.financial_data.stock_tracker_max` a 左连接
(选择Stock_Tracker,Sum(Qty_Shipped)作为Qty_Shipped从 `strange-bird-366417.financial_data.tbl_shipment_data_final` 按1组) b 
on a.Stock_Tracker = b.Stock_Tracker 

其中 (a.stock_tracker) ='WK06 GABONA'
按1组

这是您的翻译结果,没有包含任何额外的内容。

英文:

I am trying to do a simple join of two different tables stock_tracker_max and tbl_shipment_data_final. However, one of the SUMs I get post join is completely wrong.

Query 1:

SELECT 

a.Stock_Tracker
,SUM(a.Qty_Invoiced) as Qty_Invoiced
,SUM(a.Qty_Received) as Qty_Received


FROM `strange-bird-366417.financial_data.stock_tracker_max` a

where (a.stock_tracker) ='WK06 GABONA'
group by 1

Query 2:

SELECT 

b.Stock_Tracker
,SUM(b.Qty_Shipped) as Qty_Shipped


FROM `strange-bird-366417.financial_data.tbl_shipment_data_final`  b 

where (b.stock_tracker) ='WK06 GABONA'
group by 1

Query 3:

SELECT 

a.Stock_Tracker
,b.Stock_Tracker
,SUM(a.Qty_Invoiced) as Qty_Invoiced
,SUM(a.Qty_Received) as Qty_Received
,SUM(b.Qty_Shipped) as Qty_Shipped


FROM `strange-bird-366417.financial_data.stock_tracker_max` a LEFT JOIN (SELECT Stock_Tracker,Sum(Qty_Shipped) as Qty_Shipped from `strange-bird-366417.financial_data.tbl_shipment_data_final` group by 1) b 
on a.Stock_Tracker = b.Stock_Tracker 

where (a.stock_tracker) ='WK06 GABONA'
group by 1,2

I have simplified the task for the purposes of this exercise (this affects more than just one Stock_Tracker but I just filtered to make it simpler to explain). The first two queries in the above are when I query each table separately - I get the right data in both cases.

However, in the third query (in the image above), when I try join the two tables, I get a completely incorrect value for column Qty_Shipped. The second query gives Qty_Shipped as 4338 but when I do the join this jumps up to 78084.

Both the first query and second query generate just one row (with the two having identical values for Stock_Tracker).

This is the third query's result:

LEFT JOIN生成了错误的SUM

I am struggling to understand what's causing the issue here.

答案1

得分: 2

以下是您要翻译的部分:

"对于每个 stock_tracker,您正在将每个 stock_tracker_max 行与其 tbl_shipment_data_final 的 sum 进行连接。因此,对于您示例中的 stock_tracker,会获得 shipment sum 十八次。通过首先对 stock_tracker 对两个表进行汇总,然后再加入结果。

SELECT
m.stock_tracker,
m.sum_qty_invoiced,
m.sum_qty_received,
f.sum_qty_shipped
FROM
(
SELECT
stock_tracker,
SUM(qty_invoiced) AS sum_qty_invoiced,
SUM(qty_received) AS sum_qty_received
FROM strange-bird-366417.financial_data.stock_tracker_max
WHERE stock_tracker = 'WK06 GABONA'
GROUP BY stock_tracker
) m
LEFT JOIN
(
SELECT
stock_tracker,
SUM(qty_shipped) AS sum_qty_shipped
FROM strange-bird-366417.financial_data.tbl_shipment_data_final
WHERE stock_tracker = 'WK06 GABONA'
GROUP BY stock_tracker
) f ON f.stock_tracker = m.stock_tracker
ORDER BY m.stock_tracker;

详细解释您的查询

这是为什么您的查询不起作用的原因。假设您有以下数据:

stock_tracker qty_invoiced qty_received
WK06 GABONA 10 10
WK06 GABONA 20 20

stock_tracker qty_shipped
WK06 GABONA 5
WK06 GABONA 7

现在,您首先对第二个表进行聚合:

stock_tracker qty_shipped
WK06 GABONA 12

然后,您将此结果与第一个表连接:

stock_tracker qty_invoiced qty_received qty_shipped
WK06 GABONA 10 10 12
WK06 GABONA 20 20 12

现在,再次进行聚合并构建最终的求和:

stock_tracker qty_invoiced qty_received qty_shipped
WK06 GABONA 30 30 24

您已经将第二个表的总和乘以第一个表中匹配行的数量。

英文:

For each stock_tracker, you are joining every stock_tracker_max row with their tbl_shipment_data_final sum. So, with eighteen stock_tracker_max rows for the stock_tracker in your example, you'll get the shipment sum eighteen times. By aggregating again down to one row for the stock_tracker, the resulting sum is hence eighteen times as big as the original sum.

Instead aggregate both tables per stock_tracker first and then join the results.

SELECT
  m.stock_tracker,
  m.sum_qty_invoiced,
  m.sum_qty_received,
  f.sum_qty_shipped
FROM
(
  SELECT 
    stock_tracker,
    SUM(qty_invoiced) AS sum_qty_invoiced,
    SUM(qty_received) AS sum_qty_received
  FROM `strange-bird-366417.financial_data.stock_tracker_max`
  WHERE stock_tracker = 'WK06 GABONA'
  GROUP BY stock_tracker
) m
LEFT JOIN
(
  SELECT 
    stock_tracker,
    SUM(qty_shipped) AS sum_qty_shipped
  FROM `strange-bird-366417.financial_data.tbl_shipment_data_final`
  WHERE stock_tracker = 'WK06 GABONA'
  GROUP BY stock_tracker
) f ON f.stock_tracker = m.stock_tracker
ORDER BY m.stock_tracker;

Explaining your query in detail

Here is why your query does not work. Let's say you have this data:

stock_tracker qty_invoiced qty_received
WK06 GABONA 10 10
WK06 GABONA 20 20

and

stock_tracker qty_shipped
WK06 GABONA 5
WK06 GABONA 7

Now you aggregate your second table thus:

stock_tracker qty_shipped
WK06 GABONA 12

Then you join this result to the first table:

stock_tracker qty_invoiced qty_received qty_shipped
WK06 GABONA 10 10 12
WK06 GABONA 20 20 12

Now you aggregate again and build the final sums:

stock_tracker qty_invoiced qty_received qty_shipped
WK06 GABONA 30 30 24

You have multiplied the second table's sum by the number of matching rows in the first table.

答案2

得分: 0

这是翻译好的部分:

"Query 3 中出现这种情况是因为 strange-bird-366417.financial_data.stock_tracker_max 中的每个字符串都与 strange-bird-366417.financial_data.tbl_shipment_data_final 中的 Qty_Shipped 总和相连接。
78084/4338=18,所以您有 18 倍的总和,但如果我理解正确,您需要 18 个值的总和。正确的查询是:

SELECT 
a.Stock_Tracker
,b.Stock_Tracker
,SUM(a.Qty_Invoiced) as Qty_Invoiced
,SUM(a.Qty_Received) as Qty_Received
,SUM(b.Qty_Shipped) as Qty_Shipped
FROM `strange-bird-366417.financial_data.stock_tracker_max` a 
LEFT JOIN `strange-bird-366417.financial_data.tbl_shipment_data_final` b
on a.Stock_Tracker = b.Stock_Tracker 
where (a.stock_tracker) ='WK06 GABONA'
group by 1,2
```"

<details>
<summary>英文:</summary>

It happens because in Query 3 every string from `strange-bird-366417.financial_data.stock_tracker_max` joins with string with sum(Qty_Shipped) from `strange-bird-366417.financial_data.tbl_shipment_data_final`.
78084/4338=18, so you have 18 * sum, but if i understand correctly you need sum of 18 values. Correct query is

SELECT
a.Stock_Tracker
,b.Stock_Tracker
,SUM(a.Qty_Invoiced) as Qty_Invoiced
,SUM(a.Qty_Received) as Qty_Received
,SUM(b.Qty_Shipped) as Qty_Shipped
FROM strange-bird-366417.financial_data.stock_tracker_max a
LEFT JOIN strange-bird-366417.financial_data.tbl_shipment_data_final b
on a.Stock_Tracker = b.Stock_Tracker
where (a.stock_tracker) ='WK06 GABONA'
group by 1,2


</details>



huangapple
  • 本文由 发表于 2023年3月4日 03:58:04
  • 转载请务必保留本文链接:https://go.coder-hub.com/75631388.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定