SQL查询以连接具有重复行的表并获取唯一行

huangapple go评论107阅读模式
英文:

SQL query to join tables with duplicate rows and getting unique rows

问题

以下是翻译好的部分:

SELECT 
    item.id, ich.id, ich.existing_stores, mid.missing_stores
FROM 
    `hd-merch-prod.merch_item_cache_validation.item` item 
JOIN
    `hd-merch-prod.merch_item_cache.item_change_history` ich 
         ON item.date = "2023-08-03" 
         AND DATE(ich.createdTime) = item.date 
         AND ich.id = item.id  
JOIN
    `hd-merch-prod.merch_item_cache.mic_item_discrepancy` mid 
         ON item.id = mid.id
GROUP BY
    item.id, ich.id, ich.existing_stores, mid.missing_stores;

希望这能帮助您。

英文:

I got the following three tables in BigQuery along with the data below.

I'd like to write a SQL query using id and current date columns and get the following row as the output.

Expected result set:

id            existing_stores                             missing_stores
1003812607    "3640,0130,0131,2306,3638,0127,2789,2305"   "3102,2681,2686,2670,2682,3101,2673,2669,3103,2668"

These are the tables:

item table:

id            date
------------------------
1003812607    2023-08-03
1003812607    2023-08-01
1003812607    2023-07-23
1003812607    2023-06-30

item_change_history:

createdTime	                    docType	   id	    existing_stores
---------------------------------------------------------------------------------------------
2023-08-03 11:01:10.139617 UTC	Item	1003812607	"3640,0130,0131,2306,3638,0127,2789,2305"
2023-07-01 09:01:10.139617 UTC	Item	1003812607	"3640,0130,0131,2306,3638,0127,2789,2301"

mic_item_discrepancy:

ID	        MISSING_STORE
-------------------------
1003812607	3102
1003812607	2681
1003812607	2686
1003812607	2670
1003812607	2682
1003812607	3101
1003812607	2673
1003812607	2669
1003812607	3103
1003812607	2668

I tried to come up with this query and it is not working as expected or giving me the wrong data given that duplicate id rows in
mic_item_discrepancy table.

SELECT 
    item.id, ich.id, ich.existing_stores, mid.missing_stores
FROM 
    `hd-merch-prod.merch_item_cache_validation.item` item 
JOIN
    `hd-merch-prod.merch_item_cache.item_change_history` ich 
         ON item.date = "2023-08-03" 
         AND DATE(ich.createdTime) = item.date 
         AND ich.id = item.id  
JOIN
    `hd-merch-prod.merch_item_cache.mic_item_discrepancy` mid 
         ON item.id = mid.id
GROUP BY
    item.id, ich.id, ich.existing_stores, mid.missing_stores;

答案1

得分: 2

尝试在item_change_historymic_item_discrepancy表上都使用LEFT JOIN,以确保结果中包括所有来自项目表的行,并在STRING_AGG函数中添加DISTINCT

SELECT 
    item.id, ich.id, ich.existing_stores, COALESCE(mid.missing_stores, '') AS missing_stores
FROM 
    `hd-merch-prod.merch_item_cache_validation.item` item 
LEFT JOIN
    (
      SELECT id, existing_stores, MAX(createdTime) as latest_createdTime
      FROM 
        `hd-merch-prod.merch_item_cache.item_change_history`
      WHERE 
        DATE(createdTime) = '2023-08-03'
      GROUP BY 
        id, existing_stores
    ) ich ON item.id = ich.id  
LEFT JOIN
    (
      SELECT ID, STRING_AGG(DISTINCT MISSING_STORE, ',') AS missing_stores 
      FROM 
        `hd-merch-prod.merch_item_cache.mic_item_discrepancy`
      GROUP BY 
        ID
    ) mid ON item.id = mid.ID;
英文:

Try to use LEFT JOIN for both the item_change_history and mic_item_discrepancy tables to ensure that all rows from the item table are included in the result, and add a DISTINCT in the STRING_AGG function:

SELECT 
    item.id, ich.id, ich.existing_stores, COALESCE(mid.missing_stores, '') AS missing_stores
FROM 
    `hd-merch-prod.merch_item_cache_validation.item` item 
LEFT JOIN
    (
      SELECT id, existing_stores, MAX(createdTime) as latest_createdTime
      FROM 
        `hd-merch-prod.merch_item_cache.item_change_history`
      WHERE 
        DATE(createdTime) = '2023-08-03'
      GROUP BY 
        id, existing_stores
    ) ich ON item.id = ich.id  
LEFT JOIN
    (
      SELECT ID, STRING_AGG(DISTINCT MISSING_STORE, ',') AS missing_stores 
      FROM 
        `hd-merch-prod.merch_item_cache.mic_item_discrepancy`
      GROUP BY 
        ID
    ) mid ON item.id = mid.ID;

huangapple
  • 本文由 发表于 2023年8月4日 06:22:33
  • 转载请务必保留本文链接:https://go.coder-hub.com/76831920.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定