在每个分组中使用Presto或MySQL添加缺失值。

huangapple go评论127阅读模式
英文:

Add missing values in each group using presto or mysql

问题

我有以下的表格:

prod_name - quantity - group
tv        - 30      - base 
microwave - 10      - base 
watch     - 5       - base  
phone     - 25      - base 
washer    - 15      - base 
dryer.    - 14      - base  
microwave - 7       - inventory_2021 
phone     - 16      - inventory_2021  
tv        - 30      - inventory_2022 
watch     - 5       - inventory_2022 
phone     - 25      - inventory_2022 

我想要将缺失的记录从base(group)填充/输入到所有其他组中。

期望的输出是:

tv        - 30      - base 
microwave - 10      - base 
watch     - 5       - base 
phone     - 25      - base 
washer    - 15      - base 
dryer.    - 14      - base 
tv        - 30      - inventory_2021 
microwave - 7       - inventory_2021 
watch     - 5       - inventory_2021 
phone     - 16      - inventory_2021 
washer    - 15      - inventory_2021 
dryer.    - 14      - inventory_2021 
tv        - 30      - inventory_2022 
microwave - 10      - inventory_2022 
watch     - 5       - inventory_2022 
phone     - 25      - inventory_2022 
washer    - 15      - inventory_2022 
dryer.    - 14      - inventory_2022 

因此,在输出中,每个组中有相同数量的条目,总条目数应与基础组匹配。

英文:

I have a following table :

prod_name - quantity - group
tv        - 30  - base 
microwave - 10  - base 
watch     - 5   - base  
phone     - 25  - base 
washer    - 15  - base 
dryer.    - 14  - base  
microwave - 7   - inventory_2021 
phone     - 16  - inventory_2021  
tv        - 30  - inventory_2022 
watch     - 5   - inventory_2022 
phone     - 25  - inventory_2022 

I want to fill/ enter missing records from base(group) to all other groups

Expected output :

tv        - 30  - base 
microwave - 10  - base 
watch     - 5   - base 
phone     - 25  - base 
washer    - 15  - base 
dryer.    - 14  - base 
tv        - 30  - inventory_2021 
microwave - 7   - inventory_2021 
watch     - 5   - inventory_2021 
phone     - 16  - inventory_2021 
washer    - 15  - inventory_2021 
dryer.    - 14  - inventory_2021 
tv        - 30  - inventory_2022 
microwave - 10  - inventory_2022 
watch     - 5   - inventory_2022 
phone     - 25  - inventory_2022 
washer    - 15  - inventory_2022 
dryer.    - 14  - inventory_2022 

So in the output I have same number of entries in each group and total number of entries should match the base group.

答案1

得分: 0

使用CROSS JOIN来创建所有缺失的行。然后,您可以与原始表格进行LEFT JOIN以填补空缺。

WITH all_prods_and_groups AS (
    SELECT p1.prod_name, p1.quantity, p2.group
    FROM products AS p1
    CROSS JOIN (
        SELECT DISTINCT `group`
        FROM products
    ) AS p2
    WHERE p1.group = 'base'
)
SELECT t1.prod_name, 
    IFNULL(t2.quantity, t1.quantity) AS quantity, 
    IFNULL(t2.group, t1.group) AS `group`
FROM all_prods_and_groups AS t1
LEFT JOIN products AS t2 ON t1.prod_name = t2.prod_name AND t1.group = t2.group

演示链接

英文:

Use CROSS JOIN to create all the missing rows. Then you can LEFT JOIN this with the original table to fill in the gaps.

WITH all_prods_and_groups AS (
    SELECT p1.prod_name, p1.quantity, p2.group
    FROM products AS p1
    CROSS JOIN (
        SELECT DISTINCT `group`
        FROM products
    ) AS p2
    WHERE p1.group = 'base'
)
SELECT t1.prod_name, 
    IFNULL(t2.quantity, t1.quantity) AS quantity, 
    IFNULL(t2.group, t1.group) AS `group`
FROM all_prods_and_groups AS t1
LEFT JOIN products AS t2 ON t1.prod_name = t2.prod_name AND t1.group = t2.group

DEMO

答案2

得分: 0

MySQL和Presto/Trino是两种稍微不同的SQL方言。基本思想是相同的 - 构建产品和组的“基本”笛卡尔积。如果您确信所有产品都存在于基本表中,您可以简单地使用以下Presto/Trino查询:

-- 示例数据
WITH dataset(prod_name, quantity, "group") as (
    values ('tv'       , 30, 'base'),
        ('microwave', 10, 'base'),
        ('watch'    , 5 , 'base'),
        ('phone'    , 25, 'base'),
        ('washer'   , 15, 'base'),
        ('dryer'   , 14, 'base'),
        ('microwave', 7 , 'inventory_2021'),
        ('phone'    , 16, 'inventory_2021'),
        ('tv'       , 30, 'inventory_2022'),
        ('watch'    , 5 , 'inventory_2022'),
        ('phone'    , 25, 'inventory_2022')
),
-- 查询部分
base_products AS (
    SELECT prod_name, quantity
    FROM dataset AS p1
    WHERE "group" = 'base' -- 选择所有基本产品
),
all_groups as (
    SELECT distinct "group" grp
    FROM dataset
),
base_all as(
    SELECT b.prod_name, b.quantity, grp -- 构建基本查找表
    FROM base_products b
    CROSS JOIN all_groups
)

select b.prod_name,
    coalesce(d.quantity, b.quantity) quantity, -- 使用“原始”数量
    b.grp "group"
from base_all b
left join dataset d on b.prod_name = d.prod_name and b.grp = d."group"
;

输出:

prod_name quantity group
tv 30 inventory_2021
microwave 7 inventory_2021
watch 5 inventory_2021
phone 16 inventory_2021
washer 15 inventory_2021
dryer 14 inventory_2021
tv 30 base
microwave 10 base
watch 5 base
phone 25 base
washer 15 base
dryer 14 base
tv 30 inventory_2022
microwave 10 inventory_2022
watch 5 inventory_2022
phone 25 inventory_2022
washer 15 inventory_2022
dryer 14 inventory_2022
英文:

MySQL and Presto/Trino are two a bit SQL different dialects. The basic idea is the same - build the "base" cartesian product of products and groups. If you are sure that all products are present in base you can go simply with the following for Presto/Trino:

-- sample data
WITH dataset(prod_name, quantity, "group") as (
    values ('tv'       , 30, 'base'),
        ('microwave', 10, 'base'),
        ('watch'    , 5 , 'base'),
        ('phone'    , 25, 'base'),
        ('washer'   , 15, 'base'),
        ('dryer'   , 14, 'base'),
        ('microwave', 7 , 'inventory_2021'),
        ('phone'    , 16, 'inventory_2021'),
        ('tv'       , 30, 'inventory_2022'),
        ('watch'    , 5 , 'inventory_2022'),
        ('phone'    , 25, 'inventory_2022')
),
-- query parts
base_products AS (
    SELECT prod_name, quantity
    FROM dataset AS p1
    WHERE "group" = 'base' -- select all base products
),
all_groups as (
    SELECT distinct "group" grp
    FROM dataset
),
base_all as(
    SELECT b.prod_name, b.quantity, grp -- build base lookup table
    FROM base_products b
    CROSS JOIN all_groups
)

select b.prod_name,
    coalesce(d.quantity, b.quantity) quantity, -- use "original" quantity
    b.grp "group"
from base_all b
left join dataset d on b.prod_name = d.prod_name and b.grp = d."group"
;

Output:

prod_name quantity group
tv 30 inventory_2021
microwave 7 inventory_2021
watch 5 inventory_2021
phone 16 inventory_2021
washer 15 inventory_2021
dryer 14 inventory_2021
tv 30 base
microwave 10 base
watch 5 base
phone 25 base
washer 15 base
dryer 14 base
tv 30 inventory_2022
microwave 10 inventory_2022
watch 5 inventory_2022
phone 25 inventory_2022
washer 15 inventory_2022
dryer 14 inventory_2022

huangapple
  • 本文由 发表于 2023年7月7日 08:08:37
  • 转载请务必保留本文链接:https://go.coder-hub.com/76633187.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定