英文:
fill the gaps and assign the most recent available values in mysql
问题
我们想要"填补记录ID的空白",并使用最近可用的值填充缺失的值。如果没有可用的值,就使用第一个可用的值。结果应该如下所示:
category | record_id | value |
---|---|---|
A | 1 | 0.01 |
A | 2 | 0.23 |
A | 3 | 0.23 |
A | 4 | 0.23 |
A | 5 | 0.15 |
A | 6 | 0.20 |
A | 7 | 0.08 |
B | 1 | 1.00 |
B | 2 | 1.00 |
B | 3 | 0.75 |
B | 4 | 0.75 |
B | 5 | 0.75 |
B | 6 | 0.93 |
B | 7 | 0.87 |
我尝试使用CTE来满足上述要求,并创建了另一个带有序列的表,然后尝试找到缺失的值。我只是试图获取最近可用的值并分配给缺失的数字。有人可以提出建议吗?
使用以下代码来实现:
with table_1 as (
select 'A' as category, 1 as record_id, 0.01 as value
union all
select 'A', 2, 0.23
union all
select 'A', 5, 0.15
union all
select 'A', 6, 0.20
union all
select 'A', 7, 0.08
union all
select 'B', 2, 1.00
union all
select 'B', 3, 0.75
union all
select 'B', 6, 0.93
union all
select 'B', 7, 0.87
), data as (
select t.*, lead(record_id) over(partition by category order by record_id) lead_record_id
from table_1 t
union all
select category, 1, null, min(record_id) from table_1 group by category having min(record_id) > 1
), rcte as (
select category, record_id, value, lead_record_id from data
union all
select category, record_id + 1, value, lead_record_id from data where record_id + 1 < lead_record_id
)
select category, record_id, value from rcte order by category, record_id;
使用以上代码,你可以得到所需的结果。
英文:
we want to "fill in the gaps" of the record_ids and fill in missing values with the most recent available value. If there is no value available, use the first available value instead. The result should look like:
category | record_id | value |
---|---|---|
A | 1 | 0.01 |
A | 2 | 0.23 |
A | 3 | 0.23 |
A | 4 | 0.23 |
A | 5 | 0.15 |
A | 6 | 0.20 |
A | 7 | 0.08 |
B | 1 | 1.00 |
B | 2 | 1.00 |
B | 3 | 0.75 |
B | 4 | 0.75 |
B | 5 | 0.75 |
B | 6 | 0.93 |
B | 7 | 0.87 |
I tried to approach the above requirement with cte, and created an another table with sequence and from there I tried to find the missing values. I am just trying to get the most recent available values and assign to the missing numbers. Can someone suggest on this
with table_1
as (
select 'A' as category
,1 as record_id
,0.01 as value
union all
select 'A', 2, 0.23 union all
select 'A', 5, 0.15 union all
select 'A', 6, 0.20 union all
select 'A', 7, 0.08 union all
select 'B', 2, 1.00 union all
select 'B', 3, 0.75 union all
select 'B', 6, 0.93 union all
select 'B', 7, 0.87
),seq_num as(
select 1 as record_id union all
select 2 union all
select 3 union all
select 4 union all
select 5 union all
select 6 union all
select 7),missing_seq
as (
select t1.record_id from seq_num t1 where t1.record_id not in (select t2.record_id from table_1 t2 where t2.category = 'A')
union all
select t1.record_id from seq_num t1 where t1.record_id not in (select t2.record_id from table_1 t2 where t2.category = 'B')
)
select * from missing_seq
with this i am missing 4 from A and 5 from B category
with table_1
as (
select 'A' as category
,1 as record_id
,0.01 as value
union all
select 'A', 2, 0.23 union all
select 'A', 5, 0.15 union all
select 'A', 6, 0.20 union all
select 'A', 7, 0.08 union all
select 'B', 2, 1.00 union all
select 'B', 3, 0.75 union all
select 'B', 6, 0.93 union all
select 'B', 7, 0.87
),
data as (
select t.*, lead(record_id) over(partition by category order by record_id) lead_record_id
from table_1 t
union all
select category, 1, null, min(record_id) from table_1 group by category having min(record_id) > 1
),
rcte as (
select category, record_id, value, lead_record_id from data
union all
select category, record_id + 1, value, lead_record_id from data where record_id + 1 < lead_record_id
)
select category, record_id, value from rcte order by category, record_id;
答案1
得分: 1
以下是代码的翻译部分:
一种选项使用递归查询:
with recursive
data as (
select t.*, lead(record_id) over(partition by category order by record_id) lead_record_id
from mytable t
),
rcte as (
select category, record_id, value, lead_record_id from data
union all
select category, record_id + 1, value, lead_record_id from rcte where record_id + 1 < lead_record_id
)
select category, record_id, value from rcte order by category, record_id
第一个公共表达式(CTE)计算每行的“下一个”记录 ID。然后,递归 CTE 生成所需数量的行以填充间隙(如果有的话),同时为新行分配“最新”的值。
如果您希望所有类别从记录 ID 1
开始(即使该类别的最小记录 ID 大于该值),那么我们可以调整第一个 CTE:
with recursive
data as (
select t.*, lead(record_id) over(partition by category order by record_id) lead_record_id
from mytable t
union all
select category, 1, null, min(record_id) from mytable group by category having min(record_id) > 1
),
rcte as (
select category, record_id, value, lead_record_id from data
union all
select category, record_id + 1, value, lead_record_id from rcte where record_id + 1 < lead_record_id
)
select category, record_id, value from rcte order by category, record_id
请注意,这将在“第一个”记录上分配 null
值,因为在那一点上没有可用的前一个值。
category | record_id | value |
---|---|---|
A | 1 | 0.01 |
A | 2 | 0.23 |
A | 3 | 0.23 |
A | 4 | 0.23 |
A | 5 | 0.15 |
A | 6 | 0.20 |
A | 7 | 0.08 |
B | 1 | null |
B | 2 | 1.00 |
B | 3 | 0.75 |
B | 4 | 0.75 |
B | 5 | 0.75 |
B | 6 | 0.93 |
B | 7 | 0.87 |
英文:
One option uses a recursive query:
with recursive
data as (
select t.*, lead(record_id) over(partition by category order by record_id) lead_record_id
from mytable t
),
rcte as (
select category, record_id, value, lead_record_id from data
union all
select category, record_id + 1, value, lead_record_id from rcte where record_id + 1 < lead_record_id
)
select category, record_id, value from rcte order by category, record_id
The first CTE computes the "next" record id to each row. Then, the recursive CTE generates as many rows as needed to fill the gaps (if any), while assigning the "latest" values to new rows.
If you want all categories to start from record id 1
(even if the minimum record id for that category is greater than that), then we can adapt the first CTE:
with recursive
data as (
select t.*, lead(record_id) over(partition by category order by record_id) lead_record_id
from mytable t
union all
select category, 1, null, min(record_id) from mytable group by category having min(record_id) > 1
),
rcte as (
select category, record_id, value, lead_record_id from data
union all
select category, record_id + 1, value, lead_record_id from rcte where record_id + 1 < lead_record_id
)
select category, record_id, value from rcte order by category, record_id
Note that this will assign a null
values to the "first" records, since there is no previous value available at that point.
category | record_id | value |
---|---|---|
A | 1 | 0.01 |
A | 2 | 0.23 |
A | 3 | 0.23 |
A | 4 | 0.23 |
A | 5 | 0.15 |
A | 6 | 0.20 |
A | 7 | 0.08 |
B | 1 | null |
B | 2 | 1.00 |
B | 3 | 0.75 |
B | 4 | 0.75 |
B | 5 | 0.75 |
B | 6 | 0.93 |
B | 7 | 0.87 |
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论