英文:
How to get date that comes after a date column in redshift
问题
我有一个包含每个状态的日期列的表格:
我想要获取接下来的日期,以及通过查看所有日期列来获取```ant_date```之后的下一个状态。到目前为止,我尝试使用```case when```:
这个查询返回:
不幸的是,这个查询给出了错误的结果,```ab124```和```ab125```的状态应该是```dlv```而不是```cc```。这是因为```cc_date```大于```ant_date```。
如果我想要在所有日期列中获取```ant_date```之后的状态,该如何修复我的查询?我的表格在Amazon Redshift上。有人可以帮帮我吗?
非常感谢。
英文:
I have this table full of date columns of each status:
awb manifest_date out_branch_date in_branch_date ant_date dlv_date cc_date
ab123 2023-07-01 2023-07-02 2023-07-03 2023-07-04 2023-07-05 2023-07-04
ab124 2023-07-03 2023-07-03 2023-07-04 2023-07-05 2023-07-06 2023-07-08
ab125 2023-07-04 2023-07-09 2023-07-05 2023-07-06 2023-07-08 2023-07-09
ab127 2023-07-04 2023-07-05 2023-07-06 2023-07-07 2023-07-08 2023-07-05
I want to the next date and also the next status that comes after ant_date
by looking all the date columns. What I have tried so far is using case when
:
select awb,
case when manifest_date > ant_date then 'manifest'
when out_branch_date > ant_date then 'out branch'
when in_branch_date > ant_date then 'in branch'
when dlv_date > ant_date then 'dlv'
when cc_date > ant_date then 'cc' else null end as status_after_ant,
case when manifest_date > ant_date then manifest_date
when out_branch_date > ant_date then out_branch_date
when in_branch_date > ant_date then in_branch_date
when dlv_date > ant_date then dlv_date
when cc_date > ant_date then cc_date else null end as date_status_after_ant
from data
This query returns:
awb status_after_ant date_status_after_ant
ab123 dlv 2023-07-05
ab124 cc 2023-07-08
ab125 cc 2023-07-09
ab127 dlv 2023-07-08
Unfortunately the query gives the wrong results, the status for ab124
and ab125
supposed to be dlv
instead of cc
. It's because the cc_date
is more than ant_date
.
How should I fix my query if I want to the get the status that comes right after ant_date
out of all the date columns. My table is on amazone redshift. Can someone help me?
Thank you in advance.
答案1
得分: 1
您面临的问题是由于数据的枢轴格式引起的。如果数据是"非枢轴",那将会更容易得多。
SELECT
awb,
status,
status_date
FROM data
UNPIVOT (
status FOR status_date IN (manifest_date, out_branch_date, in_branch_date, ant_date, dlv_date, cc_date)
)
这将使数据看起来像这样:
awb status status_date
-------- ------------------- --------------
ab123 manifest_date 2023-07-01
ab123 out_branch_date 2023-07-02
ab123 in_branch_date 2023-07-03
ab123 ant_date 2023-07-04
ab123 dlv_date 2023-07-05
ab123 cc_date 2023-07-04
ab124 manifest_date 2023-07-03
ab124 out_branch_date 2023-07-03
ab124 in_branch_date 2023-07-04
ab124 ant_date 2023-07-05
ab124 dlv_date 2023-07-06
ab124 cc_date 2023-07-08
现在,要推断下一个状态和相关日期将更容易,因为我们可以按日期排序:
SELECT
awb,
status,
status_date,
LEAD(status, 1) OVER (PARTITION BY awb ORDER BY status_date) AS next_status,
LEAD(status_date, 1) OVER (PARTITION BY awb ORDER BY status_date) AS next_status_date
FROM data
UNPIVOT (
status FOR status_date IN (manifest_date, out_branch_date, in_branch_date, ant_date, dlv_date, cc_date)
)
WHERE status = 'ant_date'
结果如下:
awb status status_date next_status next_status_date
-------- ------------ --------------- --------------- -------------------
ab123 ant_date 2023-07-04 dlv_date 2023-07-05
ab124 ant_date 2023-07-05 dlv_date 2023-07-06
ab127 ant_date 2023-07-06 cc_date 2023-07-07
如果Redshift不支持UNPIVOT,那么可以使用以下方式来处理,尽管有点繁琐:
SELECT awb, 'manifest_date' as status, manifest_date as status_date FROM data
UNION ALL
SELECT awb, 'out_branch_date' as status, out_branch_date as status_date FROM data
UNION ALL
SELECT awb, 'in_branch_date' as status, in_branch_date as status_date FROM data
UNION ALL
SELECT awb, 'ant_date' as status, ant_date as status_date FROM data
UNION ALL
SELECT awb, 'dlv_date' as status, dlv_date as status_date FROM data
UNION ALL
SELECT awb, 'cc_date' as status, cc_date as status_date FROM data
英文:
The problem you are facing is due to the pivoted format of your data. It will be ever so much easier if the data is "unpivoted"
SELECT
awb,
status,
status_date
FROM data
UNPIVOT (
status FOR status_date IN (manifest_date, out_branch_date, in_branch_date, ant_date, dlv_date, cc_date)
)
Which will make the data look like this
awb status status_date
-------- ------------------- --------------
ab123 manifest_date 2023-07-01
ab123 out_branch_date 2023-07-02
ab123 in_branch_date 2023-07-03
ab123 ant_date 2023-07-04
ab123 dlv_date 2023-07-05
ab123 cc_date 2023-07-04
ab124 manifest_date 2023-07-03
ab124 out_branch_date 2023-07-03
ab124 in_branch_date 2023-07-04
ab124 ant_date 2023-07-05
ab124 dlv_date 2023-07-06
ab124 cc_date 2023-07-08
Now it is far easier to deduce the next status and associated date, because we can order the dates:
SELECT
awb,
status,
status_date,
LEAD(status, 1) OVER (PARTITION BY awb ORDER BY status_date) AS next_status,
LEAD(status_date, 1) OVER (PARTITION BY awb ORDER BY status_date) AS next_status_date
FROM data
UNPIVOT (
status FOR status_date IN (manifest_date, out_branch_date, in_branch_date, ant_date, dlv_date, cc_date)
)
WHERE status = 'ant_date'
awb status status_date next_status next_status_date
-------- ------------ --------------- --------------- -------------------
ab123 ant_date 2023-07-04 dlv_date 2023-07-05
ab124 ant_date 2023-07-05 dlv_date 2023-07-06
ab127 ant_date 2023-07-06 cc_date 2023-07-07
If Redshift doesn't support UNPIVOT then it's a bit clumsier to do that bit e.g:
SELECT awb, 'manifest_date' as status, manifest_date as status_date FROM data
UNION ALL
SELECT awb, 'out_branch_date' as status, out_branch_date as status_date FROM data
UNION ALL
SELECT awb, 'in_branch_date' as status, in_branch_date as status_date FROM data
UNION ALL
SELECT awb, 'ant_date' as status, ant_date as status_date FROM data
UNION ALL
SELECT awb, 'dlv_date' as status, dlv_date as status_date FROM data
UNION ALL
SELECT awb, 'cc_date' as status, cc_date as status_date FROM data
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论