如何获取Redshift中日期列后的日期。

huangapple go评论88阅读模式
英文:

How to get date that comes after a date column in redshift

问题

我有一个包含每个状态的日期列的表格:

我想要获取接下来的日期,以及通过查看所有日期列来获取```ant_date```之后的下一个状态。到目前为止,我尝试使用```case when```:

这个查询返回:

不幸的是,这个查询给出了错误的结果,```ab124```和```ab125```的状态应该是```dlv```而不是```cc```。这是因为```cc_date```大于```ant_date```。

如果我想要在所有日期列中获取```ant_date```之后的状态,该如何修复我的查询?我的表格在Amazon Redshift上。有人可以帮帮我吗?

非常感谢。
英文:

I have this table full of date columns of each status:

awb    manifest_date  out_branch_date  in_branch_date  ant_date    dlv_date      cc_date
ab123   2023-07-01      2023-07-02      2023-07-03    2023-07-04   2023-07-05    2023-07-04
ab124   2023-07-03      2023-07-03      2023-07-04    2023-07-05   2023-07-06    2023-07-08
ab125   2023-07-04      2023-07-09      2023-07-05    2023-07-06   2023-07-08    2023-07-09
ab127   2023-07-04      2023-07-05      2023-07-06    2023-07-07   2023-07-08    2023-07-05

I want to the next date and also the next status that comes after ant_date by looking all the date columns. What I have tried so far is using case when:

select awb,
       case when manifest_date > ant_date then 'manifest'
            when out_branch_date > ant_date then 'out branch'
            when in_branch_date > ant_date then 'in branch'
            when dlv_date > ant_date then 'dlv'
            when cc_date > ant_date then 'cc' else null end as status_after_ant,
       case when manifest_date > ant_date then manifest_date
            when out_branch_date > ant_date then out_branch_date
            when in_branch_date > ant_date then in_branch_date
            when dlv_date > ant_date then dlv_date
            when cc_date > ant_date then cc_date else null end as date_status_after_ant
from data

This query returns:

awb      status_after_ant     date_status_after_ant
ab123      dlv                  2023-07-05
ab124      cc                   2023-07-08
ab125      cc                   2023-07-09
ab127      dlv                  2023-07-08

Unfortunately the query gives the wrong results, the status for ab124 and ab125 supposed to be dlv instead of cc. It's because the cc_date is more than ant_date.

How should I fix my query if I want to the get the status that comes right after ant_date out of all the date columns. My table is on amazone redshift. Can someone help me?

Thank you in advance.

答案1

得分: 1

您面临的问题是由于数据的枢轴格式引起的。如果数据是"非枢轴",那将会更容易得多。

SELECT
  awb,
  status,
  status_date
FROM data
UNPIVOT (
  status FOR status_date IN (manifest_date, out_branch_date, in_branch_date, ant_date, dlv_date, cc_date)
)

这将使数据看起来像这样:

  awb          status          status_date  
 -------- ------------------- -------------- 
  ab123     manifest_date       2023-07-01   
  ab123     out_branch_date     2023-07-02   
  ab123     in_branch_date      2023-07-03   
  ab123     ant_date            2023-07-04   
  ab123     dlv_date            2023-07-05   
  ab123     cc_date             2023-07-04   
  ab124     manifest_date       2023-07-03   
  ab124     out_branch_date     2023-07-03   
  ab124     in_branch_date      2023-07-04   
  ab124     ant_date            2023-07-05   
  ab124     dlv_date            2023-07-06   
  ab124     cc_date             2023-07-08 

现在,要推断下一个状态和相关日期将更容易,因为我们可以按日期排序:

SELECT
  awb,
  status,
  status_date,
  LEAD(status, 1) OVER (PARTITION BY awb ORDER BY status_date) AS next_status,
  LEAD(status_date, 1) OVER (PARTITION BY awb ORDER BY status_date) AS next_status_date
FROM data
UNPIVOT (
  status FOR status_date IN (manifest_date, out_branch_date, in_branch_date, ant_date, dlv_date, cc_date)
)
WHERE status = 'ant_date'

结果如下:

 awb       status      status_date     next_status     next_status_date  
 -------- ------------ --------------- --------------- ------------------- 
  ab123     ant_date     2023-07-04      dlv_date        2023-07-05        
  ab124     ant_date     2023-07-05      dlv_date        2023-07-06        
  ab127     ant_date     2023-07-06      cc_date         2023-07-07

如果Redshift不支持UNPIVOT,那么可以使用以下方式来处理,尽管有点繁琐:

SELECT awb, 'manifest_date' as status, manifest_date as status_date FROM data
UNION ALL
SELECT awb, 'out_branch_date' as status, out_branch_date as status_date FROM data
UNION ALL
SELECT awb, 'in_branch_date' as status, in_branch_date as status_date FROM data
UNION ALL
SELECT awb, 'ant_date' as status, ant_date as status_date FROM data
UNION ALL
SELECT awb, 'dlv_date' as status, dlv_date as status_date FROM data
UNION ALL
SELECT awb, 'cc_date' as status, cc_date as status_date FROM data
英文:

The problem you are facing is due to the pivoted format of your data. It will be ever so much easier if the data is "unpivoted"

SELECT
  awb,
  status,
  status_date
FROM data
UNPIVOT (
  status FOR status_date IN (manifest_date, out_branch_date, in_branch_date, ant_date, dlv_date, cc_date)
)

Which will make the data look like this

  awb          status          status_date  
 -------- ------------------- -------------- 
  ab123     manifest_date       2023-07-01   
  ab123     out_branch_date     2023-07-02   
  ab123     in_branch_date      2023-07-03   
  ab123     ant_date            2023-07-04   
  ab123     dlv_date            2023-07-05   
  ab123     cc_date             2023-07-04   
  ab124     manifest_date       2023-07-03   
  ab124     out_branch_date     2023-07-03   
  ab124     in_branch_date      2023-07-04   
  ab124     ant_date            2023-07-05   
  ab124     dlv_date            2023-07-06   
  ab124     cc_date             2023-07-08 

Now it is far easier to deduce the next status and associated date, because we can order the dates:

    SELECT
      awb,
      status,
      status_date,
      LEAD(status, 1) OVER (PARTITION BY awb ORDER BY status_date) AS next_status,
      LEAD(status_date, 1) OVER (PARTITION BY awb ORDER BY status_date) AS next_status_date
    FROM data
    UNPIVOT (
      status FOR status_date IN (manifest_date, out_branch_date, in_branch_date, ant_date, dlv_date, cc_date)
    )
    WHERE status = 'ant_date'


   awb       status      status_date     next_status     next_status_date  
 -------- ------------ --------------- --------------- ------------------- 
  ab123     ant_date     2023-07-04      dlv_date        2023-07-05        
  ab124     ant_date     2023-07-05      dlv_date        2023-07-06        
  ab127     ant_date     2023-07-06      cc_date         2023-07-07    

If Redshift doesn't support UNPIVOT then it's a bit clumsier to do that bit e.g:

SELECT awb, 'manifest_date' as status, manifest_date as status_date FROM data
UNION ALL
SELECT awb, 'out_branch_date' as status, out_branch_date as status_date FROM data
UNION ALL
SELECT awb, 'in_branch_date' as status, in_branch_date as status_date FROM data
UNION ALL
SELECT awb, 'ant_date' as status, ant_date as status_date FROM data
UNION ALL
SELECT awb, 'dlv_date' as status, dlv_date as status_date FROM data
UNION ALL
SELECT awb, 'cc_date' as status, cc_date as status_date FROM data

huangapple
  • 本文由 发表于 2023年8月10日 10:32:58
  • 转载请务必保留本文链接:https://go.coder-hub.com/76872312.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定