Snowflake get full day of week name

Question

Sample data:

| dateHour |
| -------- |
| 2021-08-01 18:00:00.000 |
| 2021-08-02 20:00:00.000 |
| 2021-08-03 06:00:00.000 |
| 2021-08-04 08:00:00.000 |
| 2021-08-05 09:00:00.000 |

I am trying to build an aggregate table at the weekday level, so I need to extract the full day-of-week name from the dateHour column. The dateHour column holds the date and hour of an event, much like a timestamp column.

One way to extract the full day name is the CASE expression below. It works, but the query takes a long time, which I attribute to the CASE expression. It has become a bottleneck for the query's performance.

    select
        case dayname("dateHour"::date)
            when 'Mon' then 'Monday'
            when 'Tue' then 'Tuesday'
            when 'Wed' then 'Wednesday'
            when 'Thu' then 'Thursday'
            when 'Fri' then 'Friday'
            when 'Sat' then 'Saturday'
            when 'Sun' then 'Sunday'
        end as "day_of_week"
    from tableName;

Another query that I have tried is:

    select TO_CHAR(CURRENT_DATE, 'DYDY') Day_Full_Name;

The above query works fine with CURRENT_DATE, but when CURRENT_DATE is replaced with the dateHour column from the table, it returns the abbreviated weekday name twice, like this:

    SunSun
    MonMon
    TueTue
    WedWed
    ThuThu

To reproduce the DYDY issue, run the snippets below:

    with cte as (
        select '2021-08-01 18:00:00.000'::timestamp as "dateHour"
    )
    select
        "dateHour"::date as dt,
        TO_CHAR(dt, 'DYDY') day_full_name
    from cte;

The output from the above query:

| DT | DAY_FULL_NAME |
| -------- | -------- |
| 2021-08-01 | Sunday |

With a second row added:

    with cte as (
        select '2021-08-01 18:00:00.000'::timestamp as "dateHour"
        union all
        select '2021-08-02 20:00:00.000'::timestamp as "dateHour"
    )
    select
        "dateHour"::date as dt,
        TO_CHAR(dt, 'DYDY') day_full_name
    from cte;

Output:

| DT | DAY_FULL_NAME |
| -------- | -------- |
| 2021-08-01 | SunSun |
| 2021-08-02 | SunMon |

Expected output (the day name may be in either form, e.g. Sunday or SUNDAY):

| fullDayofWeekName |
| -------- |
| SUNDAY |
| MONDAY |
| TUESDAY |
| WEDNESDAY |
| THURSDAY |

I need an efficient way to generate the full weekday name from the dateHour column. It shouldn't affect the performance of the aggregate query.

Answer 1

Score: 1

If performance is a problem, change the order of operations.

If you transform large volumes of dates to strings and then aggregate on those strings, the two steps will be slower than aggregating on a number (say, the dayofweek or a date-truncated value) and then converting to a string in an outer select layer.

That is, slower:

    select
        TO_CHAR(dt, 'DYDY') as day_name,
        sum(value) as sum_val
    from big_table
    group by day_name;

versus faster:

    -- convert_to_name_step is a placeholder for whatever number-to-name
    -- mapping you choose; it is not a built-in function.
    select
        convert_to_name_step(dow) as day_name,
        sum_val
    from (
        select
            dayofweek(dt) as dow,
            sum(value) as sum_val
        from big_table
        group by dow
    );
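One possible concrete form of the faster query is sketched below, with DECODE standing in for the convert_to_name_step placeholder. It assumes the session parameter WEEK_START is at its default of 0, in which case DAYOFWEEK returns 0 for Sunday through 6 for Saturday; with any other setting the numbers shift and the mapping would need adjusting. The names big_table, dt, and value are the placeholders from the query above, not real objects.

```sql
-- Aggregate on the numeric day of week first, so the name mapping
-- runs on at most 7 result rows instead of every input row.
-- Assumption: WEEK_START = 0 (default), so 0 = Sunday ... 6 = Saturday.
select
    decode(dow,
           0, 'Sunday',
           1, 'Monday',
           2, 'Tuesday',
           3, 'Wednesday',
           4, 'Thursday',
           5, 'Friday',
           6, 'Saturday') as day_name,
    sum_val
from (
    select
        dayofweek(dt) as dow,      -- dt: placeholder date/timestamp column
        sum(value)    as sum_val   -- value: placeholder measure column
    from big_table                 -- placeholder table name
    group by dow
)
order by dow;
```

Because the grouping key is a small integer, the string conversion cost no longer scales with the size of the table, which is the point of reordering the operations.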

Conversion bug:

    with data as (
        select
            column1 as dt
        from values
            ('2021-08-01 18:00:00.000'::timestamp),
            ('2021-08-02 20:00:00.000'::timestamp),
            (CURRENT_TIMESTAMP)
    )
    select
        dt
        ,SYSTEM$TYPEOF(dt) as t
        ,TO_CHAR(dt, 'DYDY') as n
        ,CURRENT_TIMESTAMP
        ,SYSTEM$TYPEOF(CURRENT_TIMESTAMP) as cd_t
        ,TO_CHAR(CURRENT_TIMESTAMP, 'DYDY') as cd_n
    from data;

This shows that the difference does not come from the data being passed to the function; rather, the two cases appear to go through different code paths in the implementation.


Answer 2

Score: 0

I can reproduce the glitch with the union example, but I couldn't find any documentation on using 'DYDY' to get day names. That said, an array-based approach might be faster than traversing the if/else-style CASE block:

    select current_timestamp as ct,
        dayofweek(ct) as dow,
        ['Sunday','Monday','Tuesday','Wednesday','Thursday','Friday','Saturday'][dow]::varchar as day_name;
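Applied to the question's dateHour column, the array lookup might look like the sketch below. It uses ARRAY_CONSTRUCT in place of the bracket literal, wraps the result in UPPER to match the upper-case form shown in the expected output, and again assumes WEEK_START = 0 so that DAYOFWEEK returns 0 for Sunday. The cte with two sample rows is only there to make the snippet self-contained.

```sql
-- Two sample rows from the question, held in a throwaway CTE.
with cte as (
    select column1::timestamp as "dateHour"
    from values
        ('2021-08-01 18:00:00.000'),
        ('2021-08-02 20:00:00.000')
)
select
    "dateHour",
    dayofweek("dateHour") as dow,   -- assumption: WEEK_START = 0, so 0 = Sunday
    upper(
        array_construct('Sunday', 'Monday', 'Tuesday', 'Wednesday',
                        'Thursday', 'Friday', 'Saturday')[dow]::varchar
    ) as "fullDayofWeekName"
from cte;
```

For a large table, this lookup can be combined with the first answer's advice: group by the numeric dow in an inner query and apply the array lookup only in the outer select.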

huangapple
  • Published on 2023-07-24 20:05:22
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76754335.html