Snowflake get full day of week name
Question
Sample data:
| dateHour |
| -------- |
| 2021-08-01 18:00:00.000 |
| 2021-08-02 20:00:00.000 |
| 2021-08-03 06:00:00.000 |
| 2021-08-04 08:00:00.000 |
| 2021-08-05 09:00:00.000 |
I am trying to build an aggregate table at the day-of-week level, so I need to extract the full day-of-week name from the dateHour column. The dateHour column holds the date and hour of an event, much like a timestamp column.
One way to extract the full day name is the CASE expression below. It works, but it takes a long time, apparently because of the CASE branches, and it has become a bottleneck for the query.
select
case dayname("dateHour"::date)
when 'Mon' then 'Monday'
when 'Tue' then 'Tuesday'
when 'Wed' then 'Wednesday'
when 'Thu' then 'Thursday'
when 'Fri' then 'Friday'
when 'Sat' then 'Saturday'
when 'Sun' then 'Sunday'
end as "day_of_week"
from tableName
Another query that I have tried is:
select TO_CHAR(CURRENT_DATE, 'DYDY') Day_Full_Name;
The query above works fine with CURRENT_DATE, but when CURRENT_DATE is replaced with the dateHour column from the table, it returns the abbreviated day name twice, like this:
SunSun
MonMon
TueTue
WedWed
ThuThu
To reproduce the DYDY issue, run the snippets below:
with cte as (
select '2021-08-01 18:00:00.000'::timestamp as "dateHour"
)
select
"dateHour"::date as dt,
TO_CHAR(dt,'DYDY') day_full_name
from cte ;
Output of the query above:
| DT | DAY_FULL_NAME |
| -------- | -------- |
| 2021-08-01 | Sunday |
with cte as (
select '2021-08-01 18:00:00.000'::timestamp as "dateHour"
union all
select '2021-08-02 20:00:00.000'::timestamp as "dateHour"
)
select
"dateHour"::date as dt,
TO_CHAR(dt,'DYDY') day_full_name
from cte
;
Output:
| DT | DAY_FULL_NAME |
| -------- | -------- |
| 2021-08-01 | SunSun |
| 2021-08-02 | SunMon |
Expected output (either capitalization is fine, e.g. Sunday or SUNDAY):
| fullDayofWeekName |
| -------- |
| SUNDAY |
| MONDAY |
| TUESDAY |
| WEDNESDAY |
| THURSDAY |
I need an efficient way to generate the full day-of-week name from the dateHour column. It shouldn't affect the performance of the aggregate query.
Answer 1
Score: 1
If performance is a problem, change the order of operations.
If you convert a large volume of dates to strings and then aggregate on those strings, the two steps together will be slower than aggregating on a number (say, the dayofweek value or a date-truncated value) and converting to a string afterwards in an outer select layer.
That is, slower:
select
TO_CHAR(dt, 'DYDY') as day_name,
sum(value) as sum_val
from big_table
group by day_name
versus faster (convert_to_name_step stands in for whatever number-to-name mapping you use):
select
convert_to_name_step(dow) as day_name,
sum_val
from (
select
dayofweek(dt) as dow,
sum(value) as sum_val
from big_table
group by dow
)
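As a rough sketch of what convert_to_name_step could look like, the variant below swaps in DAYOFWEEKISO and DECODE. Both are my substitutions, not part of the answer as written; big_table, dt, and value are the same placeholder names used above. DAYOFWEEKISO is used because it returns 1 (Monday) through 7 (Sunday) regardless of the WEEK_START session parameter, whereas the numbering of DAYOFWEEK depends on that parameter. I have not measured this against the CASE version.

```sql
-- Sketch only: big_table, dt, and value are placeholder names from the answer above.
select
    decode(dow,
           1, 'Monday',   2, 'Tuesday', 3, 'Wednesday',
           4, 'Thursday', 5, 'Friday',  6, 'Saturday',
           7, 'Sunday') as day_name,     -- name lookup runs on at most 7 rows
    sum_val
from (
    select
        dayofweekiso(dt) as dow,         -- 1 = Monday ... 7 = Sunday
        sum(value)       as sum_val
    from big_table
    group by dow                         -- aggregate on the integer, not the string
)
order by dow;
```

The point of this shape is that the name lookup is applied to the handful of grouped rows rather than to every row of big_table.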
The conversion bug:
with data as (
select
column1 as dt
from values
('2021-08-01 18:00:00.000'::timestamp),
('2021-08-02 20:00:00.000'::timestamp),
(CURRENT_TIMESTAMP)
)
select
dt
,SYSTEM$TYPEOF(dt) as t
,TO_CHAR(dt, 'DYDY') as n
,CURRENT_TIMESTAMP
,SYSTEM$TYPEOF(CURRENT_TIMESTAMP) as cd_t
,TO_CHAR(CURRENT_TIMESTAMP, 'DYDY') as cd_n
from data;
This shows that the difference does not come from the data being passed to the function; rather, the implementation appears to take different code paths.
Answer 2
Score: 0
I can reproduce the glitch with the union example, but I couldn't find any documentation on using 'DYDY' to get day names. That said, an array-based approach might be faster than traversing the if/else block:
select current_timestamp as ct,
dayofweek(ct) as dow,
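-- assumes dayofweek returns 0 (Sunday) through 6 (Saturday), the default WEEK_START behavior
['Sunday','Monday','Tuesday','Wednesday','Thursday','Friday','Saturday'][dow]::varchar as day_name;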
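The array lookup above relies on DAYOFWEEK being zero-based with Sunday first, which holds only under the default WEEK_START setting. A variant that does not depend on that setting might look like the following sketch. It uses DAYOFWEEKISO, ARRAY_CONSTRUCT, and GET, which are my substitutions rather than what the answer wrote, and ct stands in for the dateHour column. Whether it actually beats the CASE expression is untested.

```sql
-- Sketch only: ct stands in for the dateHour column.
select
    current_timestamp as ct,
    dayofweekiso(ct)  as iso_dow,        -- 1 = Monday ... 7 = Sunday
    get(
        array_construct('Monday', 'Tuesday', 'Wednesday', 'Thursday',
                        'Friday', 'Saturday', 'Sunday'),
        iso_dow - 1                      -- array indexes are 0-based
    )::varchar        as day_name;
```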