2023-04-13 22:06:32


Generate series of dates and hours until next event

Question

SELECT
  Delivery_Area_ID,
  datetime,
  date,
  hour_of_day,
  current_Delivery_Radius_Meters,
  event_started_timestamp,
  event_ended_time,
  event_duration_hours,
  radius_life_of_that_hour,
  is_default
FROM
  temp_table;



    CREATE TABLE delivery_data (
      Delivery_Area_ID INT,
      current_Delivery_Radius_Meters INT,
      Event_Started_Timestamp TIMESTAMP,
      event_started_date DATE,
      event_started_hour INT,
      event_started_mins INT,
      event_ended_time TIMESTAMP,
      prev_delivery_radius INT
    );
    
    
    INSERT INTO delivery_data (
      Delivery_Area_ID,
      current_Delivery_Radius_Meters,
      Event_Started_Timestamp,
      event_started_date,
      event_started_hour,
      event_started_mins,
      event_ended_time,
      prev_delivery_radius
    )
    VALUES
      (1, 3500, '2022-01-15 19:46:37.995951 UTC', '2022-01-15', 19, 46, '2022-01-15 20:05:29.049375 UTC', NULL),
      (1, 6500, '2022-01-15 20:05:29.049375 UTC', '2022-01-15', 20, 5, '2022-01-16 12:31:22.778229 UTC', 3500),
      (1, 3500, '2022-01-16 12:31:22.778229 UTC', '2022-01-16', 12, 31, '2022-01-16 12:50:12.562042 UTC', 6500),
      (1, 6500, '2022-01-16 12:50:12.562042 UTC', '2022-01-16', 12, 50, '2022-01-18 20:46:41.937279 UTC', 3500),
      (1, 3500, '2022-01-18 20:46:41.937279 UTC', '2022-01-18', 20, 46, '2022-01-18 20:58:55.794286 UTC', 6500);

I have the above table.
I am expecting the below output.

I want to measure the lifespan of a radius for every hour of the day. In the image, the first radius, 3500, was set at '2022-01-15 19:46:37.995951 UTC' (19:46), so the life of 3500 in the 19th hour is 14 minutes. At '2022-01-15 20:05:29.049375 UTC' the radius changed to 6500; until 20:05 it was still 3500, so in the 20th hour the life of 3500 is 5 minutes and the life of 6500 is 55 minutes.
I have tried the query below, but I am not getting accurate results.
Fields in my dim_date: (datetime, hour_of_day, date, weekday, week, month, quarter)


with  dim_date AS(
SELECT 
 *
FROM 
 `dim_date` 
WHERE 
 DATE BETWEEN '2022-01-01' AND '2022-12-31'
)
, delivery_radius_log_data AS (
 SELECT 
   Delivery_Area_ID,
   Delivery_Radius_Meters as current_Delivery_Radius_Meters,
   Event_Started_Timestamp,
   --extracting event started date and hour because i want to see how many times the events have happened in a particular hour of a day.
   EXTRACT(DATE FROM Event_Started_Timestamp) AS event_started_date,
   EXTRACT(HOUR FROM Event_Started_Timestamp) AS event_started_hour,
   EXTRACT(MINUTE FROM Event_Started_Timestamp) AS event_started_mins,
  -- getting the next Event_Started_Timestamp , this gives me the current event ended time. 
   LEAD(Event_Started_Timestamp) OVER (
     PARTITION BY Delivery_Area_ID ORDER BY Event_Started_Timestamp ASC
   ) AS event_ended_time,
   -- getting the previous delivery radius, taking it as a helper to check the correctness of execution. 
   LAG(Delivery_Radius_Meters) OVER (
     PARTITION BY Delivery_Area_ID ORDER BY Event_Started_Timestamp ASC
   ) AS prev_delivery_radius
 FROM 
   `radius_data` 
 WHERE 
   DATE(Event_Started_Timestamp) BETWEEN '2022-01-01' AND '2022-12-31'
   AND delivery_area_id ='1'
)
,temp_table as(
select
   dr.delivery_area_id AS delivery_area_id,
   dd.datetime,
   dd.date,
   dd.hour_of_day,
   LEAD(dd.hour_of_day) OVER (PARTITION BY dd.date ORDER BY dd.datetime ASC) AS next_hour_of_day,
   --dr.prev_delivery_radius,
   dr.current_delivery_radius_meters,
   dr.event_started_timestamp,
   event_started_date,
   event_started_hour,
   event_started_mins,
   dr.event_ended_time,
   EXTRACT(DATE FROM dr.event_ended_time) AS event_ended_date,
   EXTRACT(HOUR FROM dr.event_ended_time) AS event_ended_hour,
   EXTRACT(MINUTE FROM dr.event_ended_time) AS event_ended_mins,
   --finding the time difference between the event
   ROUND(TIMESTAMP_DIFF(dr.event_ended_time, dr.event_started_timestamp, second)/3600,2) AS event_duration_hours,
   CASE 
     WHEN 
       TIMESTAMP_DIFF(dr.event_ended_time, dr.event_started_timestamp, second)/3600 >= 24 THEN 'Default'
     ELSE 'Not Default'
   END AS is_default
FROM
  dim_date dd 
  LEFT JOIN delivery_radius_log_data dr 
   ON DATE(dd.date) = DATE(dr.event_started_date) 
  AND TRIM(CAST(dd.hour_of_day AS STRING))  = TRIM(CAST(dr.event_started_hour AS STRING)) 
ORDER BY
 2
)
SELECT
 *,
 CASE
   WHEN event_started_date = event_ended_date and event_started_hour = event_ended_hour
     THEN event_ended_mins - event_started_mins
   WHEN Event_started_date = event_ended_date and event_started_hour < event_ended_hour 
     THEN  60 - event_started_mins --TIMESTAMP_DIFF(event_ended_time, event_started_timestamp, MINUTE)
   WHEN Event_started_date = event_ended_date and event_started_hour < event_ended_hour 
     THEN  60 - event_started_mins
   WHEN event_started_date < event_ended_date
     THEN 60 - event_started_mins
   ELSE 60
 END AS radius_life_of_that_hour
FROM
 temp_table

I want the output table as described in the image, i.e. for every date I need hours 0-23 (at least 24 records per day), each with the lifespan of the radius during that hour.
Thank you :)

Answer 1

Score: 1

I think the first thing you want to do is build what is called a date spine. By choosing a date column, you expand the dataset to include a row for every unit of time you choose, in your case every hour.


I hate BigQuery, and couldn't figure out how to make this HOUR instead of DAY, but it's definitely possible by messing around w/ the generate_date_array to create your span of hours.


If you can get this to work, you'll be adding a row for every hour if a row does not exist, and then from there you can work on doing your date math to see the relative time from each hour that you want to calculate.
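To make the "date math" step concrete, here is a minimal sketch (not the answerer's code) of how the time a radius spends inside one hour could be computed: the overlap is the earlier of the two end times minus the later of the two start times. The CTE names `hour_bucket` and `events`, the aliases `started_at` and `ended_at`, and the output column `minutes_in_hour` are made up for illustration; the timestamps are copied from the question's sample rows.

```sql
-- Sketch: minutes of each event interval that fall inside a single hour.
-- Names here are hypothetical; literals come from the question's sample data.
WITH hour_bucket AS (
  SELECT TIMESTAMP '2022-01-15 20:00:00 UTC' AS hour_start
),
events AS (
  SELECT 3500 AS radius,
         TIMESTAMP '2022-01-15 19:46:37.995951 UTC' AS started_at,
         TIMESTAMP '2022-01-15 20:05:29.049375 UTC' AS ended_at
  UNION ALL
  SELECT 6500,
         TIMESTAMP '2022-01-15 20:05:29.049375 UTC',
         TIMESTAMP '2022-01-16 12:31:22.778229 UTC'
)
SELECT
  h.hour_start,
  e.radius,
  -- overlap = LEAST(end times) - GREATEST(start times), in whole seconds,
  -- then converted to minutes
  ROUND(
    TIMESTAMP_DIFF(
      LEAST(e.ended_at, TIMESTAMP_ADD(h.hour_start, INTERVAL 1 HOUR)),
      GREATEST(e.started_at, h.hour_start),
      SECOND
    ) / 60, 2
  ) AS minutes_in_hour
FROM hour_bucket AS h
JOIN events AS e
  -- keep only events that actually overlap the hour
  ON e.started_at < TIMESTAMP_ADD(h.hour_start, INTERVAL 1 HOUR)
 AND e.ended_at > h.hour_start
ORDER BY e.started_at;
```

For the 20:00 hour this should give roughly 5 minutes for 3500 and roughly 55 minutes for 6500, in line with the numbers in the question.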


with calendar as (
  select 
    date_day 
  from 
    unnest(
      generate_date_array(
        (
          SELECT MIN(prediction_point) 
          FROM your_table
        ), 
        (
          SELECT MAX(prediction_point) 
          FROM your_table
        )
      )
    ) as date_day
), 
spine as (
  select 
    distinct date_DAY as period 
  from 
    calendar
) 
select 
  cast(spine.period as timestamp) as prediction_point_SPINE_START, 
  timestamp_add(
    cast(
      date_add(spine.period, INTERVAL 1 DAY) as timestamp
    ), 
    INTERVAL -1 second
  ) as prediction_point_SPINE_END, 
  st.* 
from 
  spine 
  left outer join your_table st on cast(
    date_trunc(
      cast(st.prediction_point as date), 
      DAY
    ) as date
  ) = spine.period
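As a possible hourly variant of the daily spine above, the sketch below uses `GENERATE_TIMESTAMP_ARRAY`, which accepts an `INTERVAL 1 HOUR` step (`GENERATE_DATE_ARRAY` only steps in whole days or larger). It runs against the question's `delivery_data` table. The CTE names `event_hours` and `spine` and the output columns `hour_start`, `day`, and `radius_life_minutes` are assumptions for illustration, and the query has not been run against the asker's real data.

```sql
-- Sketch: one row per hour of every day, with the minutes each radius was
-- active in that hour. CTE and output column names are hypothetical.
WITH event_hours AS (
  -- Expand each event into the hour buckets it touches and measure the overlap.
  -- Rows whose event_ended_time is NULL produce no buckets and drop out.
  SELECT
    d.Delivery_Area_ID,
    d.current_Delivery_Radius_Meters,
    hour_start,
    ROUND(TIMESTAMP_DIFF(
      LEAST(d.event_ended_time, TIMESTAMP_ADD(hour_start, INTERVAL 1 HOUR)),
      GREATEST(d.Event_Started_Timestamp, hour_start),
      SECOND) / 60, 2) AS radius_life_minutes
  FROM delivery_data AS d,
       UNNEST(GENERATE_TIMESTAMP_ARRAY(
         TIMESTAMP_TRUNC(d.Event_Started_Timestamp, HOUR),
         TIMESTAMP_TRUNC(d.event_ended_time, HOUR),
         INTERVAL 1 HOUR)) AS hour_start
  -- skip the empty bucket created when an event ends exactly on the hour
  WHERE d.event_ended_time > hour_start
),
spine AS (
  -- Hourly spine from 00:00 of the first day to 23:00 of the last day (UTC).
  SELECT hour_start
  FROM UNNEST(GENERATE_TIMESTAMP_ARRAY(
    (SELECT TIMESTAMP_TRUNC(MIN(Event_Started_Timestamp), DAY) FROM delivery_data),
    (SELECT TIMESTAMP_ADD(TIMESTAMP_TRUNC(MAX(event_ended_time), DAY), INTERVAL 23 HOUR)
     FROM delivery_data),
    INTERVAL 1 HOUR)) AS hour_start
)
SELECT
  s.hour_start,
  DATE(s.hour_start) AS day,
  EXTRACT(HOUR FROM s.hour_start) AS hour_of_day,
  eh.Delivery_Area_ID,
  eh.current_Delivery_Radius_Meters,
  eh.radius_life_minutes
FROM spine AS s
LEFT JOIN event_hours AS eh
  ON eh.hour_start = s.hour_start
ORDER BY s.hour_start, eh.radius_life_minutes;
```

Hours in which a radius changed show up as two or more rows, and hours with no recorded event keep a single row with NULLs, so every day has at least 24 rows. The spine here is shared across all delivery areas, which is a simplification that suits the single-area sample; with several areas it would likely need to be cross joined with the list of area IDs.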

Sorry for not a complete answer, but hopefully a step in the right direction.
