2023-08-04 01:31:12

Pairs of initial and final elements in a column of datetime pandas

Question

I have a time column like this:

Time
7:00:00
7:15:00
7:30:00
8:00:00
8:15:00
8:30:00
8:45:00

I need to get the first and last element of each run of consecutive times that are exactly 15 minutes apart. That is, the first run would be {"start_time": 7:00:00, "end_time":7:30:00} and the second would be {"start_time":8:00:00, "end_time":8:45:00}, so I need to return a list with those two dictionaries:

[{"start_time": 7:00:00, "end_time":7:30:00}, {"start_time":8:00:00, "end_time":8:45:00}]

Another example:

Time
7:00:00
7:15:00
7:30:00

Returns: [{"start_time": 7:00:00, "end_time":7:30:00}]

The last one:

Time
7:00:00
7:15:00
7:30:00
8:00:00
9:00:00
10:00:00
10:15:00
10:30:00
11:00:00

Returns: [{"start_time": 7:00:00, "end_time":7:30:00}, {"start_time":10:00:00, "end_time":10:30:00}]

Answer 1

Score: 2

Convert the column to timedelta, then take the difference between each row and the previous one as a number of seconds and compare it with 15 minutes (900 seconds) to flag where a run breaks. Then group the dataframe by contiguous blocks (i.e. mask.cumsum()) and aggregate Time with first and last:

import pandas as pd
# True wherever the gap from the previous row is not exactly 15 minutes (900 s)
mask = pd.to_timedelta(df['Time']).diff().dt.total_seconds() != 900
df.groupby(mask.cumsum())['Time'].agg(['first', 'last']).to_dict(orient='records')

[{'first': '7:00:00', 'last': '7:30:00'},
 {'first': '8:00:00', 'last': '8:45:00'}]
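
Note that the grouping above names the keys first and last, and on the question's last example it would also return one-element runs such as 8:00:00 and 9:00:00. A minimal sketch of one way to match the requested output follows, assuming the column holds strings like "7:00:00". The function name fifteen_minute_runs and the step_seconds parameter are made up for illustration and are not part of the original answer.

```python
import pandas as pd


def fifteen_minute_runs(times, step_seconds=900):
    """Return start/end pairs for runs of times spaced exactly step_seconds apart.

    Hypothetical helper: the name and the step_seconds parameter are
    assumptions for this sketch, not part of the original answer.
    """
    df = pd.DataFrame({"Time": list(times)})
    # Gap to the previous row in seconds; the first row has no previous row,
    # so its gap is NaN, which also compares unequal and starts the first run.
    gaps = pd.to_timedelta(df["Time"]).diff().dt.total_seconds()
    new_run = gaps != step_seconds
    # cumsum() over the break flags gives each run its own integer label.
    runs = df.groupby(new_run.cumsum())["Time"].agg(["first", "last", "count"])
    # Drop runs that contain a single time, as in the question's last example.
    runs = runs[runs["count"] > 1]
    runs = runs.rename(columns={"first": "start_time", "last": "end_time"})
    return runs[["start_time", "end_time"]].to_dict(orient="records")


example = ["7:00:00", "7:15:00", "7:30:00", "8:00:00", "9:00:00",
           "10:00:00", "10:15:00", "10:30:00", "11:00:00"]
print(fifteen_minute_runs(example))
```

Filtering on the per-run count is only one option; comparing first and last for equality would drop single-element runs as well.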

Answer 2

Score: 1

Something like this works:

import pandas as pd

times = [
    "7:00:00",
    "7:15:00",
    "7:30:00",
    "8:00:00",
    "9:00:00",
    "10:00:00",
    "10:15:00",
    "10:30:00",
    "11:00:00",
]
times = [pd.Timedelta(t) for t in times]
df = pd.DataFrame(times, columns=['Times'])
fifteen = pd.Timedelta(minutes=15)

# curr is the start of the current run, prev is the last time seen in it
curr = prev = None
for t in df['Times']:
    if prev is not None and t - prev == fifteen:
        # still inside the same run: extend it
        prev = t
        continue
    # the run ended: report it unless it held a single time
    if prev is not None and curr != prev:
        print({'start': curr, 'end': prev})
    curr = prev = t

# report the final run, which the loop itself never flushes
if prev is not None and curr != prev:
    print({'start': curr, 'end': prev})

Output:

{'start': Timedelta('0 days 07:00:00'), 'end': Timedelta('0 days 07:30:00')}
{'start': Timedelta('0 days 10:00:00'), 'end': Timedelta('0 days 10:30:00')}
