Error 'left side of interval must be <= right side' in pandas IntervalIndex


Question


I have a dataframe that looks like this:

Volume code  from_date            to_date              Volume_1  Volume_2  Volume_3
0            2022-01-01 00:00:00  2022-01-01 00:59:59  8         4         7
1            2022-01-01 01:00:00  2022-01-01 01:59:59  0         6         5
1            2022-01-01 02:00:00  2022-01-01 02:59:59  10        9         14
2            2022-01-01 03:00:00  2022-01-01 03:59:59  0         11        3
3            2022-01-01 04:00:00  2022-01-01 04:59:59  13        2         1

This dataframe is a pivoted table that counts the occurrences of each Volume code (Volume_x) in different time intervals. We have another, ancillary table that contains datetimes, and for each of those datetimes I need to know the number of occurrences one hour earlier.

For example, for the datetime 2022-01-01 03:16:43 in Volume_2, we would subtract one hour, giving 02:16:43, and look that up in the main dataframe, which would give us 9 occurrences for that time frame. I did the following:

s = pd.IntervalIndex.from_arrays(df['from_date'] - pd.Timedelta(1, 'hour'),
                                 df['to_date'] - pd.Timedelta(1, 'hour'))

And it worked for a sample, but now I don't know why it raises the following error:

ValueError: left side of interval must be <= right side

As you can see in the table, from_date is always <= to_date, so I cannot understand why it is failing. Any ideas?
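
To make the lookup described above concrete, here is a minimal sketch on the sample rows from the table. It is only an illustration under a few assumptions: only Volume_2 is kept, the intervals are built with closed="both" so that both endpoints count as inside, and the one-hour shift is applied to the query timestamp rather than to the interval bounds. The names `intervals`, `query`, `pos` and `count` are made up for this example.

```python
import pandas as pd

# Sample rows copied from the table above (only Volume_2 is kept for brevity).
df = pd.DataFrame({
    "from_date": pd.to_datetime([
        "2022-01-01 00:00:00", "2022-01-01 01:00:00", "2022-01-01 02:00:00",
        "2022-01-01 03:00:00", "2022-01-01 04:00:00",
    ]),
    "to_date": pd.to_datetime([
        "2022-01-01 00:59:59", "2022-01-01 01:59:59", "2022-01-01 02:59:59",
        "2022-01-01 03:59:59", "2022-01-01 04:59:59",
    ]),
    "Volume_2": [4, 6, 9, 11, 2],
})

# closed="both" is an assumption: from_date and to_date both belong to the interval.
intervals = pd.IntervalIndex.from_arrays(df["from_date"], df["to_date"], closed="both")

# Hypothetical timestamp from the ancillary table, moved back one hour.
query = pd.Timestamp("2022-01-01 03:16:43") - pd.Timedelta(1, "hour")

# get_indexer returns the position of the containing interval, or -1 if there is none.
pos = intervals.get_indexer([query])[0]
count = df["Volume_2"].iloc[pos]
print(count)  # 9
```

Note that get_indexer only works when the intervals do not overlap, which holds for the hourly rows shown here.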

Answer 1

Score: 1


You can use boolean indexing to find the rows in your real data where the left bound is greater than the right bound:

s1 = df['from_date'] - pd.Timedelta(1, 'hour')
s2 = df['to_date'] - pd.Timedelta(1, 'hour')

print(df[s1.gt(s2)])

Because both columns are shifted by the same amount, this is equivalent to:

print(df[df['from_date'].gt(df['to_date'])])
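
As a rough illustration of the check, the sketch below uses made-up data in which the second row is deliberately inverted (from_date later than to_date). It reproduces the ValueError, locates the offending row with the same comparison, and then builds the IntervalIndex from the remaining rows. The names `bad`, `clean` and `intervals` are hypothetical, and dropping the bad rows is only one option; whether to drop, swap or correct them depends on how they got into the data.

```python
import pandas as pd

# Made-up data: the second row is deliberately inverted (from_date > to_date).
df = pd.DataFrame({
    "from_date": pd.to_datetime(["2022-01-01 00:00:00", "2022-01-01 02:00:00"]),
    "to_date": pd.to_datetime(["2022-01-01 00:59:59", "2022-01-01 01:59:59"]),
})

# A single inverted row is enough to make from_arrays fail for the whole column.
try:
    pd.IntervalIndex.from_arrays(df["from_date"], df["to_date"])
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# Boolean mask of the rows that break the left <= right requirement.
bad = df["from_date"].gt(df["to_date"])
print(df[bad])

# One possible follow-up (an assumption, not the only fix): keep only the valid rows.
clean = df[~bad]
intervals = pd.IntervalIndex.from_arrays(clean["from_date"], clean["to_date"], closed="both")
```

If the mask selects no rows in the real data, the dtypes of from_date and to_date are worth checking next, since the comparison only behaves as expected on proper datetime columns.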

huangapple
  • Published on 2023-04-13 17:27:33
  • Source: https://go.coder-hub.com/76003832.html