Error 'left side of interval must be <= right side' in pandas IntervalIndex
Question
I have a dataframe that looks like this:
Volume code | from_date | to_date | Volume_1 | Volume_2 | Volume_3 |
---|---|---|---|---|---|
0 | 2022-01-01 00:00:00 | 2022-01-01 00:59:59 | 8 | 4 | 7 |
1 | 2022-01-01 01:00:00 | 2022-01-01 01:59:59 | 0 | 6 | 5 |
1 | 2022-01-01 02:00:00 | 2022-01-01 02:59:59 | 10 | 9 | 14 |
2 | 2022-01-01 03:00:00 | 2022-01-01 03:59:59 | 0 | 11 | 3 |
3 | 2022-01-01 04:00:00 | 2022-01-01 04:59:59 | 13 | 2 | 1 |
This dataframe is a pivoted table, which counts the occurrences of each Volume code (Volume_x) in different time intervals. We have another table with datetimes, and for each datetime in that ancillary table I need to know the number of occurrences in the preceding hour.
For example, for the datetime 2022-01-01 03:16:43 in Volume_2, we would subtract one hour, giving 02:16:43, and look that up in the main dataframe, which gives 9 occurrences for that time frame. I did the following:
s = pd.IntervalIndex.from_arrays(df['from_date'] - pd.Timedelta(1, 'hour'),
                                 df['to_date'] - pd.Timedelta(1, 'hour'))
And it worked for a sample, but now I don't know why it raises the following error:
ValueError: left side of interval must be <= right side
As you can see in the table, `from_date` is always <= `to_date`, so I cannot understand why it is failing. Any ideas?
Answer 1
Score: 1
You can use boolean indexing to check the real data for rows where the left bound is greater than the right bound:
s1 = df['from_date'] - pd.Timedelta(1, 'hour')
s2 = df['to_date'] - pd.Timedelta(1, 'hour')
print(df[s1.gt(s2)])
This works the same as:
print(df[df['from_date'].gt(df['to_date'])])
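For reference, here is a minimal sketch of that check on made-up data. The sample rows, the `offset`, `bad`, `valid`, and `error_message` names, and the choice to simply drop the offending rows are all assumptions for illustration. In the real data you may prefer to correct those rows rather than discard them.

```python
import pandas as pd

# Hypothetical sample: the second row has from_date later than to_date.
df = pd.DataFrame({
    "from_date": pd.to_datetime([
        "2022-01-01 00:00:00",
        "2022-01-01 02:00:00",
        "2022-01-01 02:00:00",
    ]),
    "to_date": pd.to_datetime([
        "2022-01-01 00:59:59",
        "2022-01-01 01:59:59",
        "2022-01-01 02:59:59",
    ]),
    "Volume_2": [4, 6, 9],
})

offset = pd.Timedelta(1, "hour")

# Rows whose left bound is greater than the right bound.
bad = df["from_date"].gt(df["to_date"])
print(df[bad])

# Building the index from all rows fails because of the offending row.
error_message = None
try:
    pd.IntervalIndex.from_arrays(df["from_date"] - offset,
                                 df["to_date"] - offset)
except ValueError as err:
    error_message = str(err)
    print(error_message)

# Assumption: dropping the offending rows is acceptable for this sketch.
valid = df[~bad]
s = pd.IntervalIndex.from_arrays(valid["from_date"] - offset,
                                 valid["to_date"] - offset)
print(s)
```

Subtracting the same timedelta from both columns does not change which one is larger, so any row flagged here is already inconsistent in the original `from_date` and `to_date` values.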