How to include both ends of a pandas date_range()

Question

From a pair of dates, I would like to create a list of dates at monthly frequency, including the months of both dates indicated.

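import pandas as pd
from datetime import datetime  # the class, so that datetime(2022, 1, 13) works

# Option 1: datetime objects as endpoints
pd.date_range(datetime(2022, 1, 13), datetime(2022, 4, 5), freq='M', inclusive='both')
# Option 2: date strings as endpoints
pd.date_range("2022-01-13", "2022-04-05", freq='M', inclusive='both')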

Both return DatetimeIndex(['2022-01-31', '2022-02-28', '2022-03-31'], dtype='datetime64[ns]', freq='M'). However, I expect four dates, one for each month: [January, February, March, April].

If we now run:

pd.date_range("2022-01-13", "2022-04-05", freq='M', inclusive='right')

we still get the same result as before. It looks like inclusive has no effect on the outcome.

Pandas version: 1.5.3

Answer 1

Score: 2

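Using MonthEnd and Day offsets

This happens because freq='M' generates month-end dates, and your end date 2022-04-05 falls before April's month end (2022-04-30), so that date lies outside the requested range. inclusive only decides whether a generated date that coincides exactly with the start or end is kept; it never extends the range.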
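To see that inclusive only matters when an endpoint coincides with a generated date, here is a small sketch comparing the question's endpoints with endpoints that are themselves month ends. It passes pd.offsets.MonthEnd() as the frequency, which is assumed here to be equivalent to the 'M' alias used above (newer pandas releases spell that alias 'ME'); the variable names are purely illustrative.

```python
import pandas as pd

month_end = pd.offsets.MonthEnd()  # offset object behind the monthly alias
options = ["both", "neither", "left", "right"]

# Endpoints from the question: neither is a month end, so no generated
# date ever equals an endpoint and inclusive has nothing to remove.
off_grid = {
    opt: pd.date_range("2022-01-13", "2022-04-05", freq=month_end, inclusive=opt)
    for opt in options
}

# Endpoints that are month ends: now inclusive changes the result.
on_grid = {
    opt: pd.date_range("2022-01-31", "2022-04-30", freq=month_end, inclusive=opt)
    for opt in options
}

for opt in options:
    print(opt, len(off_grid[opt]), len(on_grid[opt]))
# both 3 4
# neither 3 2
# left 3 3
# right 3 3
```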

You can use:

pd.date_range("2022-01-13", pd.Timestamp("2022-04-05")+pd.offsets.MonthEnd(),
              freq='M', inclusive='both')

A more robust variant, which also handles the case where the input date is already a month end (adding MonthEnd() to such a date would otherwise roll it forward to the next month end):

pd.date_range("2022-01-13",
              pd.Timestamp("2022-04-05")-pd.offsets.Day()+pd.offsets.MonthEnd(),
              freq='M', inclusive='both')

Output:

DatetimeIndex(['2022-01-31', '2022-02-28', '2022-03-31', '2022-04-30'],
              dtype='datetime64[ns]', freq='M')
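As a rough sketch, the robust variant can be wrapped in a small function. The name month_ends_covering is hypothetical (it is not part of pandas), and pd.offsets.MonthEnd() is again used as the frequency in place of the 'M' alias.

```python
import pandas as pd

def month_ends_covering(start, end):
    """Hypothetical helper: one month-end date per month, from the
    month of `start` through the month of `end`."""
    end = pd.Timestamp(end)
    # Step back one day before rolling forward, so an `end` that is
    # already a month end is not pushed into the following month.
    last = end - pd.offsets.Day() + pd.offsets.MonthEnd()
    return pd.date_range(start, last, freq=pd.offsets.MonthEnd(), inclusive="both")

mid_month = month_ends_covering("2022-01-13", "2022-04-05")
at_month_end = month_ends_covering("2022-01-13", "2022-04-30")

print(mid_month.strftime("%Y-%m-%d").tolist())
# ['2022-01-31', '2022-02-28', '2022-03-31', '2022-04-30']
print(at_month_end.equals(mid_month))
# True
```

Both calls yield the same four dates, which is the point of subtracting a day first.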

Alternative: using Period

pd.date_range(pd.Period('2022-01-13', 'M').to_timestamp(),
              pd.Period('2022-04-30', 'M').to_timestamp(how='end'),
              freq='M', inclusive='both')

Intermediates:

pd.Period('2022-01-13', 'M').to_timestamp()
# Timestamp('2022-01-01 00:00:00')

pd.Period('2022-04-30', 'M').to_timestamp(how='end')
# Timestamp('2022-04-30 23:59:59.999999999')

Or as periods, with period_range:

pd.period_range('2022-01-13', '2022-04-30', freq='M')

Output:

PeriodIndex(['2022-01', '2022-02', '2022-03', '2022-04'], dtype='period[M]')
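If timestamps are needed rather than periods, the PeriodIndex can be converted back. The sketch below is one possible way, assuming midnight month-start or month-end dates are what is wanted; it uses the question's original end date to show that the day of the month does not matter for period_range.

```python
import pandas as pd

# Monthly periods covering both endpoints, whatever their day of month.
periods = pd.period_range("2022-01-13", "2022-04-05", freq="M")

# First day of each month.
month_starts = periods.to_timestamp(how="start")

# Last day of each month; how="end" gives the last nanosecond of the
# period, so normalize() is used to reset the time to midnight.
month_ends = periods.to_timestamp(how="end").normalize()

print(month_starts.strftime("%Y-%m-%d").tolist())
# ['2022-01-01', '2022-02-01', '2022-03-01', '2022-04-01']
print(month_ends.strftime("%Y-%m-%d").tolist())
# ['2022-01-31', '2022-02-28', '2022-03-31', '2022-04-30']
```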

Answer 2

Score: 0

This is due to how the monthly frequency is defined: the generated dates are month ends. With a daily frequency you would see the difference, but with a monthly frequency and endpoints that are not month ends, inclusive has no effect.

For inclusive, with a the start and b the end:

both: a <= x <= b (in math convention: [a, b])

neither: a < x < b (in math convention: ]a, b[)

right: a < x <= b (in math convention: ]a, b])

left: a <= x < b (in math convention: [a, b[)

You can't include dates beyond these limits.
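The four options are easy to check with a daily frequency, where both endpoints are always generated dates. A minimal sketch, with arbitrary example dates:

```python
import pandas as pd

# Four daily dates lie between the endpoints, endpoints included.
counts = {
    opt: len(pd.date_range("2022-01-01", "2022-01-04", freq="D", inclusive=opt))
    for opt in ("both", "neither", "left", "right")
}
print(counts)
# {'both': 4, 'neither': 2, 'left': 3, 'right': 3}
```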

huangapple
  • This article was published on 2023-03-07 22:05:08
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75663011.html