Why does pandas `date_range` round up to the next month?
Question
When using pandas.date_range with a start date, a frequency, and a number of periods, the range rounds up to the next month when the start date is the last day of a month.
It seems like a silent edge case bug. If it's not a bug, any idea why it does that?
For example
import pandas as pd
start_date = pd.Timestamp(2023, 5, 31)
date_range = pd.date_range(start=start_date, freq="MS", periods=6)
results in
DatetimeIndex(['2023-06-01', '2023-07-01', '2023-08-01', '2023-09-01',
'2023-10-01', '2023-11-01'],
dtype='datetime64[ns]', freq='MS')
From the documentation, I'd expect it to start in May and end in October:
DatetimeIndex(['2023-05-01', '2023-06-01', '2023-07-01', '2023-08-01', '2023-09-01',
'2023-10-01'],
dtype='datetime64[ns]', freq='MS')
I thought it had to do with the inclusive argument, but that's not the reason either.
Answer 1
Score: 0
pd.date_range generates a range of dates between start and end. 2023-05-01 is earlier than the start date 2023-05-31, so it can never be part of the range. To get what you want, replace the day of the pd.Timestamp with 1:
import pandas as pd
start_date = pd.Timestamp(2023, 5, 31)
date_range = pd.date_range(start=start_date.replace(day=1), freq="MS", periods=6)
print(date_range)
DatetimeIndex(['2023-05-01', '2023-06-01', '2023-07-01', '2023-08-01',
'2023-09-01', '2023-10-01'],
dtype='datetime64[ns]', freq='MS')
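As a variation on the same idea, here is a minimal sketch that snaps the start date back to the first day of its month with the MonthBegin offset instead of replace(day=1). It assumes the goal is "any date within a month maps to that month's first day"; the variable name month_start is only illustrative.

```python
import pandas as pd

start_date = pd.Timestamp(2023, 5, 31)

# Roll back to the nearest month start on or before start_date.
# A date that already falls on the 1st is returned unchanged.
month_start = pd.offsets.MonthBegin().rollback(start_date)

date_range = pd.date_range(start=month_start, freq="MS", periods=6)
print(date_range)
```

Either approach should give the same range here; rollback mainly makes the intent ("go to the start of this month") explicit.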
Answer 2
Score: 0
The documentation reads
> "such that they all satisfy start <[=] x <[=] end"
Therefore, as the date provided is pd.Timestamp(2023, 5, 31), the first "MS" (start-of-month) date that satisfies start <= x is the first day of the following month.
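The rule can be checked directly. The following is a small sketch, not a description of how pandas implements date_range internally: it uses the MonthBegin offset's rollforward to find the first month start on or after the start date, then compares it with what date_range returns. The names offset and first_valid are illustrative.

```python
import pandas as pd

start_date = pd.Timestamp(2023, 5, 31)
offset = pd.offsets.MonthBegin()

# First month-start date x that satisfies start_date <= x.
first_valid = offset.rollforward(start_date)

date_range = pd.date_range(start=start_date, freq="MS", periods=6)

print(first_valid)
# The range begins at that date, and no element precedes start_date.
print(date_range[0] == first_valid)
print((date_range >= start_date).all())
```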