Pandas shift index ignoring passed shift object

Question

Let's say I have a `df` like this (note that `pd.date_range` returns a `DatetimeIndex`, not a DataFrame):

    import pandas as pd
    df = pd.date_range('2023-04-01', '2023-05-01')
    frequency = df.shift(freq='W')
    print(frequency)

In the output I get, the frequency is `None`:

    DatetimeIndex(['2023-04-02', '2023-04-09', '2023-04-09', '2023-04-09',
                   '2023-04-09', '2023-04-09', '2023-04-09', '2023-04-09',
                   '2023-04-16', '2023-04-16', '2023-04-16', '2023-04-16',
                   '2023-04-16', '2023-04-16', '2023-04-16', '2023-04-23',
                   '2023-04-23', '2023-04-23', '2023-04-23', '2023-04-23',
                   '2023-04-23', '2023-04-23', '2023-04-30', '2023-04-30',
                   '2023-04-30', '2023-04-30', '2023-04-30', '2023-04-30',
                   '2023-04-30', '2023-05-07', '2023-05-07'],
                  dtype='datetime64[ns]', freq=None) <<<<<<<<--------Here------<<<<<

According to the [documentation][1], `W` stands for week.

Am I missing anything here? I was looking for a quick fix. Is there an alternative way to do it?

    Version: 1.4.2

[![enter image description here][2]][2]

[1]: https://pandas.pydata.org/pandas-docs/version/0.9.1/timeseries.html#offset-aliases
[2]: https://i.stack.imgur.com/2akOD.png



# Answer 1
**Score**: 1

I'm not sure what you are trying to achieve, but `shift` is returning the expected result.

According to the [documentation][1]:
```plaintext
>>> month_starts = pd.date_range('1/1/2011', periods=5, freq='MS')
>>> month_starts
DatetimeIndex(['2011-01-01', '2011-02-01', '2011-03-01', '2011-04-01',
               '2011-05-01'],
              dtype='datetime64[ns]', freq='MS')
>>> month_starts.shift(10, freq='D')
DatetimeIndex(['2011-01-11', '2011-02-11', '2011-03-11', '2011-04-11',
               '2011-05-11'],
              dtype='datetime64[ns]', freq=None)
```

That is the right outcome: every date moves forward by 10 days.

In your case, the original `df`:

    DatetimeIndex(['2023-04-01', '2023-04-02', '2023-04-03', '2023-04-04',
                   '2023-04-05', '2023-04-06', '2023-04-07', '2023-04-08',
                   '2023-04-09', '2023-04-10', '2023-04-11', '2023-04-12',
                   '2023-04-13', '2023-04-14', '2023-04-15', '2023-04-16',
                   '2023-04-17', '2023-04-18', '2023-04-19', '2023-04-20',
                   '2023-04-21', '2023-04-22', '2023-04-23', '2023-04-24',
                   '2023-04-25', '2023-04-26', '2023-04-27', '2023-04-28',
                   '2023-04-29', '2023-04-30', '2023-05-01'],
                  dtype='datetime64[ns]', freq='D')

has each date moved to the next weekly anchor, which for `W` is always a Sunday (a date that is already a Sunday moves to the following Sunday):

    DatetimeIndex(['2023-04-02', '2023-04-09', '2023-04-09', '2023-04-09',
                   '2023-04-09', '2023-04-09', '2023-04-09', '2023-04-09',
                   '2023-04-16', '2023-04-16', '2023-04-16', '2023-04-16',
                   '2023-04-16', '2023-04-16', '2023-04-16', '2023-04-23',
                   '2023-04-23', '2023-04-23', '2023-04-23', '2023-04-23',
                   '2023-04-23', '2023-04-23', '2023-04-30', '2023-04-30',
                   '2023-04-30', '2023-04-30', '2023-04-30', '2023-04-30',
                   '2023-04-30', '2023-05-07', '2023-05-07'],
                  dtype='datetime64[ns]', freq=None)
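
The roll-forward behaviour can be reproduced on single timestamps. The following is a minimal sketch, assuming `W` resolves to a Sunday-anchored `Week` offset as it does in the pandas versions I am aware of. The names `week`, `saturday`, `sunday`, and `plus_seven` are only illustrative. The last part shows one possible alternative if the goal was simply "seven days later" rather than "next Sunday".

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# "W" is shorthand for a week anchored on Sunday (weekday=6).
week = pd.offsets.Week(weekday=6)
print(to_offset("W") == week)

saturday = pd.Timestamp("2023-04-01")
sunday = pd.Timestamp("2023-04-02")

# A date that is not on the anchor rolls forward to the next Sunday.
print(saturday + week)  # 2023-04-02 00:00:00
# A date that is already a Sunday moves a full week ahead.
print(sunday + week)    # 2023-04-09 00:00:00

# Assumed alternative: shift every date by a fixed seven days instead,
# which cannot produce duplicates.
idx = pd.date_range("2023-04-01", "2023-05-01")
plus_seven = idx.shift(7, freq="D")
print(plus_seven[0], plus_seven[-1], plus_seven.is_unique)
```

Because several consecutive days share the same "next Sunday", the anchored shift collapses them onto one date, whereas the fixed seven-day shift keeps the dates distinct.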

pandas reports `freq=None` because the shifted dates are no longer evenly spaced (many of them are duplicates), so no single frequency describes them. You could drop the duplicates to get dates that are one week apart:

    print(frequency.drop_duplicates())
    DatetimeIndex(['2023-04-02', '2023-04-09', '2023-04-16', '2023-04-23',
                   '2023-04-30', '2023-05-07'],
                  dtype='datetime64[ns]', freq=None)

but pandas still does not set `freq` on the result automatically.
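
If a `freq` on the de-duplicated index is what you are after, it can be inferred explicitly. This is a sketch rather than the only way to do it: it relies on `pd.infer_freq` and on passing `freq="infer"` to the `DatetimeIndex` constructor, and the variable names (`shifted`, `unique_weeks`, `weekly`) are made up for the example.

```python
import pandas as pd

idx = pd.date_range("2023-04-01", "2023-05-01")
shifted = idx.shift(freq="W")

# Drop the repeated Sundays; the result still has freq=None.
unique_weeks = shifted.drop_duplicates()
print(unique_weeks.freq)

# infer_freq looks at the spacing of the values and returns an alias string.
print(pd.infer_freq(unique_weeks))

# freq="infer" asks the constructor to attach the inferred frequency.
weekly = pd.DatetimeIndex(unique_weeks, freq="infer")
print(weekly.freqstr)
```

With the six Sundays from the question, both calls should report a Sunday-anchored weekly frequency (`W-SUN`). Inference needs at least three regularly spaced dates, so it will not work on very short or irregular indexes.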


Posted by huangapple on 2023-04-13 18:54:21
Original link: https://go.coder-hub.com/76004579.html