Create identical timestamp for each n elements
Question
I have a function which creates an artificial list of 100,000 timestamps going back in time. The frequency is 2 minutes.
```python
datelist = pd.date_range(end = pd.datetime.today(), periods=100000, freq='2min00S').tolist()
```
The result looks like:
```
[Timestamp('2018-12-03 19:48:35.874707', freq='2T'),
 Timestamp('2018-12-03 19:50:35.874707', freq='2T'),
 Timestamp('2018-12-03 19:52:35.874707', freq='2T'),
 Timestamp('2018-12-03 19:54:35.874707', freq='2T'),
 Timestamp('2018-12-03 19:56:35.874707', freq='2T'),
 Timestamp('2018-12-03 19:58:35.874707', freq='2T'),
 Timestamp('2018-12-03 20:00:35.874707', freq='2T'),
 Timestamp('2018-12-03 20:02:35.874707', freq='2T'),
 Timestamp('2018-12-03 20:04:35.874707', freq='2T'),
 Timestamp('2018-12-03 20:06:35.874707', freq='2T'),
 ...]
```
I would like every 50 consecutive elements to share an identical timestamp.
At the moment each of the 100,000 elements has a different timestamp. Any idea how to do that?
In other words: the timestamps still advance in 2-minute steps, but each timestamp is repeated for 50 consecutive elements.
This final list will be integrated as a new column into a pandas dataframe.
```data_pd['Timestamp'] = datelist```
# Answer 1
**Score**: 0
I believe you need to drop `tolist()` and index the `DatetimeIndex` with an array built by `numpy.arange` over the length of the DataFrame, integer-divided by `50`:
```python
import numpy as np
import pandas as pd

# pd.datetime was removed in pandas 2.0; pd.Timestamp.today() is the equivalent.
dates = pd.date_range(end=pd.Timestamp.today(), periods=100000, freq='2min')
# Rows 0-49 map to dates[0], rows 50-99 to dates[1], and so on.
data_pd['Timestamp'] = dates[np.arange(len(data_pd)) // 50]
```
Sample (each 5 values):
```python
dates = pd.date_range(end=pd.Timestamp.today(), periods=100000, freq='2min')
data_pd = pd.DataFrame({'a': range(10)})
data_pd['Timestamp'] = dates[np.arange(len(data_pd)) // 5]
print(data_pd)
```
```
   a                  Timestamp
0  0 2019-08-17 13:20:41.002125
1  1 2019-08-17 13:20:41.002125
2  2 2019-08-17 13:20:41.002125
3  3 2019-08-17 13:20:41.002125
4  4 2019-08-17 13:20:41.002125
5  5 2019-08-17 13:22:41.002125
6  6 2019-08-17 13:22:41.002125
7  7 2019-08-17 13:22:41.002125
8  8 2019-08-17 13:22:41.002125
9  9 2019-08-17 13:22:41.002125
```
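For anyone who wants to try the integer-division indexing in isolation, here is a minimal self-contained sketch. The helper name `block_timestamps` and the fixed `end` value are assumptions made for illustration and are not part of the answer. Unlike the answer, it generates only as many distinct timestamps as there are blocks, so the last block ends exactly at `end`.

```python
import numpy as np
import pandas as pd


def block_timestamps(n_rows, block_size, end, freq="2min"):
    """Hypothetical helper: return a DatetimeIndex of length n_rows in which
    every run of block_size consecutive entries shares one timestamp."""
    # One distinct timestamp per block; ceiling division covers a partial last block.
    n_blocks = -(-n_rows // block_size)
    dates = pd.date_range(end=end, periods=n_blocks, freq=freq)
    # Row i belongs to block i // block_size.
    return dates[np.arange(n_rows) // block_size]


# A fixed end time (an assumption) keeps the example reproducible.
data_pd = pd.DataFrame({"a": range(10)})
data_pd["Timestamp"] = block_timestamps(
    len(data_pd), 5, end=pd.Timestamp("2018-12-03 20:06:00")
)
print(data_pd)
```

With these inputs the first five rows get `2018-12-03 20:04:00` and the last five get `2018-12-03 20:06:00`. For the original problem the call would be `block_timestamps(len(data_pd), 50, end=pd.Timestamp.today())`.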
# Answer 2
**Score**: 0
```python
import pandas as pd

# pd.datetime was removed in pandas 2.0; pd.Timestamp.today() is the equivalent.
end_time = pd.Timestamp.today()
end_date = end_time.date()  # drops the time of day, including fractional seconds
datelist = pd.date_range(end=end_date, periods=100000, freq='2min').tolist()
```
Convert `end_time` to a date instead of using a time with fractional seconds. This always gives you timestamps that fall on whole minutes:
```
[Timestamp('2019-08-17 02:42:00', freq='2T'),
 Timestamp('2019-08-17 02:44:00', freq='2T'),
 Timestamp('2019-08-17 02:46:00', freq='2T'),
 Timestamp('2019-08-17 02:48:00', freq='2T'),
 Timestamp('2019-08-17 02:50:00', freq='2T'),
 Timestamp('2019-08-17 02:52:00', freq='2T'),
 Timestamp('2019-08-17 02:54:00', freq='2T'),
 Timestamp('2019-08-17 02:56:00', freq='2T'),
 Timestamp('2019-08-17 02:58:00', freq='2T'),
 Timestamp('2019-08-17 03:00:00', freq='2T'),
 Timestamp('2019-08-17 03:02:00', freq='2T'),
 Timestamp('2019-08-17 03:04:00', freq='2T'),
 Timestamp('2019-08-17 03:06:00', freq='2T'),
```
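Removing the fractional seconds gives clean timestamps, but on its own it does not repeat each one 50 times. As a rough sketch of how the two ideas might be combined, the snippet below floors a stand-in end time to the 2-minute grid with `Timestamp.floor` and then repeats each entry with `Index.repeat`. The fixed `end_time` value and the small sizes (3 timestamps, 2 repeats) are assumptions chosen to keep the example short.

```python
import pandas as pd

# Stand-in for pd.Timestamp.today(), fixed so the example is reproducible.
end_time = pd.Timestamp("2019-08-17 13:20:41.002125")
# Round down to the 2-minute grid, discarding seconds and microseconds.
end_clean = end_time.floor("2min")

# Three distinct timestamps, two minutes apart, ending at end_clean.
dates = pd.date_range(end=end_clean, periods=3, freq="2min")
# Repeat each timestamp twice in place (use 50 for the original problem).
repeated = dates.repeat(2)
print(repeated)
```

Here `repeated` holds six entries: `13:16:00` twice, `13:18:00` twice, and `13:20:00` twice. It can be assigned to a DataFrame column as long as its length matches the number of rows.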