pandas: keep only the rows with the last available date of each quarter
Question
I have a dataset that looks like this:
data = {'Date': ['2022-01-01', '2022-02-15', '2022-03-10', '2022-04-20', '2022-05-05', '2022-06-30', '2022-07-15', '2022-08-10', '2022-09-25', '2022-09-25'],
'Value': [10, 15, 20, 25, 30, 35, 40, 45, 50, 50]}
What I want is to keep only the rows that fall on the last available date of each quarter, so that the result would look like this:
        Date  Value
0 2022-03-10     20
1 2022-06-30     35
2 2022-09-25     50
3 2022-09-25     50
This is what I have tried so far:
import pandas as pd
# Create sample DataFrame
data = {'Date': ['2022-01-01', '2022-02-15', '2022-03-10', '2022-04-20', '2022-05-05', '2022-06-30', '2022-07-15', '2022-08-10', '2022-09-25', '2022-09-25'],
'Value': [10, 15, 20, 25, 30, 35, 40, 45, 50, 50]}
df = pd.DataFrame(data)
# Convert 'Date' column to datetime format
df['Date'] = pd.to_datetime(df['Date'])
# Create a new DataFrame with last dates of quarters
last_dates = df.resample('Q', on='Date').last().reset_index()
# Merge the original DataFrame with the last_dates DataFrame
df = pd.merge(df, last_dates, on='Date')
print(df)
But it keeps throwing this error:
ValueError: cannot insert Date, already exists
How can I resolve this issue?
Answer 1
Score: 0
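A possible solution: group the rows by quarter, take the last row of each group, and merge the result back onto the original DataFrame so that duplicated last dates are kept: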
df['Date'] = pd.to_datetime(df['Date'])
(df.groupby(df['Date'].dt.to_period('Q'))
.agg({'Date': 'last', 'Value': 'last'})
.reset_index(drop=True).merge(df))
Output:
        Date  Value
0 2022-03-10     20
1 2022-06-30     35
2 2022-09-25     50
3 2022-09-25     50
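The error message in the question suggests that the resampled result already contains a `Date` column in addition to the `Date` index, so `reset_index()` has nowhere to put the index. Note also that `'last'` picks the last row of each quarter in the order the rows appear, so the solution above relies on the data being sorted by date. As a rough sketch of an alternative that does not depend on row order, one could compare each row's date with the latest date in its quarter. The names `quarter`, `last_per_quarter`, and `result` are illustrative and not from the original post:

```python
import pandas as pd

# Sample data copied from the question
data = {'Date': ['2022-01-01', '2022-02-15', '2022-03-10', '2022-04-20',
                 '2022-05-05', '2022-06-30', '2022-07-15', '2022-08-10',
                 '2022-09-25', '2022-09-25'],
        'Value': [10, 15, 20, 25, 30, 35, 40, 45, 50, 50]}
df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])

# Quarter label for every row (e.g. 2022Q1)
quarter = df['Date'].dt.to_period('Q')

# Latest date observed within each row's quarter, aligned to the original index
last_per_quarter = df.groupby(quarter)['Date'].transform('max')

# Keep every row whose date equals that latest date (duplicates are preserved)
result = df[df['Date'] == last_per_quarter].reset_index(drop=True)
print(result)
```

Because this filters with a boolean mask instead of merging, it never calls `reset_index()` on a frame that already has a `Date` column, and rows sharing the last date of a quarter are all retained.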