Separating date and year, converting to datetime and creating a new column
Question
Convert the date column to a datetime type.
Create a new column, issue_year, and set it to the year from the date column.
I need to separate a date in the format 'Dec-11' into month and year, where Dec is the month and 11 is the year.
I also need to create a new column for the year.
All data needs to be converted to a datetime type.
I have tried:
loan_df.issue_d = pd.to_datetime(loan_df.issue_d, errors='coerce', format='%Y%m')
Every value comes back as NaT when I run loan_df.issue_d.head().
I am obviously missing something. Any suggestions? Before the conversion, the column looks like this:
print(loan_df['issue_d'].head())
0 Dec-11
1 Dec-11
2 Dec-11
3 Dec-11
4 Dec-11
Name: issue_d, dtype: object
Answer 1
Score: 1
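The format string '%Y%m' expects values such as '201112', so it does not match 'Dec-11', and errors='coerce' turns every row into NaT. You can try format="%b-%y" instead: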
df['new_issue_d'] = pd.to_datetime(df['issue_d'], format='%b-%y')
print(df)
Prints:
issue_d new_issue_d
0 Dec-11 2011-12-01
1 Jan-11 2011-01-01
2 Feb-13 2013-02-01
- %b (e.g. Sep) - Month as the locale's abbreviated name.
- %y (e.g. 11) - Year without century as a zero-padded decimal number.
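The effect of the two format strings can be checked side by side. The following is a minimal sketch, assuming a small hand-made loan_df with the question's issue_d column and an English (or C) locale so that %b recognises names like Dec. The sample rows and the variable names wrong and right are illustrative only.

```python
import pandas as pd

# Hypothetical sample data shaped like the question's column
loan_df = pd.DataFrame({"issue_d": ["Dec-11", "Jan-11", "Feb-13"]})

# '%Y%m' does not match 'Dec-11', so with errors='coerce' every row becomes NaT
wrong = pd.to_datetime(loan_df["issue_d"], errors="coerce", format="%Y%m")

# '%b-%y' matches an abbreviated month name, a hyphen, and a two-digit year
right = pd.to_datetime(loan_df["issue_d"], errors="coerce", format="%b-%y")

print(wrong.isna().all())
print(right.dt.year.tolist())
```

Because errors='coerce' hides parsing failures, counting the NaT values after the conversion is a quick way to confirm that the format string matches the data.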
Answer 2
Score: 0
If you want to keep the original string format of the column (called issue_date here, issue_d in the question):
loan_df['issue_year'] = pd.to_datetime(loan_df['issue_date'], format='%b-%y').dt.year
print(loan_df)
issue_date issue_year
0 Dec-11 2011
1 Dec-11 2011
2 Dec-11 2011
3 Dec-11 2011
If you want to convert issue_date itself to datetime:
loan_df['issue_date'] = pd.to_datetime(loan_df['issue_date'], format='%b-%y')
loan_df['issue_year'] = loan_df['issue_date'].dt.year
print(loan_df)
issue_date issue_year
0 2011-12-01 2011
1 2011-12-01 2011
2 2011-12-01 2011
3 2011-12-01 2011
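The question also asks for the month and year as separate values. Here is a hedged sketch of one way to do that with the .dt accessor, assuming the question's column name issue_d. The issue_month column, the n_failed variable, and the deliberately malformed sample row are illustrative additions, not part of the original data.

```python
import pandas as pd

# Hypothetical sample data; the last row is deliberately malformed
loan_df = pd.DataFrame({"issue_d": ["Dec-11", "Dec-11", "not a date"]})

# Parse once, then derive both parts from the parsed datetimes
parsed = pd.to_datetime(loan_df["issue_d"], format="%b-%y", errors="coerce")
loan_df["issue_month"] = parsed.dt.month
loan_df["issue_year"] = parsed.dt.year

# Rows that could not be parsed are NaT; count them instead of ignoring them
n_failed = int(parsed.isna().sum())
print(loan_df)
print(n_failed)
```

Rows that fail to parse end up as missing values in both derived columns, so a non-zero n_failed signals that some entries do not follow the 'Dec-11' pattern.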