Separating date and year, converting to datetime and creating a new column

Question


Convert the date column to a datetime type.
Create a new column, issue_year, and set it to the year from the date column.

I need to separate a date in the format 'Dec-11' into month and year, where Dec is the month and 11 is the year.

I need to also create a new column for the year.

All data needs to be converted to datetime type.

I have tried:

    loan_df.issue_d = pd.to_datetime(loan_df.issue_d, errors='coerce', format='%Y%m')

I get NaT values when I run loan_df.issue_d.head().

I am obviously missing something. Any suggestions?

    print(loan_df['issue_d'].head())
    0    Dec-11
    1    Dec-11
    2    Dec-11
    3    Dec-11
    4    Dec-11
    Name: issue_d, dtype: object

Answer 1

Score: 1


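The format in the question, '%Y%m', expects strings such as '201112', so 'Dec-11' fails to parse and errors='coerce' returns NaT. Use format='%b-%y' instead: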

df['new_issue_d'] = pd.to_datetime(df['issue_d'], format='%b-%y')
print(df)

Prints:

  issue_d new_issue_d
0  Dec-11  2011-12-01
1  Jan-11  2011-01-01
2  Feb-13  2013-02-01

  • %b (e.g. Sep) - Month as the locale's abbreviated name.
  • %y (e.g. 11) - Year without century as a zero-padded decimal number.
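
To see the difference between the two format strings side by side, here is a minimal sketch. The sample frame is made up for illustration and only mirrors the 'Mon-YY' strings from the question; it also assumes an English locale, since %b is locale-dependent.

```python
import pandas as pd

# Hypothetical sample data mirroring the 'Mon-YY' strings from the question.
loan_df = pd.DataFrame({"issue_d": ["Dec-11", "Jan-11", "Feb-13"]})

# '%Y%m' expects strings like '201112'. Nothing matches, and
# errors='coerce' silently turns every value into NaT.
wrong = pd.to_datetime(loan_df["issue_d"], errors="coerce", format="%Y%m")
print(wrong.isna().all())   # every value failed to parse

# '%b-%y' matches an abbreviated month name, a hyphen, and a two-digit year.
right = pd.to_datetime(loan_df["issue_d"], errors="coerce", format="%b-%y")
print(right.isna().any())   # no value failed to parse
print(right.dt.year.tolist())
```

Because errors='coerce' hides parse failures, counting NaT values after the conversion is a quick way to check that the format string actually fits the data.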

Answer 2

Score: 0


If you want to keep the issue_date strings as they are (the column is named issue_d in the question):

loan_df['issue_year'] = pd.to_datetime(loan_df['issue_date'], format='%b-%y').dt.year
print(loan_df)

  issue_date  issue_year
0     Dec-11        2011
1     Dec-11        2011
2     Dec-11        2011
3     Dec-11        2011

If you want to convert issue_date itself to datetime:

loan_df['issue_date'] = pd.to_datetime(loan_df['issue_date'], format='%b-%y')
loan_df['issue_year'] = loan_df['issue_date'].dt.year
print(loan_df)

  issue_date  issue_year
0 2011-12-01        2011
1 2011-12-01        2011
2 2011-12-01        2011
3 2011-12-01        2011
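
The question also asks for the month to be separated out and uses the column name issue_d. The sketch below combines both answers under those assumptions. The issue_month column, the bad_rows variable, and the 'n/a' sample value are illustrative additions, not part of the original post.

```python
import pandas as pd

# Hypothetical sample data; 'n/a' stands in for a value that does not parse.
loan_df = pd.DataFrame({"issue_d": ["Dec-11", "Jan-11", "Feb-13", "n/a"]})

# Parse once and reuse the result for every derived column.
parsed = pd.to_datetime(loan_df["issue_d"], format="%b-%y", errors="coerce")

# Collect the raw strings that did not match the format before overwriting them.
bad_rows = loan_df.loc[parsed.isna(), "issue_d"]

loan_df["issue_d"] = parsed
# Nullable Int64 keeps years and months as integers even when some rows are NaT.
loan_df["issue_year"] = parsed.dt.year.astype("Int64")
loan_df["issue_month"] = parsed.dt.month.astype("Int64")

print(loan_df)
print(bad_rows.tolist())
```

Parsing into a temporary Series first makes it possible to inspect the values that failed before the original strings are replaced.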

Posted by huangapple on 2023-03-12 06:37:17