Separating date and year, converting to datetime and creating a new column

Question

Convert the date column to a datetime type.
Create a new column, issue_year, and set it to the year from the date column.

I need to split dates in the format 'Dec-11' into month and year, where Dec is the month and 11 is the year.

I also need to create a new column for the year.

All of the data needs to be converted to a datetime type.

I have tried:

    loan_df.issue_d = pd.to_datetime(loan_df.issue_d, errors='coerce', format='%Y%m')

When I run loan_df.issue_d.head(), I get NaT instead of dates.

I am obviously missing something. Any suggestions?

    print(loan_df['issue_d'].head())
    0    Dec-11
    1    Dec-11
    2    Dec-11
    3    Dec-11
    4    Dec-11
    Name: issue_d, dtype: object

Answer 1

Score: 1

You can try format="%b-%y":

    df['new_issue_d'] = pd.to_datetime(df['issue_d'], format='%b-%y')
    print(df)

Prints:

      issue_d new_issue_d
    0  Dec-11  2011-12-01
    1  Jan-11  2011-01-01
    2  Feb-13  2013-02-01

  • %b (e.g. Sep) - Month as the locale’s abbreviated name.
  • %y (e.g. 11) - Year without century as a zero-padded decimal number.
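
To see why the question's attempt returned NaT while this format works, here is a minimal sketch. The sample Series is made up for illustration, and it assumes an English locale so that %b recognizes names like Dec. With errors='coerce', any string that does not match the given format is silently turned into NaT rather than raising an error, which is what happens with '%Y%m'.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's 'Mon-yy' strings.
s = pd.Series(["Dec-11", "Jan-11", "Feb-13"])

# '%Y%m' expects text such as '201112', so none of these strings match.
# Because errors='coerce' is set, each failed parse becomes NaT.
wrong = pd.to_datetime(s, errors="coerce", format="%Y%m")

# '%b-%y' describes an abbreviated month name, a hyphen, and a two-digit year.
right = pd.to_datetime(s, format="%b-%y")

print(wrong.isna().all())  # True: every value failed to parse
print(right.dt.strftime("%Y-%m-%d").tolist())
```

Dropping errors='coerce' while debugging is one way to surface the mismatch, since pandas then raises an error instead of hiding it behind NaT.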

Answer 2

Score: 0

If you want to keep the original issue_date format:

    loan_df['issue_year'] = pd.to_datetime(loan_df['issue_date'], format='%b-%y').dt.year
    print(loan_df)

      issue_date  issue_year
    0     Dec-11        2011
    1     Dec-11        2011
    2     Dec-11        2011
    3     Dec-11        2011

If you want to convert issue_date itself to datetime:

    loan_df['issue_date'] = pd.to_datetime(loan_df['issue_date'], format='%b-%y')
    loan_df['issue_year'] = loan_df['issue_date'].dt.year
    print(loan_df)

      issue_date  issue_year
    0 2011-12-01        2011
    1 2011-12-01        2011
    2 2011-12-01        2011
    3 2011-12-01        2011
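
If the real column might contain strings that do not follow the 'Mon-yy' pattern, the two steps above can be combined with a check for unparsed rows. The following is only a sketch under assumptions: the frame and its values are invented, the column is named issue_d as in the question, and issue_month is a hypothetical extra column added to show the month split the question mentions.

```python
import pandas as pd

# Hypothetical frame; 'issue_d' follows the question's column name.
loan_df = pd.DataFrame({"issue_d": ["Dec-11", "Jan-12", "not a date"]})

# Parse once; strings that do not match '%b-%y' become NaT instead of raising.
parsed = pd.to_datetime(loan_df["issue_d"], format="%b-%y", errors="coerce")

# Collect the raw strings that failed to parse before overwriting the column.
bad_rows = loan_df.loc[parsed.isna(), "issue_d"].tolist()

loan_df["issue_d"] = parsed
# With NaT present, .dt.year and .dt.month come back as floats;
# the nullable Int64 dtype keeps them as whole numbers with <NA> for gaps.
loan_df["issue_year"] = parsed.dt.year.astype("Int64")
loan_df["issue_month"] = parsed.dt.month.astype("Int64")  # hypothetical column

print(bad_rows)
print(loan_df.dtypes)
```

Inspecting bad_rows first makes it clear whether NaT values come from a wrong format string or from genuinely malformed entries.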

huangapple
  • Published on March 12, 2023, 06:37:17
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75710005.html