Convert date in pandas series

Question

I need help: my pandas dataset has a date column whose values are in the format "September 25, 2021". I can't convert them to "yyyy-mm-dd" format (2021-09-25). I need this in order to import the data into MySQL later (I work through DBeaver).
This might be a stupid question, but I'm a newbie.

I tried the to_datetime function, and also the following approach, which raises IndexError: list index out of range:

    date = {'January': '01', 'February': '02', 'March': '03', 'April': '04', 'May': '05', 'June': '06',
            'July': '07', 'August': '08', 'September': '09', 'October': '10', 'November': '11', 'December': '12'}
    new_date = []
    for d in df['date_added']:
        month = d.split(' ')[0]
        day = d.split(' ')[1]
        year = d.split(', ')[2]
        res = year.split('-') + date[month].split('-') + day
        new_date.append(res)
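
For readers wondering where the IndexError comes from, here is a minimal sketch, assuming every value looks exactly like "September 25, 2021". Splitting on ', ' yields only two pieces, so index 2 does not exist. Even with that fixed, the line building res would fail, because it adds a string to a list. The names month_numbers and iso are made up for this illustration and are not part of the original post.

```python
d = "September 25, 2021"

space_parts = d.split(' ')   # three pieces: month, day with a trailing comma, year
comma_parts = d.split(', ')  # only two pieces: 'September 25' and '2021'

# Index 2 is out of range for a two-element list, hence the IndexError.
try:
    year = comma_parts[2]
    raised = False
except IndexError:
    raised = True

# A split-only version that works: drop the comma, then split on whitespace.
month_numbers = {'January': '01', 'February': '02', 'March': '03', 'April': '04',
                 'May': '05', 'June': '06', 'July': '07', 'August': '08',
                 'September': '09', 'October': '10', 'November': '11', 'December': '12'}
month, day, year = d.replace(',', '').split()
iso = f"{year}-{month_numbers[month]}-{int(day):02d}"  # zero-pad single-digit days
print(iso)
```

This only shows why the loop breaks; for a whole column, pd.to_datetime is usually the simpler route.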

Answer 1

Score: 0

I think you can parse the string with pd.to_datetime and then format it with strftime. Here is an example:

    import pandas as pd

    date_string = "September 25, 2021"
    date = pd.to_datetime(date_string)
    formatted_date = date.strftime('%Y-%m-%d')
    print(formatted_date)  # 2021-09-25

Answer 2

Score: 0

    import pandas as pd

    df = pd.DataFrame({'date_added': ['September 25, 2010', 'April 1, 2023']})
    print(df)
    print()

    # If you actually need datetime objects
    df['date_added'] = pd.to_datetime(df['date_added'])  # dtype = datetime64[ns]
    print(df)
    print()

    # Reset the dataframe
    df = pd.DataFrame({'date_added': ['September 25, 2010', 'April 1, 2023']})

    # If you need a formatted string
    df['date_added'] = pd.to_datetime(df['date_added']).dt.strftime('%Y-%m-%d')  # dtype = object
    print(df)

Output:

               date_added
    0  September 25, 2010
    1       April 1, 2023

      date_added
    0 2010-09-25
    1 2023-04-01

       date_added
    0  2010-09-25
    1  2023-04-01
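
Since the question mentions importing into MySQL through DBeaver, here is a rough sketch of one possible route, assuming the import goes through a CSV file (the original post does not say how the data will be loaded). It keeps the column as datetime and lets to_csv write the yyyy-mm-dd text through its date_format parameter. The in-memory buffer stands in for a real file path.

```python
import io

import pandas as pd

df = pd.DataFrame({'date_added': ['September 25, 2010', 'April 1, 2023']})
df['date_added'] = pd.to_datetime(df['date_added'], format='%B %d, %Y')

# Write to an in-memory buffer here; a real export would pass a file path instead.
buffer = io.StringIO()
df.to_csv(buffer, index=False, date_format='%Y-%m-%d')

csv_text = buffer.getvalue()
print(csv_text)
```

Keeping the datetime dtype in pandas and formatting only on export means the column can still be sorted and filtered as dates before it is written out.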

Answer 3

Score: 0

    import pandas as pd

    df = pd.DataFrame({'date_added': ['September 25, 2010', 'April 1, 2023']})
    r = pd.to_datetime(df['date_added'], format='%B %d, %Y')
    print(r)

Result

    0   2010-09-25
    1   2023-04-01
    Name: date_added, dtype: datetime64[ns]
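
An explicit format is strict, so real data may need a little cleanup first. The sketch below assumes the raw column can contain stray leading or trailing spaces, missing values, or unparseable text; the sample values are invented for illustration. It strips whitespace, then uses errors='coerce' so that bad rows become NaT instead of raising.

```python
import pandas as pd

# Hypothetical messy input: padded strings, a missing value, and junk text.
raw = pd.Series([' September 25, 2021', 'April 1, 2023 ', None, 'not a date'],
                name='date_added')

cleaned = raw.str.strip()  # remove surrounding whitespace; missing values stay missing
parsed = pd.to_datetime(cleaned, format='%B %d, %Y', errors='coerce')  # bad rows -> NaT
formatted = parsed.dt.strftime('%Y-%m-%d')  # NaT rows become missing values

print(formatted)
```

Checking parsed.isna() afterwards shows which rows failed to parse, which is worth reviewing before loading the data into a database.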

huangapple
  • Published on April 4, 2023, 03:10:35
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75922986.html