Use the groupby key to create another column in pandas (Python)

Question


df

order_date    month    year    Days    Data
2015-12-20    12       2014    1       3
2016-1-21     1        2014    2       3
2015-08-20    8        2015    1       1
2016-04-12    4        2016    4       1

and so on

Code (computing the min, mean, and median of the days column, and counting the order dates per month for each year):

df1 = (df.groupby(["year", "month"])
         .agg(Min_days=("days", "min"),
              Avg_days=("days", "mean"),
              Median_days=("days", "median"),
              Count=("order_date", "count"))
         .reset_index())

df1

year    month    Min_days    Avg_days       Median_days    Count
2015    1        9           12.56666666    10             4
2015    2        10          13.67678788    9              3
...
2016    12       12          15.7889990     19             2
and so on...

Issue at hand:

I want to add a Month Name column to the table, derived from the month key in df1. This is what I am doing:

import calendar
df1['Month Name'] = df1['month'].apply(lambda x: calendar.month_abbr[x])

Output I want:

year    month    Min_days    Avg_days       Median_days    Count    Month Name
2015    1        9           12.56666666    10             4        Jan
2015    2        10          13.67678788    9              3        Feb
...
2016    12       12          15.7889990     19             2        Dec
and so on...

But I get KeyError: 'month', so I cannot use the month key to create the Month Name column. Please help.

Answer 1

Score: 1


It seems there is no month column; the likely reason is that month is a level of a MultiIndex.

Check it:

print(df1.index.names)
print(df1.columns.tolist())

If so, you need:

import calendar

df1 = df1.reset_index()
df1['Month Name'] = df1['month'].apply(lambda x: calendar.month_abbr[x])
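
To make the diagnosis above concrete, here is a minimal sketch of how the KeyError can arise and how reset_index() removes it. The sample data and the names grouped and flat are made up for illustration; they are not the asker's actual frame, and the month abbreviations assume the default (English/C) locale.

```python
import calendar
import pandas as pd

# Hypothetical sample data, loosely modelled on the question's table.
df = pd.DataFrame({
    "order_date": ["2015-01-20", "2015-01-21", "2015-02-20", "2016-12-12"],
    "year": [2015, 2015, 2015, 2016],
    "month": [1, 1, 2, 12],
    "days": [1, 3, 2, 4],
})

# Without reset_index(), the group keys live in the index, not in the columns.
grouped = df.groupby(["year", "month"]).agg(
    Min_days=("days", "min"),
    Count=("order_date", "count"),
)
print(list(grouped.index.names))   # ['year', 'month']
print(grouped.columns.tolist())    # ['Min_days', 'Count']

try:
    grouped["month"]               # 'month' is an index level here
except KeyError as exc:
    print("KeyError:", exc)

# reset_index() turns the index levels back into ordinary columns.
flat = grouped.reset_index()
flat["Month Name"] = flat["month"].apply(lambda m: calendar.month_abbr[m])
print(flat)
```

If the two print calls show month among the index names rather than the columns, the frame in question was probably built without (or before) the reset_index() step.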

Answer 2

Score: 1


You can try map with get_level_values to map the month level of the MultiIndex (with calendar, numpy as np, and pandas as pd imported):

s = df1.index.get_level_values(1).map({i: e for i, e in enumerate([*calendar.month_abbr])})
df1 = df1.assign(Month=pd.Series(s, index=df1.index))

Or, even simpler and without apply, if the DataFrame does not have a MultiIndex and has already been reset, just use:

df1 = df1.assign(Month=np.array(calendar.month_abbr)[df1['month']])
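
As a rough side-by-side of the two variants in this answer, the sketch below runs both on a small made-up frame. The frame agg and the names month_names, by_level, and by_array are hypothetical; the level is selected by name ("month") rather than by position, which is a small deviation from the answer's get_level_values(1). The abbreviations assume the default locale.

```python
import calendar
import numpy as np
import pandas as pd

# Hypothetical aggregated frame with (year, month) still in the index.
idx = pd.MultiIndex.from_tuples(
    [(2015, 1), (2015, 2), (2016, 12)], names=["year", "month"]
)
agg = pd.DataFrame({"Count": [4, 3, 2]}, index=idx)

# Lookup table: month number -> abbreviation (index 0 is an empty string).
month_names = dict(enumerate(calendar.month_abbr))

# Variant 1: map the 'month' index level directly, keeping the MultiIndex.
by_level = agg.assign(
    Month=agg.index.get_level_values("month").map(month_names)
)

# Variant 2: reset the index, then use the month numbers as positions
# into a NumPy array of abbreviations.
flat = agg.reset_index()
abbr = np.array(list(calendar.month_abbr))
by_array = flat.assign(Month=abbr[flat["month"].to_numpy()])

print(by_level)
print(by_array)
```

Variant 1 is handy when the grouped index should be kept; variant 2 avoids a Python-level function call per row, which may matter on larger frames.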

huangapple
  • Posted on January 6, 2020, 17:12:51
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59609355.html