如何在pandas中对下面显示的图像进行多列分组?

huangapple go评论101阅读模式
英文:

How to use groupby for multiple columns in pandas for the below shown image?

问题

  1. 这是在pandas中的输入表格:
  2. [![在这里输入图片描述][1]][1]
  3. [1]: https://i.stack.imgur.com/Du7js.png
  4.  
  5. 这是如下所示的输出表格:
  6. [![在这里输入图片描述][2]][2]
  7. [2]: https://i.stack.imgur.com/3GZuK.png
  8.  
  9. dtype: int64
  10.  
  11. 亲爱的朋友们,
  12.  
  13. 我是pandas的新手,如何使用pandas获得第二张图片中显示的结果。
  14. 我使用以下方法得到如下输出:
  15.  
  16. "df.groupby(['Months', 'Status']).size()"
  17.  
  18. Months Status
  19.  
  20. Apr-20 IW 2
  21. OW 1
  22.  
  23. Jun-20 IW 4
  24. OW 4
  25.  
  26. May-20 IW 3
  27. OW 2
  28.  
  29. dtype: int64
  30.  
  31. 但如何将这个输出转换为第二张图片所示的样式呢?
  32. 如果有人能帮助我,将会更有帮助。提前感谢。
英文:

<pre><br> This is input table in pandas:
[![enter image description here][1]][1]
[1]: https://i.stack.imgur.com/Du7js.png

<br>

this is an output table as shown below:
[![enter image description here][2]][2]
[2]: https://i.stack.imgur.com/3GZuK.png

dtype: int64
<br>
<br>
Dear Friends,
<br>
I am new to pandas, how to get the result is shown in the second image using pandas.
I am getting output as shown below using this approach <br>

"df.groupby(['Months', 'Status']).size()"

<br>
Months Status
<br>
Apr-20 IW 2
<br>
OW 1
<br>
Jun-20 IW 4
<br>
OW 4
<br>
May-20 IW 3
<br>
OW 2
<br>

dtype: int64
<br>
But how to convert this output as shown in the second image?
It will be more helpful if someone is able to help me. Thanks in advance.
<br>

答案1

得分: 2

使用crosstab函数,带有margins=True参数,然后如果需要,移除最后的Total列,通过DataFrame.reindex按照原始列的顺序重新排列列,最后通过DataFrame.reset_index将索引转换为列,并通过DataFrame.rename_axis移除列名:

  1. df = (pd.crosstab(df['Status'], df['Months'], margins_name='Total', margins=True)
  2. .iloc[:, :-1]
  3. .reindex(df['Months'].unique(), axis=1)
  4. .reset_index()
  5. .rename_axis(None, axis=1))
  6. print (df)
  7. Status Apr_20 May_20 Jun_20
  8. 0 IW 4 2 4
  9. 1 OW 1 2 4
  10. 2 Total 5 4 8
英文:

Use crosstab with margins=True parameter, then if necessary remove last Total column, change order of columns by DataFrame.reindex with ordering of original column and last convert index to column by DataFrame.reset_index and remove columns names by DataFrame.rename_axis:

  1. df = (pd.crosstab(df[&#39;Status&#39;], df[&#39;Months&#39;], margins_name=&#39;Total&#39;, margins=True)
  2. .iloc[:, :-1]
  3. .reindex(df[&#39;Months&#39;].unique(), axis=1)
  4. .reset_index()
  5. .rename_axis(None, axis=1))
  6. print (df)
  7. Status Apr_20 May_20 Jun_20
  8. 0 IW 4 2 4
  9. 1 OW 1 2 4
  10. 2 Total 5 4 8

答案2

得分: 1

Unstack,然后转置:

  1. df = df.groupby(['Months', 'Status']).size().unstack().T

获得一个 total 行:

  1. df.sum().rename('Total').to_frame().T.append(df)
英文:

Unstack, and then transpose:

  1. df = df.groupby([&#39;Months&#39;, &#39;Status&#39;]).size().unstack().T

To get a total row:

  1. df.sum().rename(&#39;Total&#39;).to_frame().T.append(df)

huangapple
  • 本文由 发表于 2020年1月3日 18:23:47
  • 转载请务必保留本文链接:https://go.coder-hub.com/59576849.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定