Pandas: Split a specific column's values into new columns and find occurrences of a value in all newly created columns

Question

I have two columns called `family` and `severity`. I would like to turn each unique value in the `severity` column into its own column and count how many times each `family` occurs under it.

Initial DataFrame:

```python
df
family  severity
AA      High
BB      Critical
CC      Medium
DD      Low
AA      Low
CC      High
```

Desired output:

```python
df_output
family  Critical  High  Medium  Low  Total
AA             0     1       0    1      2
BB             1     0       0    0      1
CC             0     1       1    0      2
DD             0     0       0    1      1
Total          1     2       1    2      6
```

# Answer 1

**Score**: 4

Use [`crosstab`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.crosstab.html) with `margins=True`:

```python
import pandas as pd

final = pd.crosstab(df['family'], df['severity'], margins=True, margins_name='Total').rename_axis(None, axis=1)
print(final)
```

```
        Critical  High  Low  Medium  Total
family
AA             0     1    1       0      2
BB             1     0    0       0      1
CC             0     1    0       1      2
DD             0     0    1       0      1
Total          1     2    2       1      6
```
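`crosstab` sorts the severity columns alphabetically, so `Low` comes before `Medium` here, while the desired output in the question lists Critical, High, Medium, Low. One possible way to get that order is to reindex the columns afterwards. This is only a sketch: the `DataFrame` construction and the `order` list are my own assumptions, rebuilt from the sample data in the question.

```python
import pandas as pd

# Sample data rebuilt by hand from the question (assumed construction).
df = pd.DataFrame({
    "family": ["AA", "BB", "CC", "DD", "AA", "CC"],
    "severity": ["High", "Critical", "Medium", "Low", "Low", "High"],
})

# Hypothetical column order taken from the question's desired output.
order = ["Critical", "High", "Medium", "Low", "Total"]

final = (
    pd.crosstab(df["family"], df["severity"], margins=True, margins_name="Total")
    .rename_axis(None, axis=1)             # drop the "severity" label on the column axis
    .reindex(columns=order, fill_value=0)  # reorder; a level absent from the data becomes 0
)
print(final)
```

If the severity levels are not known in advance, this approach does not apply as written, since `reindex` silently drops any column missing from `order`.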

From the docs:

> margins : bool, default False
> Add row/column margins (subtotals).

> margins_name : str, default 'All'
> Name of the row/column that will contain the totals when margins is True.
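To make the margins less of a black box: for a plain count table like this one, they are just the row and column sums. The sketch below builds the table without `margins` and adds the totals by hand. It assumes the sample data from the question (the `DataFrame` construction is mine), and the name `counts` is only illustrative.

```python
import pandas as pd

# Sample data rebuilt by hand from the question (assumed construction).
df = pd.DataFrame({
    "family": ["AA", "BB", "CC", "DD", "AA", "CC"],
    "severity": ["High", "Critical", "Medium", "Low", "Low", "High"],
})

# Plain frequency table: one row per family, one column per severity.
counts = pd.crosstab(df["family"], df["severity"])

# Row totals first, then a "Total" row that also sums the new "Total" column.
counts["Total"] = counts.sum(axis=1)
counts.loc["Total"] = counts.sum(axis=0)

# Same table produced by crosstab itself, for comparison.
with_margins = pd.crosstab(
    df["family"], df["severity"], margins=True, margins_name="Total"
)
print((counts.to_numpy() == with_margins.to_numpy()).all())
```

In practice `margins=True` is shorter and less error-prone; the manual version is only meant to show what the option computes.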

huangapple
  • Posted on January 6, 2020 at 23:27:29
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59614725.html