Pandas: Split a specific column's values into new columns and count occurrences of a value in all newly created columns

Question

I have two columns called "family" and "severity". I would like to turn each unique value in the "severity" column into its own column and count how many times each "family" value occurs under it.

Initial DataFrame:

```
df

family severity
AA     High
BB     Critical
CC     Medium
DD     Low
AA     Low
CC     High
```

Desired output:

```
df_output

family Critical High Medium Low Total
AA       0       1     0     1    2
BB       1       0     0     0    1
CC       0       1     1     0    2
DD       0       0     0     1    1
Total    1       2     1     2    6
```


# Answer 1
**Score**: 4

Use [`crosstab`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.crosstab.html) with `margins=True`:

```python
import pandas as pd

# Count each family/severity pair; margins=True appends a "Total" row and column
final = pd.crosstab(df['family'], df['severity'],
                    margins=True, margins_name='Total').rename_axis(None, axis=1)
print(final)
```

Output:

```
        Critical  High  Low  Medium  Total
family                                    
AA             0     1    1       0      2
BB             1     0    0       0      1
CC             0     1    0       1      2
DD             0     0    1       0      1
Total          1     2    2       1      6
```

From the docs:

> margins : bool, default False
> Add row/column margins (subtotals).

> margins_name : str, default 'All'
> Name of the row/column that will contain the totals when margins is True.
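
Note that `crosstab` sorts the severity labels alphabetically, so the columns come out as Critical, High, Low, Medium rather than in the order shown in the question. The following is a minimal, self-contained sketch of one way to get closer to the requested layout. It rebuilds the sample data from the question, and the `severity_order` list is an assumption taken from the question's desired output, not something pandas infers.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "family": ["AA", "BB", "CC", "DD", "AA", "CC"],
    "severity": ["High", "Critical", "Medium", "Low", "Low", "High"],
})

# Count family/severity pairs and append a "Total" row and column.
final = pd.crosstab(
    df["family"], df["severity"], margins=True, margins_name="Total"
).rename_axis(None, axis=1)

# Assumed column order, taken from the question's desired output.
severity_order = ["Critical", "High", "Medium", "Low", "Total"]
final = final.reindex(columns=severity_order)

# Turn the "family" index into an ordinary column, as in df_output.
df_output = final.reset_index()
print(df_output)
```

If a severity level listed in `severity_order` never appears in the data, `reindex` will add it as a column of `NaN`; passing `fill_value=0` to `reindex` is one way to handle that case.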

huangapple
  • Posted on January 6, 2020 at 23:27:29
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59614725.html