Pandas: Split a specific column values to new columns and find occurrences of a value in all newly created columns
Question
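I have two columns called "family" and "severity". I would like to turn each unique value of the "severity" column into its own column and count how many times each "family" occurs under each of them.

Initial DataFrame (`df`):
```text
family  severity
AA      High
BB      Critical
CC      Medium
DD      Low
AA      Low
CC      High
```
Desired output (`df_output`):
```text
family  Critical  High  Medium  Low  Total
AA      0         1     0       1    2
BB      1         0     0       0    1
CC      0         1     1       0    2
DD      0         0     0       1    1
Total   1         2     1       2    6
```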
# Answer 1
**Score**: 4
Use [`crosstab`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.crosstab.html) with `margins=True`:
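```python
import pandas as pd

# df is the question's DataFrame with "family" and "severity" columns
final = pd.crosstab(
    df['family'], df['severity'],
    margins=True, margins_name='Total'
).rename_axis(None, axis=1)  # drop the "severity" label above the columns
print(final)
```
```text
        Critical  High  Low  Medium  Total
family
AA             0     1    1       0      2
BB             1     0    0       0      1
CC             0     1    0       1      2
DD             0     0    1       0      1
Total          1     2    2       1      6
```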
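From the docs:
> `margins` : bool, default False
> Add row/column margins (subtotals).

> `margins_name` : str, default 'All'
> Name of the row/column that will contain the totals when `margins` is True.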
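Note that `crosstab` sorts the severity columns alphabetically (Critical, High, Low, Medium), while the question's desired output lists them as Critical, High, Medium, Low. The following is a minimal sketch of one way to get that order, not part of the original answer. It rebuilds the sample data from the question, and `severity_order` is a hypothetical name introduced here for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "family": ["AA", "BB", "CC", "DD", "AA", "CC"],
    "severity": ["High", "Critical", "Medium", "Low", "Low", "High"],
})

# Same crosstab call as in the answer above.
final = pd.crosstab(
    df["family"], df["severity"],
    margins=True, margins_name="Total",
).rename_axis(None, axis=1)

# Assumed ordering, taken from the question's desired output.
severity_order = ["Critical", "High", "Medium", "Low", "Total"]

# reindex reorders the columns; fill_value=0 gives a zero column
# for any listed severity level that never occurs in the data.
df_output = final.reindex(columns=severity_order, fill_value=0)
print(df_output)
```

If "family" should be a regular column rather than the index, as in the question's layout, calling `df_output.reset_index()` afterwards should do that.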