How to count failure occurrences in a column using pandas?
Question
I need to use Python's pandas to tabulate test results stored in CSV format. Each result is either "passed" or, sometimes, "failed". After importing pandas as pd, my code is:
import pandas as pd
df = pd.read_csv('myfile.csv')
pass_res = df['Status'].value_counts()['passed']
fail_res = df['Status'].value_counts()['failed']
This code works if there is at least one failure. However, when there are no failures, the last line raises an error (a KeyError, because 'failed' is not in the index of the counts). How do I check whether there is a failure before executing the last line?
Answer 1
Score: 2
Let's use Series.get to retrieve the value if it is found, and return 0 otherwise:
s = df['Status'].value_counts()
passed = s.get('passed', 0)
failed = s.get('failed', 0)
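To try this without a CSV file, here is a minimal sketch on in-memory data. The sample DataFrame and the helper name count_status are made up for illustration and are not part of the original question:

```python
import pandas as pd

# Hypothetical sample data that contains no 'failed' rows at all.
df = pd.DataFrame({'Status': ['passed'] * 5})


def count_status(frame, label):
    """Return how often `label` occurs in the 'Status' column, or 0 if absent."""
    counts = frame['Status'].value_counts()
    # Series.get returns the default instead of raising KeyError
    # when the label is missing from the index.
    return int(counts.get(label, 0))


passed = count_status(df, 'passed')
failed = count_status(df, 'failed')
print(passed, failed)
```

Computing value_counts once and calling get on the result, as in the answer above, avoids recounting the column for each label.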
Answer 2
Score: 2
You can also use a CategoricalDtype, since value_counts on a categorical column returns every declared category, including those with a count of 0:
# sample
df = pd.DataFrame({'Status': ['passed']*5 + ['other']*3})
status = pd.CategoricalDtype(['passed', 'failed'], ordered=True)
passed, failed = df['Status'].astype(status).value_counts().sort_index()
Output:
>>> passed
5
>>> failed
0
>>> df['Status'].astype(status).value_counts().sort_index()
Status
passed    5
failed    0
Name: count, dtype: int64
>>> df
   Status
0  passed
1  passed
2  passed
3  passed
4  passed
5   other
6   other
7   other
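One detail worth checking with this approach is what happens to values that are not among the declared categories. The sketch below reuses the sample data from this answer; the names as_cat and n_unrecognised are introduced here only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'Status': ['passed'] * 5 + ['other'] * 3})
status = pd.CategoricalDtype(['passed', 'failed'], ordered=True)

as_cat = df['Status'].astype(status)

# Values outside the declared categories ('other' here) become NaN.
n_unrecognised = int(as_cat.isna().sum())

# value_counts skips NaN by default but still lists every declared category;
# sort_index puts the result in category order: passed, then failed.
counts = as_cat.value_counts().sort_index()
passed, failed = counts.tolist()
print(passed, failed, n_unrecognised)
```

So rows with any other status are silently left out of both counts, which may or may not be what you want for a test report.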
Answer 3
Score: 1
Similar to @Corralien's approach, but with reindex:
df['Status'].value_counts(sort=False).reindex(['passed', 'failed'], fill_value=0)
Defining the variables in one shot:
passed, failed = (df['Status'].value_counts(sort=False)
                  .reindex(['passed', 'failed'], fill_value=0)
                  )
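As a self-contained sketch of the reindex variant, assuming the same kind of sample data as in the previous answer (the DataFrame below is illustrative, not taken from the asker's file):

```python
import pandas as pd

# Illustrative data: five passes, three unrelated statuses, no failures.
df = pd.DataFrame({'Status': ['passed'] * 5 + ['other'] * 3})

counts = (
    df['Status'].value_counts(sort=False)
    # Keep exactly these labels, in this order; missing labels are filled with 0
    # and labels not listed here (such as 'other') are dropped.
    .reindex(['passed', 'failed'], fill_value=0)
)

# Unpacking a Series iterates over its values in index order.
passed, failed = counts
print(passed, failed)
```

Because reindex fixes both the labels and their order, the tuple unpacking stays safe even when one of the statuses never occurs.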