How to count failure occurrences in a column using pandas?

Question

I need to use Python's pandas to tabulate test results stored in a CSV file. Each result is either "passed" or, occasionally, "failed". After importing pandas, my code is:

    import pandas as pd
    df = pd.read_csv('myfile.csv')
    pass_res = df['Status'].value_counts()['passed']
    fail_res = df['Status'].value_counts()['failed']

This code works if there is at least one failure. However, when there are no failures, the last line raises an error (a KeyError, because 'failed' is missing from the index returned by value_counts). How can I check whether there is a failure, so that I only run the last line in that case?

Answer 1

Score: 2

Let's use Series.get to fetch the value if the label is present, and return 0 otherwise:

s = df['Status'].value_counts()

passed = s.get('passed', 0)
failed = s.get('failed', 0)
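
As a quick check, here is a minimal sketch of that approach on a column with no failures. The sample data is made up for illustration and is not from the original question.

```python
import pandas as pd

# Hypothetical sample data: every test passed, so 'failed' never appears.
df = pd.DataFrame({'Status': ['passed'] * 4})

s = df['Status'].value_counts()

# Series.get returns the default instead of raising KeyError for a missing label.
passed = s.get('passed', 0)
failed = s.get('failed', 0)

print(passed, failed)  # prints: 4 0
```

Since value_counts is computed once and reused, this also avoids counting the column twice.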

Answer 2

Score: 2

You can also cast the column to a CategoricalDtype, since value_counts on a categorical column reports every declared category, including those with zero occurrences:

# sample
df = pd.DataFrame({'Status': ['passed']*5 + ['other']*3})

status = pd.CategoricalDtype(['passed', 'failed'], ordered=True)
passed, failed = df['Status'].astype(status).value_counts().sort_index()

Output:

>>> passed
5

>>> failed
0

>>> df['Status'].astype(status).value_counts().sort_index()
Status
passed    5
failed    0
Name: count, dtype: int64

>>> df
   Status
0  passed
1  passed
2  passed
3  passed
4  passed
5   other
6   other
7   other
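
One thing to keep in mind with this approach: values that are not among the declared categories (here 'other') should become NaN after the cast and are then left out of the counts, because value_counts drops NaN by default. The sketch below reuses the sample data above; the names as_cat and n_unknown are introduced only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Status': ['passed'] * 5 + ['other'] * 3})

status = pd.CategoricalDtype(['passed', 'failed'], ordered=True)
as_cat = df['Status'].astype(status)

# 'other' is not a declared category, so those rows turn into NaN.
n_unknown = int(as_cat.isna().sum())

# Declared categories are always reported, even with a count of 0.
counts = as_cat.value_counts().sort_index()
passed, failed = int(counts['passed']), int(counts['failed'])

print(passed, failed, n_unknown)
```

If unexpected status values matter for your report, checking n_unknown is one way to notice them instead of silently ignoring them.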

Answer 3

Score: 1

Similar to @Corralien's approach, but using reindex:

df['Status'].value_counts(sort=False).reindex(['passed', 'failed'], fill_value=0)

Defining the variables in one shot:

passed, failed = (df['Status'].value_counts(sort=False)
                  .reindex(['passed', 'failed'], fill_value=0)
                 )
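
If the same counting is needed in several places, the reindex approach could be wrapped in a small helper. This is only a sketch: the function name count_statuses and its labels parameter are hypothetical, not part of pandas or of the answer above.

```python
import pandas as pd

def count_statuses(series, labels=('passed', 'failed')):
    """Return the count of each label in `series`, using 0 for absent labels."""
    counts = series.value_counts(sort=False).reindex(list(labels), fill_value=0)
    return {label: int(counts[label]) for label in labels}

# Hypothetical sample data for illustration.
df = pd.DataFrame({'Status': ['passed', 'passed', 'failed']})
result = count_statuses(df['Status'])

# A column with no failures still yields a 'failed' entry, set to 0.
no_failures = count_statuses(pd.Series(['passed'] * 2))

print(result)
print(no_failures)
```

Returning a dict keyed by label avoids relying on the order of the values when unpacking.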

huangapple
  • Published on June 6, 2023 at 09:51:02
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76410949.html