2023-04-04 17:55:53

How to iterate through each probability column of a dataframe and find the row number where the probability goes below 50% in Python

Question

I have a dataframe that is the output of predict_survival_function() from a Cox proportional hazards model. It gives the probability of survival for each customer in every month: the index (row number) is the month, and the columns are the customers. I would like to get, for each customer, the month number at which the probability goes below 0.50. Below is the screenshot of the table.

The output I am looking for is something like the following.
Suppose that for the first customer, No. 4 (the first column in the dataframe), the probability drops below 0.50 at row number 55. Then the output should be

And similarly for all the other columns in the dataframe.
Any help is appreciated.

Answer 1

Score: 2

Suppose the following dataframe:

>>> df
          4         5         7
0  0.974789  0.976546  0.913151
1  0.918408  0.815823  0.909577
2  0.748928  0.801727  0.856562
3  0.691171  0.791815  0.794988
4  0.442441  0.669530  0.750395  # Customer 4, first value below 0.5 -> row 4
5  0.378585  0.568831  0.561721
6  0.285419  0.287814  0.521966  # Customer 5, first value below 0.5 -> row 6
7  0.240335  0.216207  0.176980  # Customer 7, first value below 0.5 -> row 7
8  0.191656  0.095793  0.118300
9  0.183290  0.087297  0.035063

IIUC, you can use idxmax:

>>> (df <= 0.5).idxmax()
4    4
5    6
7    7
dtype: int64
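
To see why this works: the comparison produces a boolean frame, and idxmax returns the index label of the first maximum in each column, which for booleans is the first True. A minimal sketch on a small made-up frame (the values below are illustrative, not taken from the question):

```python
import pandas as pd

# Illustrative data: rows are months, columns are customer IDs.
df = pd.DataFrame(
    {4: [0.97, 0.69, 0.44, 0.24], 5: [0.98, 0.79, 0.67, 0.22]}
)

below = df <= 0.5             # True where survival is at or below 0.5
first_below = below.idxmax()  # index label of the first True in each column

print(below)
print(first_below)
```

If "below 0.50" is meant strictly, the comparison would be `df < 0.5` instead.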

Full output:

out = ((df <= 0.5).idxmax().rename_axis('Customer Number')
                  .rename('Row Number').reset_index())
print(out)
# Output
   Customer Number  Row Number
0                4           4
1                5           6
2                7           7
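
One caveat worth checking on real data: if a customer's probability never reaches 0.5, the column is all False and idxmax still returns the first index label, which would look like a crossing at row 0. A possible guard, sketched here under the assumption that such customers should show a missing value instead (customer 9 and all values are hypothetical):

```python
import pandas as pd

df = pd.DataFrame(
    {
        4: [0.97, 0.69, 0.44, 0.24],
        5: [0.98, 0.79, 0.67, 0.22],
        9: [0.99, 0.95, 0.90, 0.85],  # hypothetical customer who never drops to 0.5
    }
)

mask = df <= 0.5
# idxmax alone would report row 0 for customer 9, so keep the result
# only for columns that contain at least one True; others become NaN.
first_below = mask.idxmax().where(mask.any())

out = (first_below.rename_axis('Customer Number')
                  .rename('Row Number').reset_index())
print(out)
```

Because of the NaN, the Row Number column becomes float; whether to keep NaN, drop those customers, or fill with a sentinel is a choice that depends on how the result is used.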
