May 21, 2023, 18:27:29

How to count the number of consecutive rows where only 2 columns have 0 as a value

Question

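I have a DataFrame that looks like this:

   A  B  C
0  0  4  1
1  0  0  2
2  0  0  1
3  2  0  3
4  1  1  1

I need to count the number of consecutive rows in which both columns A and B are 0.
If the count is less than 10 or greater than 20, I need to delete all of those rows.
In the example above the count is 2, so I expect this output:

   A  B  C
0  0  4  1
3  2  0  3
4  1  1  1

I tried this: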
```python
m1 = (df['A'].eq(0) & df['B'].eq(0))
m2 = df.groupby(m1.ne(m1.shift()).cumsum()).transform('size').le(9)
out = df[~(m1 & m2)]
return out
```

But it does nothing.


# Answer 1
**Score**: 2
Use [boolean indexing](https://pandas.pydata.org/docs/user_guide/indexing.html#boolean-indexing) with [`groupby.transform`](https://pandas.pydata.org/docs/reference/api/pandas.core.groupby.DataFrameGroupBy.transform.html) to set up the threshold condition on consecutive rows:
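```python
# exclusive lower/upper bounds on the run length;
# runs of zeros outside these bounds are dropped
LOW, HIGH = 1, 2
# rows for which both A and B are 0
m = df[['A', 'B']].eq(0).all(axis=1)
# length of the consecutive run each row belongs to
count = m.groupby((m != m.shift()).cumsum()).transform('size')
# keep the rows that are not all-zero, plus the all-zero rows
# whose run length is > LOW and < HIGH
out = df.loc[(~m | count.between(LOW, HIGH, inclusive='neither'))]
```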

Output:

   A  B  C
0  0  4  1
3  2  0  3
4  1  1  1
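
For reference, here is a self-contained sketch of the same approach that can be run end to end on the sample data from the question. The helper name `drop_zero_runs` and its parameters are hypothetical (they do not appear in the answer), and the second DataFrame is made-up test data. The sketch assumes pandas 1.3 or later, where `Series.between` accepts `inclusive='neither'`.

```python
import pandas as pd


def drop_zero_runs(df, cols, low, high):
    """Drop runs of consecutive all-zero rows unless low < run length < high.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # True where every column in `cols` equals 0
    m = df[cols].eq(0).all(axis=1)
    # A new run id starts wherever the mask value changes
    run_id = (m != m.shift()).cumsum()
    # Length of the run that each row belongs to
    count = m.groupby(run_id).transform('size')
    # Keep rows that are not all-zero, plus all-zero rows whose
    # run length lies strictly between `low` and `high`
    keep = ~m | count.between(low, high, inclusive='neither')
    return df.loc[keep]


# Sample data from the question
df = pd.DataFrame({'A': [0, 0, 0, 2, 1],
                   'B': [4, 0, 0, 0, 1],
                   'C': [1, 2, 1, 3, 1]})

# Bounds used in the answer: the run of 2 all-zero rows is dropped,
# leaving the rows with index 0, 3 and 4
out = drop_zero_runs(df, ['A', 'B'], low=1, high=2)
print(out)

# Made-up data with a run of 12 all-zero rows, checked against the
# thresholds stated in the question (delete runs shorter than 10 or
# longer than 20); as exclusive bounds that is low=9, high=21
long_df = pd.DataFrame({'A': [1] + [0] * 12 + [3],
                        'B': [5] + [0] * 12 + [0],
                        'C': range(14)})
kept = drop_zero_runs(long_df, ['A', 'B'], low=9, high=21)
print(len(kept))  # all 14 rows survive, since 9 < 12 < 21
```

Because the bounds are exclusive, the question's rule "delete when the count is less than 10 or greater than 20" corresponds to `low=9, high=21` rather than `10, 20`.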

