How to count the number of consecutive rows where only 2 columns have 0 as a value
Question
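I have a DataFrame that looks like this:
```
   A  B  C
0  0  4  1
1  0  0  2
2  0  0  1
3  2  0  3
4  1  1  1
```
I need to count the number of consecutive rows where both columns A and B have 0 as a value.
If that count is less than 10 or more than 20, I need to delete all of those rows.
In the example above the count is 2, so I expect this output:
```
   A  B  C
0  0  4  1
3  2  0  3
4  1  1  1
```
I tried this:
```python
m1 = (df['A'].eq(0) & df['B'].eq(0))
m2 = df.groupby(m1.ne(m1.shift()).cumsum()).transform('size').le(9)
out = df[~(m1&m2)]
return out
```
But it does nothing.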
# Answer 1
**Score**: 2
Use [boolean indexing](https://pandas.pydata.org/docs/user_guide/indexing.html#boolean-indexing) with [`groupby.transform`](https://pandas.pydata.org/docs/reference/api/pandas.core.groupby.DataFrameGroupBy.transform.html) to set up the threshold condition on consecutive rows:
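```python
# boundaries below/above which to drop the rows (exclusive)
LOW, HIGH = 1, 2
# rows for which both A and B are 0
m = df[['A', 'B']].eq(0).all(axis=1)
# length of the run of consecutive rows that each row belongs to
count = m.groupby((m != m.shift()).cumsum()).transform('size')
# keep the rows that are not all-zero,
# or whose run of zeros has a length > LOW and < HIGH
out = df.loc[(~m | count.between(LOW, HIGH, inclusive='neither'))]
```
Output:
```
   A  B  C
0  0  4  1
3  2  0  3
4  1  1  1
```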
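For the thresholds stated in the question (delete runs shorter than 10 or longer than 20), the same idea could be wrapped in a small helper. This is only a sketch: the function name `drop_zero_runs` and its parameters are hypothetical, and it assumes that runs of exactly 10 to 20 rows (inclusive) should be kept. The string form of `inclusive` in `Series.between` needs pandas 1.3 or later.

```python
import pandas as pd


def drop_zero_runs(df, cols=("A", "B"), min_len=10, max_len=20):
    """Drop runs of consecutive all-zero rows whose length is outside
    [min_len, max_len]. Hypothetical helper, not part of pandas."""
    # True where every column in `cols` is 0
    is_zero = df[list(cols)].eq(0).all(axis=1)
    # a new run id starts whenever the mask changes value
    run_id = is_zero.ne(is_zero.shift()).cumsum()
    # length of the run each row belongs to
    run_len = is_zero.groupby(run_id).transform("size")
    # keep rows that are not all-zero, plus zero rows in runs of acceptable length
    keep = ~is_zero | run_len.between(min_len, max_len, inclusive="both")
    return df.loc[keep]


# the example frame from the question
df = pd.DataFrame({"A": [0, 0, 0, 2, 1],
                   "B": [4, 0, 0, 0, 1],
                   "C": [1, 2, 1, 3, 1]})
out = drop_zero_runs(df)
print(out)
```

On the example frame the only run of zeros has length 2, so rows 1 and 2 are dropped and rows 0, 3 and 4 remain, matching the output above.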