How to remove trailing rows that contain zero from a pandas DataFrame

Question

I have a pandas dataframe with a single column, which ends with some values being zero, like so:

    index value
    0    4.0
    1    34.0
    2    -2.0
    3    15.0
    ...    ...
    96     0.0
    97     45
    98     0.0
    99     0.0
    100    0.0

I would like to strip away the trailing rows that contain the zero value, producing the following dataframe:

    index value
    0    4.0
    1    34.0
    2    -2.0
    3    15.0
    ...    ...
    96     0.0
    97     45

How can I do it by leveraging pandas's functions?

I know that I can repeatedly check the last value of the dataframe and remove it if it's zero, but I'd rather use pandas's built-in functions because that would be much faster.

    while df.iloc[-1, 0] == 0:
        df.drop(df.tail(1).index, inplace=True)

EDIT: to be clear, the dataframe may or may not contain other zeros. However, I only want to strip trailing zeros, while the other zeros should stay untouched. I have edited the example accordingly.

Answer 1

Score: 2

Assuming that the zero values are all stacked at the end of the DataFrame:

    # find the position of the last non-zero value
    last_nonzero_index = df['value'].to_numpy().nonzero()[0][-1]

    # create a new DataFrame with every row up to and including the last non-zero value
    new_df = df.iloc[:last_nonzero_index + 1]

Otherwise, if they are scattered throughout the DataFrame and you want to drop all of them, not only the trailing ones:

    # find the positions of the non-zero values
    nonzero_index = df['value'].to_numpy().nonzero()[0]

    # create a new DataFrame with only the non-zero rows
    new_df = df.iloc[nonzero_index]
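
The first snippet indexes the last element of the array returned by nonzero(), so it raises an IndexError when the column holds only zeros. Below is a minimal sketch of a guarded version. The helper name strip_trailing_zeros and the sample data are assumptions made for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd


def strip_trailing_zeros(df: pd.DataFrame, column: str = "value") -> pd.DataFrame:
    """Return df without the trailing rows whose `column` is zero (hypothetical helper)."""
    # Positions (not labels) of the non-zero entries
    nonzero_positions = np.flatnonzero(df[column].to_numpy())
    if nonzero_positions.size == 0:
        # Every value is zero, so nothing is left after stripping
        return df.iloc[:0]
    # Keep everything up to and including the last non-zero position
    return df.iloc[: nonzero_positions[-1] + 1]


# Made-up sample data with one interior zero and three trailing zeros
sample = pd.DataFrame({"value": [4.0, 34.0, 0.0, 45.0, 0.0, 0.0, 0.0]})
result = strip_trailing_zeros(sample)
print(result)
```

Because the slice is positional (iloc), this sketch does not depend on the index labels being consecutive integers, and interior zeros are left untouched.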

Answer 2

Score: 2

Use boolean indexing with a reversed cummax:

    out = df[df.loc[::-1, 'value'].ne(0).cummax()]

Output:

           value
    index
    0        4.0
    1       34.0
    2       -2.0
    3       15.0
    97      45.0

Intermediate:

           value   mask
    index
    0        4.0   True
    1       34.0   True
    2       -2.0   True
    3       15.0   True
    97      45.0   True
    98       0.0  False
    99       0.0  False
    100      0.0  False

Alternatively, if you are sure that there is at least one non-zero value:

    out = df.loc[:df.loc[::-1, 'value'].ne(0).idxmax()]
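
To make the reversed cummax easier to follow, here is a small worked sketch on made-up data that includes an interior zero. The sample values and variable names are assumptions for illustration. The mask is flipped back to the original row order before indexing, so the sketch does not depend on pandas aligning a reversed mask.

```python
import pandas as pd

# Made-up sample: an interior zero at label 2 and trailing zeros at labels 4-6
df = pd.DataFrame({"value": [4.0, 34.0, 0.0, 45.0, 0.0, 0.0, 0.0]})

# True where the value is non-zero, listed from the last row upwards
reversed_nonzero = df.loc[::-1, "value"].ne(0)

# cummax becomes True at the first non-zero value seen from the bottom and
# stays True afterwards; [::-1] restores the original row order
mask = reversed_nonzero.cummax()[::-1]
out = df[mask]

# Label-based variant: idxmax returns the label of the first True in the
# reversed series, i.e. the last non-zero row (needs at least one non-zero value)
alt = df.loc[: reversed_nonzero.idxmax()]

print(out)
```

Both variants keep the interior zero at label 2 and drop only the zeros after the last non-zero value.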

Answer 3

Score: 1

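You can do it with broadcasting. Note that this drops every row in which all values are zero, including rows that are not trailing:

    df = df[(df != 0.0).any(axis=1)]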
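
The row-wise test above can be combined with a reversed cummax so that only trailing all-zero rows are removed, which also covers DataFrames with more than one column. The following is a sketch under that assumption. The column names a and b and the sample values are made up for illustration.

```python
import pandas as pd

# Made-up two-column sample: label 1 is an interior all-zero row,
# labels 3 and 4 are trailing all-zero rows
df = pd.DataFrame({"a": [1.0, 0.0, 0.0, 0.0, 0.0],
                   "b": [2.0, 0.0, 5.0, 0.0, 0.0]})

# True for rows that hold at least one non-zero value
row_has_nonzero = (df != 0).any(axis=1)

# Plain boolean indexing drops the interior all-zero row as well
all_zero_rows_dropped = df[row_has_nonzero]

# Reversed cummax keeps everything up to the last row with a non-zero value
keep = row_has_nonzero[::-1].cummax()[::-1]
trailing_only = df[keep]

print(all_zero_rows_dropped.index.tolist())
print(trailing_only.index.tolist())
```

The first result loses the interior row at label 1, while the second keeps it and removes only the rows after label 2.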

Answer 4

Score: 1

You can compare the value column with 0 and take a reverse cumsum of the boolean result. The trailing zeros remain 0 after the cumsum, so they can be filtered out.

    out = df[df.loc[::-1, 'value'].ne(0).cumsum()[::-1].ne(0)]
    print(out)

        value
    0     4.0
    1    34.0
    2    -2.0
    3    15.0
    4     0.0
    97   45.0
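
As a quick sanity check, the reverse cumsum, the reverse cummax, and the row-by-row loop from the question should all select the same rows. The sketch below compares them on randomly generated data. The seed, sizes, and variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Made-up data: random small integers (so interior zeros occur) plus five trailing zeros
rng = np.random.default_rng(0)
values = np.concatenate([rng.integers(0, 3, size=20), np.zeros(5, dtype=int)])
values[0] = 1  # guarantee one non-zero value so the loop below never empties the frame
df = pd.DataFrame({"value": values.astype(float)})

# Reverse cumulative sum: stays 0 until the first non-zero value seen from the bottom
by_cumsum = df[df.loc[::-1, "value"].ne(0).cumsum()[::-1].ne(0)]

# Reverse cumulative max, flipped back to the original row order
by_cummax = df[df.loc[::-1, "value"].ne(0).cummax()[::-1]]

# Reference result from the row-by-row loop in the question
by_loop = df.copy()
while by_loop.iloc[-1, 0] == 0:
    by_loop.drop(by_loop.tail(1).index, inplace=True)

same_rows = (
    by_cumsum.index.tolist() == by_cummax.index.tolist() == by_loop.index.tolist()
)
print(same_rows)
```

The comparison uses index labels only, which is enough here because all three results are row subsets of the same DataFrame.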

huangapple
  • Posted on May 24, 2023, 18:28:44
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76322534.html