How can I drop all rows in a group if any column contains only NaN values for that group?
Question
I am trying to drop every row that belongs to a group if, within that group, at least one column contains only NaN values.
For example:
ID | Column A | Column B |
---|---|---|
1 | 2 | NaN |
1 | 3 | NaN |
2 | 2 | 3 |
3 | 3 | NaN |
3 | NaN | 4 |
4 | NaN | NaN |
4 | NaN | 4 |
In this case, I want only the rows for ID 1 and ID 4 to be removed, because Column B contains only NaNs for ID 1 and Column A contains only NaNs for ID 4. Group 2 is fine because it has no NaNs, and group 3 is fine because neither of its columns consists entirely of NaNs.
In other words, my expected output is this:
ID | Column A | Column B |
---|---|---|
2 | 2 | 3 |
3 | 3 | NaN |
3 | NaN | 4 |
How do I achieve this?
I tried following https://stackoverflow.com/questions/38574872/python-pandas-remove-group-based-on-collective-nan-count, but that approach either considers only a single column or aggregates NaNs across all columns.
Answer 1
Score: 2
Filter out groups by condition (keep a group only if none of its columns is entirely NaN):
df.groupby('ID').filter(lambda x: not x.isna().all().any())
   ID Column A Column B
2   2        2        3
3   3        3     None
4   3     None        4
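For anyone who wants to try the filter approach end to end, here is a minimal self-contained sketch. The DataFrame is reconstructed from the question's table, and the names `value_cols` and `has_no_all_nan_column` are made up for illustration. It uses `not` rather than `~`, since `~` applied to a plain Python `bool` yields a nonzero integer rather than a negated boolean.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the table in the question.
df = pd.DataFrame({
    "ID": [1, 1, 2, 3, 3, 4, 4],
    "Column A": [2, 3, 2, 3, np.nan, np.nan, np.nan],
    "Column B": [np.nan, np.nan, 3, np.nan, 4, np.nan, 4],
})

# Hypothetical helper names; only the value columns are inspected.
value_cols = ["Column A", "Column B"]

def has_no_all_nan_column(group: pd.DataFrame) -> bool:
    # True when no value column in this group consists entirely of NaN.
    return not group[value_cols].isna().all().any()

# Keep only the groups for which the predicate returns True.
result = df.groupby("ID").filter(has_no_all_nan_column)
print(result)
```

Selecting `value_cols` explicitly keeps the check independent of whether the grouping column is passed to the function along with the rest of the group.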
Answer 2
Score: 0
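Try this:
# Count the non-NaN values per column in each group, then keep the IDs where every column has at least one
counts = df.groupby('ID').count()
keep_ids = counts[(counts > 0).all(axis=1)].index
df[df['ID'].isin(keep_ids)]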
Answer 3
Score: 0
Code:
df.groupby('ID').filter(lambda x: (x[['Column A', 'Column B']].notna().sum() > 0).all())
Output:
   ID  Column A Column B
2   2       2.0        3
3   3       3.0      NaN
4   3       NaN        4
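As a possible alternative that avoids calling a Python function once per group, the same condition can be sketched with `groupby(...).transform('count')`, which gives each row the number of non-NaN values in its group for every column. The DataFrame is again reconstructed from the question, and the variable names `group_counts` and `mask` are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the table in the question.
df = pd.DataFrame({
    "ID": [1, 1, 2, 3, 3, 4, 4],
    "Column A": [2, 3, 2, 3, np.nan, np.nan, np.nan],
    "Column B": [np.nan, np.nan, 3, np.nan, 4, np.nan, 4],
})

# For each row, the count of non-NaN values in that row's group, per column.
group_counts = df.groupby("ID")[["Column A", "Column B"]].transform("count")

# A row survives only if every column has at least one non-NaN value in its group.
mask = group_counts.gt(0).all(axis=1)
result = df[mask]
print(result)
```

This builds a boolean mask aligned with the original index, so it can also be combined with other row-level conditions before indexing.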