pandas - filter rows with same value in many columns
Question
I have a pandas DataFrame with many columns (100+, though the exact number doesn't matter).
Most rows have the same value in every column; some rows have more than one unique value.
For example, in the following table, rows 1 and 2 have the same value in all columns, while row 3 has more than one unique value across its columns.
column 1 | column 2 | column 3 | column 4 | ... | column n |
---|---|---|---|---|---|
A | A | A | A | ... | A |
A | A | A | A | ... | A |
C | A | B | A | ... | A |
I want to keep only the rows that have more than one unique value across their columns, i.e. filter out the rows where every column holds the same value. In the previous example I would keep only row 3.
I know how to filter rows based on values in specific columns using masks, but that doesn't seem to work in this case.
Any ideas?
Answer 1
Score: 2
It looks like you want to filter on nunique (the number of distinct values in each row) with boolean indexing:
out = df[df.nunique(axis=1).ne(1)]
Output:
column 1 column 2 column 3 column 4 column n
2 C A B A A
Intermediate result (the boolean mask):
df.nunique(axis=1).ne(1)
0 False
1 False
2 True
dtype: bool
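
For anyone who wants to run this end to end, here is a minimal, self-contained sketch. The sample DataFrame is an assumption reconstructed from the table in the question (with the "..." columns left out), and the names distinct_per_row, mask and alt are only illustrative. The last line shows one possible alternative that compares every column with the first one; it is not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data mirroring the table in the question
# (the elided "..." columns are omitted).
df = pd.DataFrame(
    [
        ["A", "A", "A", "A", "A"],
        ["A", "A", "A", "A", "A"],
        ["C", "A", "B", "A", "A"],
    ],
    columns=["column 1", "column 2", "column 3", "column 4", "column n"],
)

# Count the distinct values in each row (NaN is ignored by default).
distinct_per_row = df.nunique(axis=1)

# True for rows that hold anything other than exactly one distinct value.
mask = distinct_per_row.ne(1)

# Boolean indexing keeps only the rows where the mask is True.
out = df[mask]
print(out)

# Alternative sketch: a row qualifies if any column differs from the first column.
alt = df[df.ne(df.iloc[:, 0], axis=0).any(axis=1)]
```

Note that the two approaches can disagree when the data contains NaN: nunique skips missing values by default, while a direct comparison treats NaN as different from everything.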