pandas - filter rows with same value in many columns

Question

I have a pandas DataFrame with many columns (100+, though the exact number doesn't matter).

Most rows have the same value in every column, but some rows have more than one unique value.
For example, in the following table, rows 1 and 2 have the same value in all columns, while row 3 has more than one unique value across its columns.

column 1  column 2  column 3  column 4  ...  column n
A         A         A         A         ...  A
A         A         A         A         ...  A
C         A         B         A         ...  A

I want to filter out the rows that have only one unique value across all of their columns. In the previous example I would keep only row 3.
I know how to filter rows based on values in specific columns using masks, but that doesn't seem to work in this case.
Any ideas?

Answer 1

Score: 2

It looks like you want to filter on the per-row nunique using boolean indexing:

out = df[df.nunique(axis=1).ne(1)]

Output:

  column 1 column 2 column 3 column 4 column n
2        C        A        B        A        A
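
For a reproducible version, here is a minimal sketch that rebuilds the three-row example from the question and applies the same filter. The five column names and the sample values are assumptions taken from the table shown in the question, not from the asker's real data.

```python
import pandas as pd

# Sample data mirroring the question's table; column names are assumed.
df = pd.DataFrame(
    {
        "column 1": ["A", "A", "C"],
        "column 2": ["A", "A", "A"],
        "column 3": ["A", "A", "B"],
        "column 4": ["A", "A", "A"],
        "column n": ["A", "A", "A"],
    }
)

# Count distinct values per row, then flag rows where that count is not 1.
mask = df.nunique(axis=1).ne(1)

# Boolean indexing keeps only the flagged rows.
out = df[mask]
print(out)
```

Note that the index label of the kept row is 2 because pandas numbers rows from 0, so it corresponds to "row 3" in the question.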

Intermediate:

df.nunique(axis=1).ne(1)

0    False
1    False
2     True
dtype: bool
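
As an alternative to the nunique approach above (not part of the original answer), one could compare every column against the first column and keep rows where any cell differs. The sketch below assumes the same sample layout as the question; the names differs, mask_alt and out_alt are hypothetical. One behavioral difference to keep in mind: nunique ignores NaN by default, whereas ne treats NaN as unequal to everything, so the two approaches can disagree on rows containing missing values.

```python
import pandas as pd

# Assumed sample data based on the question's table.
df = pd.DataFrame(
    [["A", "A", "A", "A"], ["A", "A", "A", "A"], ["C", "A", "B", "A"]],
    columns=["column 1", "column 2", "column 3", "column 4"],
)

# True where a cell differs from the first cell of its own row.
# axis=0 aligns the first-column Series with the row index.
differs = df.ne(df.iloc[:, 0], axis=0)

# Keep a row if at least one of its cells differs from the first one.
mask_alt = differs.any(axis=1)
out_alt = df[mask_alt]
print(out_alt)
```

On wide frames this element-wise comparison may be faster than a row-wise nunique, but that is worth measuring on your own data before relying on it.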

huangapple
  • Posted on March 15, 2023, 20:44:56
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75744892.html