Pandas Multi-Index with multiple conditions

Question

I applied the .loc approach discussed in <https://stackoverflow.com/questions/53927460/select-rows-in-pandas-multiindex-dataframe> because I was receiving a KeyError: 'class', even though 'class' exists in my dataframe as what I thought was a column name. I later found out that 'class' is one of two levels in a MultiIndex ('group' being the second level). While .loc lets me select the rows with 'First', 'Second', 'Third', I'm struggling to work out how to apply an additional condition that excludes rows where the second level ('group') is blank.

Current dataframe looks like this:

class   group  Column1
First   A      123
First
Second  B      123
Third   C      123
Forth   D      123

Current code looks like this:

keep_rows = df.loc[['First', 'Second', 'Third']]

My original code looked like this (it threw the KeyError because the referenced names are index levels, not column names):

keep_rows = df[(df['class'].isin(['First', 'Second', 'Third'])) & (df['group'].isna())]

Desired dataframe:

class   group  Column1
First   A      123
Second  B      123
Third   C      123

Answer 1

Score: 3

You can use get_level_values:

>>> # The blank 'group' entry is NaN here (see Details); use != '' instead if blanks are empty strings.
>>> df[(df.index.get_level_values('class').isin(['First', 'Second', 'Third']))
...    & (df.index.get_level_values('group').notna())]

              Column1
class  group         
First  A        123.0
Second B        123.0
Third  C        123.0
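
For a version that can be run end to end, here is a minimal sketch. The data construction is an assumption reconstructed from the question, with the blank 'group' entry taken to be NaN (as the notna() output under Details suggests). The names in_class and has_group are illustrative only.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's data: the blank "group" entry is NaN.
df = pd.DataFrame(
    {
        "class": ["First", "First", "Second", "Third", "Forth"],
        "group": ["A", np.nan, "B", "C", "D"],
        "Column1": [123, np.nan, 123, 123, 123],
    }
).set_index(["class", "group"])

# Build one boolean mask per index level, then combine them element-wise.
in_class = df.index.get_level_values("class").isin(["First", "Second", "Third"])
has_group = df.index.get_level_values("group").notna()

keep_rows = df[in_class & has_group]
print(keep_rows)
```

If the blanks in your data turn out to be empty strings rather than NaN, the second mask would need to compare against '' instead.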

Details:

>>> df.index.get_level_values('class').isin(['First', 'Second', 'Third'])
array([ True,  True,  True,  True, False])

>>> df.index.get_level_values('group').notna()
array([ True, False,  True,  True,  True])
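
An alternative, not part of the answer above, is to move the index levels back into ordinary columns with reset_index(). After that, a column-style filter like the one in the question no longer raises KeyError. This is a sketch under the same assumption about the data (the blank 'group' is NaN); note that it uses notna() rather than the question's isna(), since the goal is to drop the blank rows.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's data: the blank "group" entry is NaN.
df = pd.DataFrame(
    {
        "class": ["First", "First", "Second", "Third", "Forth"],
        "group": ["A", np.nan, "B", "C", "D"],
        "Column1": [123, np.nan, 123, 123, 123],
    }
).set_index(["class", "group"])

# "class" is an index level, not a column, which is why df["class"] raises KeyError.
print("class" in df.columns)

# Turn both index levels into ordinary columns, filter, then restore the index.
flat = df.reset_index()
mask = flat["class"].isin(["First", "Second", "Third"]) & flat["group"].notna()
keep_rows = flat[mask].set_index(["class", "group"])
print(keep_rows)
```

The get_level_values approach avoids the round trip through reset_index() and set_index(), so it is probably the better fit when the MultiIndex should stay in place.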

huangapple
  • Posted on 2023-07-11 04:26:51
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76657118.html