Pandas Multi-Index with multiple conditions
Question
I applied the .loc approach discussed in <https://stackoverflow.com/questions/53927460/select-rows-in-pandas-multiindex-dataframe> because I was receiving a KeyError: 'class', even though 'class' exists as what I thought was a column name in my existing dataframe. I later found out that 'class' is one of two levels in a MultiIndex ('group' being the second level). While .loc lets me select rows with 'First', 'Second', 'Third', I'm struggling to work out how to then apply an additional condition that excludes rows where the second index level ('group') is blank.
Current dataframe looks like this:
class | group | Column1 |
---|---|---|
First | A | 123 |
First | | |
Second | B | 123 |
Third | C | 123 |
Forth | D | 123 |
Current code looks like this:
keep_rows = df.loc[['First','Second','Third']]
My original code looked like this (it threw the KeyError because the referenced names are index levels, not column names):
keep_rows = df[(df['class'].isin(['First','Second','Third'])) & (df['group'].isna())]
Desired dataframe:
class | group | Column1 |
---|---|---|
First | A | 123 |
Second | B | 123 |
Third | C | 123 |
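The KeyError described above can be reproduced with a small sketch. The frame below is a hypothetical reconstruction of the one in the question (the blank 'group' entry is assumed to be an empty string, and the variable names flat and raised are made up for illustration). It shows that 'class' and 'group' live in df.index.names rather than df.columns, and that reset_index() turns them into ordinary columns so that column-style filtering works:

```python
import pandas as pd

# Hypothetical reconstruction of the question's frame (assumption: the blank
# 'group' entry is an empty string, and its Column1 value is missing).
index = pd.MultiIndex.from_tuples(
    [("First", "A"), ("First", ""), ("Second", "B"), ("Third", "C"), ("Forth", "D")],
    names=["class", "group"],
)
df = pd.DataFrame({"Column1": [123, None, 123, 123, 123]}, index=index)

print(df.columns.tolist())   # the only real column
print(list(df.index.names))  # the index level names

# Looking up an index level as if it were a column raises KeyError.
try:
    df["class"]
    raised = False
except KeyError:
    raised = True
print("KeyError raised:", raised)

# reset_index() moves the index levels into regular columns.
flat = df.reset_index()
print(flat.columns.tolist())
```

After reset_index(), expressions such as flat['class'].isin([...]) no longer raise, at the cost of losing the MultiIndex on the result.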
Answer 1
Score: 3
You can use get_level_values:
>>> df[(df.index.get_level_values('class').isin(['First','Second','Third']))
...    & (df.index.get_level_values('group') != '')]
              Column1
class  group
First  A        123.0
Second B        123.0
Third  C        123.0
Details:
>>> df.index.get_level_values('class').isin(['First','Second','Third'])
array([ True,  True,  True,  True, False])
>>> df.index.get_level_values('group') != ''
array([ True, False,  True,  True,  True])
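It is not clear from the question whether the blank 'group' entry is an empty string or NaN, so here is a minimal, self-contained sketch that treats both as blank. The data is a hypothetical reconstruction of the question's frame, and the names classes, groups and is_blank are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the question's frame; the blank 'group'
# entry is assumed to be an empty string here.
index = pd.MultiIndex.from_tuples(
    [("First", "A"), ("First", ""), ("Second", "B"), ("Third", "C"), ("Forth", "D")],
    names=["class", "group"],
)
df = pd.DataFrame({"Column1": [123.0, np.nan, 123.0, 123.0, 123.0]}, index=index)

# Pull each index level out as a flat Index aligned with the rows.
classes = df.index.get_level_values("class")
groups = df.index.get_level_values("group")

# Treat both NaN and the empty string as "blank".
is_blank = groups.isna() | (groups == "")

# Combine the two boolean masks and use them to select rows.
keep_rows = df[classes.isin(["First", "Second", "Third"]) & ~is_blank]
print(keep_rows)
```

Building the masks from the index levels keeps the MultiIndex intact on the result, which a reset_index() round trip would not.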