July 11, 2023, 04:26:51

Pandas Multi-Index with multiple conditions

Question

I applied the .loc methodology discussed in <https://stackoverflow.com/questions/53927460/select-rows-in-pandas-multiindex-dataframe> because I was receiving a KeyError: 'class' even though 'class' exists (as what I thought was a column name) in my existing dataframe. I later found out that 'class' is one of two index levels in a MultiIndex ('group' being the secondary level). While .loc allows me to select rows with 'First', 'Second', 'Third', I'm struggling to determine how to apply an additional condition that excludes rows where the second index level ('group') is blank.

Current dataframe looks like this:

class	group	Column1
First	A	123
First
Second	B	123
Third	C	123
Forth	D	123

Current code looks like this:

keep_rows = df.loc[['First','Second','Third']]

My original code looked like this (and was throwing the KeyError because the referenced names are index levels, not column names):

keep_rows = df[(df['class'].isin(['First','Second','Third'])) & (df['group'].isna())]

Desired dataframe:

class	group	Column1
First	A	123
Second	B	123
Third	C	123

Answer 1

Score: 3

You can use get_level_values:

>>> df[(df.index.get_level_values('class').isin(['First','Second','Third']))
       & (df.index.get_level_values('group') != '')]
              Column1
class  group         
First  A        123.0
Second B        123.0
Third  C        123.0

Details:

>>> df.index.get_level_values('class').isin(['First','Second','Third'])
array([ True,  True,  True,  True, False])
>>> df.index.get_level_values('group') != ''
array([ True, False,  True,  True,  True])

If the blank 'group' entries are NaN rather than empty strings, use df.index.get_level_values('group').notna() for the second mask instead.
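For anyone who wants to reproduce this end to end, here is a minimal sketch. The frame construction is an assumption reconstructed from the table in the question (the blank 'group' entry is assumed to be an empty string, and the blank Column1 value is assumed to be NaN), and the names wanted and not_blank are only illustrative. The mask treats both empty strings and NaN as blank, so it should work either way.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the question's frame; the blank 'group'
# entry is assumed to be an empty string and the blank Column1 to be NaN.
df = pd.DataFrame(
    {
        "class": ["First", "First", "Second", "Third", "Forth"],
        "group": ["A", "", "B", "C", "D"],
        "Column1": [123, np.nan, 123, 123, 123],
    }
).set_index(["class", "group"])

wanted = ["First", "Second", "Third"]

# Pull each index level out as a flat Index so it can be tested like a column
classes = df.index.get_level_values("class")
groups = df.index.get_level_values("group")

# Treat both NaN and empty strings in 'group' as blank
not_blank = groups.notna() & (groups != "")

# Combine the two boolean masks and use them to select rows
keep_rows = df[classes.isin(wanted) & not_blank]
print(keep_rows)
```

Building the masks from the index levels avoids having to call reset_index just to filter, and the MultiIndex is preserved in the result.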
