August 9, 2023 03:49:24

Having trouble using loc with multiple indexes in pandas

Question

I appreciate your help. I have these two DataFrames. Notice that df2 has two index levels:

import pandas as pd

data1 = {
    "col 1": ['a', 'b', 'c'],
    "col 2": ['x', 'y', 'z']
}

index1 = ['a', 'b', 'c']
index2 = ['x', 'y', 'z']

data2 = {
    "col 1": [420, 380, 390],
    "col 2": [50, 40, 45]
}

df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2, index=[index1, index2])

df1:

  col 1 col 2
0     a     x
1     b     y
2     c     z

df2:

     col 1  col 2
a x    420     50
b y    380     40
c z    390     45

Now I want to add a third column to df1 that holds the values of df2's first (left-most) index level, without turning any of df2's index levels into columns. From what I've read online, the closest thing I could find was df.index(), with the argument selecting which index level to use, but it's not working for me. Here's my call:

df2['col 3'] = df1.loc[df1['col 1'] == df2.index(0)]

It raises an error that has left me stumped:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-25-2f901269f295> in <module>
----> 1 df2['col 3'] = df1.loc[df1['col 1'] == df2.index(0)]

TypeError: 'MultiIndex' object is not callable

What can I do to fix this? Thanks!

Answer 1

Score: 1

The error message indicates that index is an attribute of a pandas DataFrame, not a method, so you cannot call it with parentheses like a function. Here df2.index is a MultiIndex object, and you reach a specific level through the MultiIndex's own attributes and methods.

In your case, df2.index.levels[0] holds the unique labels of the first level, but it has one entry per distinct label rather than one per row. To get the first-level label for each row of df2, use the get_level_values method: df2.index.get_level_values(0). You can then assign the result to a new column of df1. The assignment is positional, so it relies on df1 and df2 having the same number of rows in the same order.
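To make the difference between the two accessors concrete, here is a minimal sketch. The demo frame is hypothetical (it is not from the question) and repeats a first-level label on purpose so that the two results differ.

```python
import pandas as pd

# Hypothetical frame (not from the question) with a repeated first-level label.
demo = pd.DataFrame(
    {"value": [1, 2, 3]},
    index=[["b", "a", "b"], ["x", "y", "z"]],
)

# levels[0] holds the distinct labels of level 0, one entry per label.
unique_labels = list(demo.index.levels[0])

# get_level_values(0) returns one label per row, in row order.
per_row_labels = list(demo.index.get_level_values(0))

print(unique_labels)   # ['a', 'b']
print(per_row_labels)  # ['b', 'a', 'b']
```

Only the per-row result has the same length as the frame, which is why it is the one that can be assigned as a column.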

Here's an updated version of your code that should do what you're looking for:

import pandas as pd

data1 = {
    "col 1": ['a', 'b', 'c'],
    "col 2": ['x', 'y', 'z']
}

index1 = ['a', 'b', 'c']
index2 = ['x', 'y', 'z']

data2 = {
    "col 1": [420, 380, 390],
    "col 2": [50, 40, 45]
}

df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2, index=[index1, index2])

# One first-level label per row of df2, assigned to df1 by position
df1['col 3'] = df2.index.get_level_values(0)

Output:

  col 1 col 2 col 3
0     a     x     a
1     b     y     b
2     c     z     c

As you can see, a new column named col 3 has been added to df1, with the values from the first level of the multi-index from df2.
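The assignment above pairs rows purely by position. If the rows of df2 might come in a different order, one possible approach is to match on the labels instead. The sketch below reorders df2 on purpose and assumes the first-level labels are unique; the names first_level, mask, matched, lookup and the column "df2 col 1" are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"col 1": ["a", "b", "c"], "col 2": ["x", "y", "z"]})

# Same data as in the question, but with the rows of df2 deliberately reordered.
df2 = pd.DataFrame(
    {"col 1": [390, 420, 380], "col 2": [45, 50, 40]},
    index=[["c", "a", "b"], ["z", "x", "y"]],
)

first_level = df2.index.get_level_values(0)

# Boolean mask for loc: rows of df1 whose "col 1" appears in df2's first level.
mask = df1["col 1"].isin(first_level)
matched = df1.loc[mask]

# Label-based lookup (assumes the first-level labels are unique):
# a Series keyed by first-level label, mapped onto df1's "col 1".
lookup = pd.Series(df2["col 1"].to_numpy(), index=first_level)
df1["df2 col 1"] = df1["col 1"].map(lookup)

print(df1)
```

Because the lookup goes through the labels, each row of df1 receives the value from the df2 row with the matching first-level label, regardless of row order.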
