January 6, 2020 22:10:09

Updating the columns in nested loops in Python

Question

I have two dataframes, A and B, as follows:

A
col1 col2 col3
A     B    V1
A     B    V2
A     C    V1
A     E    V2

B
Col1 Col2 Value1 Value2
A     B   nan     nan
A     D   nan     nan
A     C   nan     nan
A     G   nan     nan
A     E   nan     nan

I want to update the columns Value1 and Value2 in dataframe B based on dataframe A: if a combination of col1 and col2 from A exists in B (as Col1 and Col2), the matching row's Value1 and Value2 should be filled with the corresponding values from col3 in A.

The desired output is:

Col1 Col2 Value1 Value2
A     B   V1      V2
A     D   nan     nan
A     C   V1      nan
A     G   nan     nan
A     E   nan     V2

I tried the following Python code:

def update_b():
    for x in b.index:
        for y in a.index:
            if ((a["col1"][y] == b["col1"][x]) & (a["col2"][y] == b["col2"][x])):
                if (a["col3"][y] == "V1"):
                    b["value1"][x] = "V1"
                else:
                    b["value2"][x] = "V2"

update_b()

but it raises an error:
> ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
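
One way this message can come out of a loop like the one above is a repeated index label: `a["col1"][y]` then returns a Series instead of a single value, and `if` cannot reduce the comparison to one boolean. The sketch below reproduces that situation. The frame and its duplicated index are hypothetical, since the question does not show the real index of `a` or `b`, so treat this as a possible cause rather than a diagnosis.

```python
import pandas as pd

# Hypothetical frame with a repeated index label (an assumption; the
# question does not show the real index of `a`).
a = pd.DataFrame({"col1": ["A", "A"], "col3": ["V1", "V2"]}, index=[0, 0])

picked = a["col1"][0]  # label 0 occurs twice, so this is a Series, not a scalar
is_series = isinstance(picked, pd.Series)

try:
    if picked == "A":  # an element-wise comparison has no single truth value
        pass
    raised = False
except ValueError:
    raised = True

print(is_series, raised)
```

If this is the cause, calling `reset_index(drop=True)` on both frames before looping would give every row a unique label.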

Answer 1

Score: 1

Your dataframes are poorly designed and worth revisiting, but to answer the question as asked, we can leverage a MultiIndex and map:

# map each (col1, col2) pair to its comma-joined col3 values
s = df.groupby(['col1','col2'])['col3'].agg(','.join).to_dict()
# look up every (Col1, Col2) pair of df2; pairs missing from df become NaN
df2['Val'] = df2.set_index(['Col1','Col2']).index.map(s)
# na=False treats those NaN rows as non-matches
df2.loc[df2['Val'].str.contains('v1', case=False, na=False), 'Value1'] = 'V1'
df2.loc[df2['Val'].str.contains('v2', case=False, na=False), 'Value2'] = 'V2'
# drop the helper column
df2.pop('Val')

Result:

print(df2)
   Col1 Col2 Value1 Value2
0    A    B     V1     V2
1    A    D    NaN    NaN
2    A    C     V1    NaN
3    A    G    NaN    NaN
4    A    E    NaN     V2
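
To run the approach above end to end, both frames have to exist first. The following is a minimal sketch that rebuilds them from the question's tables, keeping the answer's names `df` and `df2`. Creating the value columns with `object` dtype is my own assumption, made so that strings can be assigned into them without dtype warnings, and the helper values are kept in a separate Series `val` rather than in a temporary column.

```python
import numpy as np
import pandas as pd

# Frames rebuilt from the question's tables. The names df and df2 follow the
# answer; the object dtype for the value columns is an assumption.
df = pd.DataFrame({
    "col1": ["A", "A", "A", "A"],
    "col2": ["B", "B", "C", "E"],
    "col3": ["V1", "V2", "V1", "V2"],
})
df2 = pd.DataFrame({
    "Col1": ["A"] * 5,
    "Col2": ["B", "D", "C", "G", "E"],
    "Value1": pd.Series([np.nan] * 5, dtype=object),
    "Value2": pd.Series([np.nan] * 5, dtype=object),
})

# Map each (col1, col2) pair to its comma-joined col3 values.
s = df.groupby(["col1", "col2"])["col3"].agg(",".join).to_dict()

# Look up every (Col1, Col2) pair of df2; pairs missing from df become NaN.
val = pd.Series(df2.set_index(["Col1", "Col2"]).index.map(s), index=df2.index)

# na=False treats the NaN rows as non-matches.
df2.loc[val.str.contains("V1", na=False), "Value1"] = "V1"
df2.loc[val.str.contains("V2", na=False), "Value2"] = "V2"

print(df2)
```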

Answer 2

Score: 1

As far as I understand the logic, you can try the function below. Note that the last row differs from your expected output (I couldn't figure out how that output is possible).

import pandas as pd

def return_updated_B(df):
    # capitalize the column names of A so they match B
    m = df.rename(columns=lambda x: x.capitalize())
    # number the values within each (Col1, Col2) group, then spread them into Value1, Value2, ...
    n = (m.set_index(['Col1', 'Col2', m.groupby(['Col1', 'Col2']).cumcount().add(1)])
          .unstack().droplevel(0, axis=1).add_prefix('Value'))
    return n.reindex(pd.MultiIndex.from_arrays((B['Col1'], B['Col2']))).reset_index()

print(return_updated_B(A))
  Col1 Col2 Value1 Value2
0    A    B     V1     V2
1    A    D    NaN    NaN
2    A    C     V1    NaN
3    A    G    NaN    NaN
4    A    E     V2    NaN
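
The last row differs because the function above numbers values by order of appearance within each pair, so the only value for (A, E) lands in Value1. If the intended rule is instead that V1 always goes to Value1 and V2 always goes to Value2, which is my reading of the question's expected output and not something this answer states, a sketch could derive the target column from the value itself. The `target` column and the `wide` and `result` names are hypothetical.

```python
import numpy as np
import pandas as pd

# Frames rebuilt from the question's tables.
A = pd.DataFrame({
    "col1": ["A", "A", "A", "A"],
    "col2": ["B", "B", "C", "E"],
    "col3": ["V1", "V2", "V1", "V2"],
})
B = pd.DataFrame({
    "Col1": ["A"] * 5,
    "Col2": ["B", "D", "C", "G", "E"],
    "Value1": np.nan,
    "Value2": np.nan,
})

# Assumed rule: a value "V<n>" belongs in the column "Value<n>".
wide = (
    A.assign(target="Value" + A["col3"].str[1:])
     .pivot(index=["col1", "col2"], columns="target", values="col3")
     .reset_index()
     .rename(columns={"col1": "Col1", "col2": "Col2"})
)
wide.columns.name = None  # drop the leftover "target" axis label

# A left merge keeps every row of B in its original order.
result = B[["Col1", "Col2"]].merge(wide, on=["Col1", "Col2"], how="left")
print(result)
```

Note that `pivot` raises an error if the same pair carries the same value twice, so this sketch assumes each (col1, col2, col3) combination in A is unique.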

Answer 3

Score: 0

You could try iterating over the rows of the first dataframe and using loc on the second one to assign values.

Something like the following:

for index, row in dfA.iterrows():
    # each comparison needs its own parentheses because & binds tighter than ==
    dfB.loc[(dfB.Col1 == row['col1']) & (dfB.Col2 == row['col2']), "Value1"] = row['col3']

Writing answer from memory so there may be some syntax errors.
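
The loop above writes every match into Value1. A minimal sketch that also picks the target column from the value might look like the following. It assumes the rule V1 goes to Value1 and V2 goes to Value2, taken from the question's expected output, keeps this answer's names `dfA` and `dfB`, and creates the value columns with `object` dtype (my assumption) so that strings can be assigned.

```python
import numpy as np
import pandas as pd

# Frames rebuilt from the question's tables; names follow this answer.
dfA = pd.DataFrame({
    "col1": ["A", "A", "A", "A"],
    "col2": ["B", "B", "C", "E"],
    "col3": ["V1", "V2", "V1", "V2"],
})
dfB = pd.DataFrame({
    "Col1": ["A"] * 5,
    "Col2": ["B", "D", "C", "G", "E"],
    "Value1": pd.Series([np.nan] * 5, dtype=object),
    "Value2": pd.Series([np.nan] * 5, dtype=object),
})

for _, row in dfA.iterrows():
    # Boolean mask of the dfB rows whose key pair matches this row of dfA.
    match = (dfB["Col1"] == row["col1"]) & (dfB["Col2"] == row["col2"])
    # Assumed rule: "V1" fills Value1, anything else fills Value2.
    target = "Value1" if row["col3"] == "V1" else "Value2"
    dfB.loc[match, target] = row["col3"]

print(dfB)
```

Row-by-row iteration is easy to follow but slow on large frames, where a merge-based approach is usually preferable.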
