2023-02-27 01:47:12


Unexpected results using pandas .loc - trying to concatenate 2 columns based on a condition

Question


I have a df where I am trying to merge 2 columns based on a condition.

Create df

df = em_df[['Redcap_Case_num', 'EV_EM', 'COMP_EM', 'EV_RND', 'COMP_EM_RND']].head(3)
df.to_clipboard(excel=False, sep=', ')
# Add EMFREAS - RND columns to df
cols_to_fill = [x for x in ln.columns if x.startswith("EMFREAS")]
for column in cols_to_fill:
    df[column] = ln[column].copy()
df.to_clipboard(excel=False, sep=', ')

Output (I tried to follow the instructions for formatting the table, but they did not work for me, and I'm not sure what I am doing wrong, so this is the best I could manage):

ID     EV_EM  COMP_EM       EV_RND       COMP_EM_RND   EMFREAS1  EMFREAS2
YA007  1      Not Done                   Insufficient
YA006  1
YA005  0      Outside grid  EM Not done

I need to merge the EV_RND and COMP_EM_RND columns to populate all the columns that start with EMFREAS (you are only seeing a subset of the columns).

Here is the code I am trying to use to do this:

# apply ND filter to df and merge to ln df
EV_ND = df["EV_EM"] == 0
EM_ND = df['COMP_EM'] == 'Not Done'
df.loc[EV_ND | EM_ND, cols_to_fill] = df["EV_RND"] + '|' + df["COMP_EM_RND"]

The expected outcome should look like this:

ID     EV_EM  COMP_EM       EV_RND       COMP_EM_RND   EMFREAS1            EMFREAS2
YA007  1      Not Done      EV ND        Insufficient  Insufficient|EV ND  Insufficient|EV ND
YA006  1
YA005  1      Outside grid  EM Not done                EM Not done         EM Not done

Answer 1

Score: 1


If the empty values in your df are actually empty strings, you can create a separator series that equals | where both EV_RND and COMP_EM_RND are non-empty, and an empty string otherwise. Then concatenate EV_RND, the separator series, and COMP_EM_RND:

import numpy as np

# '|' only where both EV_RND and COMP_EM_RND are non-empty, '' otherwise
sep_series = df.apply(lambda x: '|'
                        if (x['EV_RND'] and x['COMP_EM_RND'])
                        else '', axis=1)
# EV_RND + separator + COMP_EM_RND, row by row
fill_series = df['EV_RND'].str.cat(sep_series).str.cat(df['COMP_EM_RND'])
# Fill the empty cells of every EMFREAS* column
for col in df.columns:
    if col.startswith('EMFREAS'):
        df[col] = df[col].replace('', np.nan).fillna(fill_series)

Output:

      ID  EV_EM       COMP_EM       EV_RND   COMP_EM_RND            EMFREAS1            EMFREAS2
0  YA007      1      Not Done        EV ND  Insufficient  EV ND|Insufficient  EV ND|Insufficient
1  YA006      1                                                                                 
2  YA005      0  Outside grid  EM Not done                       EM Not done         EM Not done
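
For anyone who wants to try this without the original em_df and ln frames, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the output above, and it assumes the blanks are empty strings rather than NaN. The names both_present, sep and mask are illustrative only. It builds the separator with np.where instead of a row-wise apply, and it writes only to the rows that match the condition from the question, one column at a time:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the output shown above.
# Assumption: blank cells are empty strings, not NaN.
df = pd.DataFrame({
    "ID": ["YA007", "YA006", "YA005"],
    "EV_EM": [1, 1, 0],
    "COMP_EM": ["Not Done", "", "Outside grid"],
    "EV_RND": ["EV ND", "", "EM Not done"],
    "COMP_EM_RND": ["Insufficient", "", ""],
    "EMFREAS1": ["", "", ""],
    "EMFREAS2": ["", "", ""],
})

cols_to_fill = [c for c in df.columns if c.startswith("EMFREAS")]

# Use "|" only where both reason columns are non-empty.
both_present = df["EV_RND"].ne("") & df["COMP_EM_RND"].ne("")
sep = pd.Series(np.where(both_present, "|", ""), index=df.index)
fill_series = df["EV_RND"] + sep + df["COMP_EM_RND"]

# Rows matching the condition from the question.
mask = df["EV_EM"].eq(0) | df["COMP_EM"].eq("Not Done")

# Assign one column at a time, so each assignment is a plain
# Series-to-Series write aligned on the row index.
for col in cols_to_fill:
    df.loc[mask, col] = fill_series[mask]

print(df)
```

Note that this variant overwrites whatever is already in the EMFREAS columns on the matching rows, whereas the fillna approach above only fills cells that were empty. Which behaviour is right depends on the real data.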
