March 4, 2023, 07:12:35

How to apply a function to the values of one column, then take the output and apply it to multiple other columns in pandas

Question

Alright y'all, maybe my strategy here isn't ideal, but I've got a very awkward dataset to work with and I need help.

I have a pandas dataframe that's structured such that only the first column has values:

df = 
|Ind| Column A | Column B | Column C |
| - | -------- | -------- | -------- |
| 0 | String1  | Null     | Null     |
| 1 | String2  | Null     | Null     |

What I'd like to do is iteratively take the value from Column A and put it through a function whose output is a list. From there I need to fill the remaining columns with the output of the function, such that:

df = 
|Ind| Column A | Column B         | Column C         |
| - | -------- | ---------------- | ---------------- |
| 0 | String1  | func(String1)[0] | func(String1)[1] |
| 1 | String2  | func(String2)[0] | func(String2)[1] |

Thus far I've been trying to do this using anonymous functions, as such:

df.iloc[:, 1:].apply(lambda y: df["Column A"].apply(lambda x: list(map(func, x))))

Which almost does what I want, but does not map the list into the respective columns, and the result is instead:

df = 
|Ind| Column A | Column B      | Column C      |
| - | -------- | ------------- | ------------- |
| 0 | String1  | func(String1) | func(String1) |
| 1 | String2  | func(String2) | func(String2) |

If there's a better approach I'm totally open.

Answer 1

Score: 0

You could use a temporary helper column with the list and then drop it at the end, like this:

df['temporary'] = df['Column A'].map(func)
df['Column B'] = df['temporary'].str[0]
df['Column C'] = df['temporary'].str[1]
df = df.drop('temporary', axis=1)
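If you'd rather skip the helper column, one possible variation is to expand the lists into their own frame and assign both target columns in one step. This is only a sketch: `func` here is a hypothetical stand-in that returns a two-item list, since the real function isn't shown in the question, and it assumes every call returns exactly as many items as there are columns to fill.

```python
import pandas as pd

# Hypothetical stand-in for the asker's func: returns a two-item list.
def func(s):
    return [s.upper(), s[::-1]]

# Frame shaped like the one in the question: only Column A has values.
df = pd.DataFrame({
    "Column A": ["String1", "String2"],
    "Column B": [None, None],
    "Column C": [None, None],
})

# One list per row, expanded so each list item becomes its own column.
expanded = pd.DataFrame(df["Column A"].map(func).tolist(), index=df.index)

# Assign by position; to_numpy() sidesteps any column-label alignment.
df[["Column B", "Column C"]] = expanded.to_numpy()
```

This calls `func` once per row, same as the helper-column version, so the choice between the two is mostly a matter of taste.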

Answer 2

Score: 0

Functional programming is not as fun as they say it is. Here's a procedural version that applies a function to each value in the column and extends the data frame with the results. Note that the function can return a variable number of results.

import pandas as pd
# Make a three-row, one-column data frame
df = pd.DataFrame(["foo", "bar", "fubar"], columns=["Column A"])
# Function applied to each value in Column A (variable-length output is fine)
def fun(s):
    return list(s)
# Make a new data frame by applying the function to each value
df2 = pd.DataFrame(fun(s) for s in df["Column A"])
# Name the new columns (the first column keeps its name)
column_names = {i: f"Column {'ABCDEFGHIJK'[i + 1]}" for i in range(len(df2.columns))}
# Add the new columns under their new names
df = pd.concat([df, df2], axis=1).rename(columns=column_names)
df
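To see what "variable number of results" means in practice, here's a small sketch built on the same sample data. Rows whose list is shorter than the longest one end up with missing values in the trailing columns. Taking the column letters from `string.ascii_uppercase` is my own substitution for the hard-coded `'ABCDEFGHIJK'` string, so treat it as an assumption; it still runs out after 25 new columns.

```python
import string

import pandas as pd

df = pd.DataFrame(["foo", "bar", "fubar"], columns=["Column A"])

# One list per row; shorter lists are padded with missing values.
df2 = pd.DataFrame([list(s) for s in df["Column A"]], index=df.index)

# Name the new columns B, C, D, ... (letter A is skipped because Column A exists).
df2.columns = [f"Column {string.ascii_uppercase[i + 1]}" for i in range(df2.shape[1])]

df = pd.concat([df, df2], axis=1)

# Count the padded cells in each row.
missing_per_row = df.isna().sum(axis=1).tolist()
```

Here "fubar" fills all five new columns, while "foo" and "bar" each leave the last two empty, so check for missing values before relying on the later columns.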
