How to apply a function to the values of one column, then take the output and apply it to multiple other columns in pandas
Question
Alright y'all, maybe my strategy here isn't ideal, but I've got a very awkward dataset to work with and I need help.
I have a pandas dataframe that's structured such that only the first column has values:
df =
|Ind| Column A | Column B | Column C |
| - | -------- | -------- | -------- |
| 0 | String1 | Null | Null |
| 1 | String2 | Null | Null |
What I'd like to do is iteratively take the value from Column A and put it through a function whose output is a list. From there I need to fill the remaining columns with the output of the function, such that:
df =
|Ind| Column A | Column B | Column C |
| - | -------- | ---------------- | ---------------- |
| 0 | String1 | func(String1)[0] | func(String1)[1] |
| 1 | String2 | func(String2)[0] | func(String2)[1] |
Thus far I've been trying to do this using anonymous functions, as such:
df.iloc[:,1:].apply(lambda y: df["Column A"].apply(lambda x: list(map(func, x))))
Which almost does what I want, but does not map the list into the respective columns, and the result is instead:
df =
|Ind| Column A | Column B | Column C |
| - | -------- | ------------- | ------------- |
| 0 | String1 | func(String1) | func(String1) |
| 1 | String2 | func(String2) | func(String2) |
If there's a better approach I'm totally open.
Answer 1
Score: 0
You could use a temporary helper column with the list and then drop it at the end, like this:
df['temporary'] = df['Column A'].map(func)
df['Column B'] = df['temporary'].str[0]
df['Column C'] = df['temporary'].str[1]
df = df.drop('temporary', axis=1)
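If you'd rather skip the helper column, here is a minimal sketch that fills both columns in one assignment. The `func` below is only a hypothetical stand-in (the question never shows the real one), and it assumes every call returns exactly two items:

```python
import pandas as pd

# Hypothetical stand-in for the asker's func: returns a two-item list.
def func(s):
    return [s.upper(), s.lower()]

df = pd.DataFrame({
    "Column A": ["String1", "String2"],
    "Column B": [None, None],
    "Column C": [None, None],
})

# One row of outputs per input value, aligned on the original index.
expanded = pd.DataFrame(df["Column A"].map(func).tolist(), index=df.index)

# Fill both target columns in a single step (columns are matched by position).
df[["Column B", "Column C"]] = expanded
print(df)
```

If `func` can return more or fewer than two items, this assignment will not line up with the two target columns, so the helper-column version above is the safer choice in that case.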
Answer 2
Score: 0
Functional programming is not as fun as they say it is. Here's a procedural version that applies a function to each value in the column and extends the data frame with the results. Note that the function can return a variable number of results.
import pandas as pd
# Make a three-row, one-column data frame
df = pd.DataFrame(["foo", "bar", "fubar"], columns=["Column A"])
# Function applied to each value in Column A; returns a list (variable length is fine)
def fun(s):
    return list(s)
# Make a new data frame by applying the function to each value
df2 = pd.DataFrame([fun(s) for s in df["Column A"]])
# Name the new columns B, C, ... (this lookup string covers at most ten new columns)
column_names = {i: f"Column {'ABCDEFGHIJK'[i + 1]}" for i in range(len(df2.columns))}
# Add the new columns under their new names (the first column keeps its name)
df = pd.concat([df, df2], axis=1).rename(columns=column_names)
df
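To see what "variable number of results" means in practice, here is a small sketch using the same sample data. Shorter rows get padded with missing values on the right. The `string.ascii_uppercase` naming is my own assumption, swapped in for the hard-coded letter string so that up to 25 new columns can be named:

```python
import string

import pandas as pd

df = pd.DataFrame(["foo", "bar", "fubar"], columns=["Column A"])

# Rows of different lengths: 3, 3 and 5 items.
df2 = pd.DataFrame([list(s) for s in df["Column A"]])

# The frame is as wide as the longest row; shorter rows are padded.
print(df2.shape)                        # (3, 5)
print(df2.isna().sum(axis=1).tolist())  # [2, 2, 0]

# Assumed naming scheme: letters B..Z for the new columns.
letters = string.ascii_uppercase[1:]
df2.columns = [f"Column {letters[i]}" for i in range(df2.shape[1])]

result = pd.concat([df, df2], axis=1)
print(result)
```

If the padding is a problem downstream, `result.fillna("")` is one way to swap the missing values for empty strings.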