Adding multiple rows to newly created columns in a pandas dataframe
Question
I'm using pandas to store the results of a machine learning model, and I have a dataframe that stores the input data. I want to extend that dataframe with the two outputs that the model returns, but I don't know how to do it.
I've tried doing something like this:
import pandas as pd
df = pd.DataFrame({'col1':[1,2,3,4,5], 'col2':[1,2,3,4,5]})
df[['col3', 'col4']] = [[1,2,3,4,5],[1,2,3,4,5]]
But it throws an error:
Exception has occurred: ValueError
Columns must be same length as key
File "D:\InSilicoOP-FUAM\In Silico OP\src\pruebas.py", line 20, in <module>
df[['col3', 'col4']] = [[1,2,3,4,5],[1,2,3,4,5]]
ValueError: Columns must be same length as key
I've also tried
df['col3', 'col4'] = [1,2,3,4,5],[1,2,3,4,5]
and
df['col3', 'col4'] = [[1,2,3,4,5],[1,2,3,4,5]]
and both raise ValueError: Length of values (2) does not match length of index (5)
I know I can assign each column separately, like so:
df['col3'] = [1,2,3,4,5]
But then I'd have to separate the results from the model (which is a big problem on its own...)
Is there a way to assign multiple columns at once?
Answer 1
Score: 0
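Assuming your results are stored in a list of lists, you could convert that to a pandas DataFrame and join it to the original. Note that join returns a new DataFrame rather than modifying df, so assign the result if you want to keep it:
>>> results = [[1,2,3,4,5],[1,2,3,4,5]]
>>> df.join(pd.DataFrame(results, index=["col_3","col_4"]).T)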
   col1  col2  col_3  col_4
0     1     1      1      1
1     2     2      2      2
2     3     3      3      3
3     4     4      4      4
4     5     5      5      5
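One thing worth checking with this approach: join aligns rows on the index, so if df does not use the default 0..n-1 index, the new frame needs the same index or the joined columns come out as NaN. A minimal sketch, assuming the results are already in the same row order as df; the index labels and the values in the second list are made up for illustration:

```python
import pandas as pd

# Hypothetical input frame with a non-default index (labels invented for this example).
df = pd.DataFrame(
    {"col1": [1, 2, 3, 4, 5], "col2": [1, 2, 3, 4, 5]},
    index=["a", "b", "c", "d", "e"],
)
results = [[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]]

# One row per output in `results`; transpose so each output becomes a column.
new_cols = pd.DataFrame(results, index=["col_3", "col_4"]).T
# Reuse df's index so join lines the rows up by position rather than by label.
new_cols.index = df.index

# join returns a new DataFrame; df itself is left unchanged.
joined = df.join(new_cols)
print(joined)
```

Without the `new_cols.index = df.index` line, the labels 0..4 would not match "a".."e" and both new columns would be filled with NaN.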
Answer 2
Score: 0
In your specific example you could use the loc property. Make sure you insert the array in the right shape, though: one inner list per row, with one value per new column:
df.loc[:,["col3", "col4"]] = [[1,1],[2,2],[3,3],[4,4],[5,5]]
Alternatively, you can use numpy's transpose to create the right shape from the array that you had in your example:
import numpy as np
df.loc[:,["col3", "col4"]] = np.transpose([[1,2,3,4,5],[1,2,3,4,5]])
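The shape is what the attempt in the question got wrong: a list of two lists of five values reads as 2 rows by 5 columns, while two new columns over five rows need 5 rows by 2 columns. As a sketch, assuming a reasonably recent pandas version, the transposed array can also be assigned with plain bracket indexing (the values in the second list are changed here only to make the columns distinguishable):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3, 4, 5], "col2": [1, 2, 3, 4, 5]})
results = [[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]]

arr = np.array(results)
print(arr.shape)    # (2, 5): one row per output
print(arr.T.shape)  # (5, 2): one column per output, matching df's five rows

# Assign both new columns at once from the (5, 2) array.
df[["col3", "col4"]] = arr.T
print(df)
```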
Answer 3
Score: 0
You could write the outputs for col3 and col4 to a second dataframe and then join it to the first. I'm not sure whether this still counts as separating your results from the model, though.
df = pd.DataFrame({'col1':[1,2,3,4,5], 'col2':[1,2,3,4,5]})
df2 = pd.DataFrame({'col3':[1,2,3,4,5],'col4':[1,2,3,4,5]})
df3 = df.join(df2)
print(df3)
   col1  col2  col3  col4
0     1     1     1     1
1     2     2     2     2
2     3     3     3     3
3     4     4     4     4
4     5     5     5     5
Answer 4
Score: 0
You can unpack the values from the results list and assign them to the columns col3 and col4:
import pandas as pd
df = pd.DataFrame({'col1':[1,2,3,4,5], 'col2':[1,2,3,4,5]})
results = [[1,2,3,4,5],[1,2,3,4,5]]
df['col3'], df['col4'] = results
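If the number of outputs can change, the same idea can be written without naming each column on the left-hand side. A sketch using DataFrame.assign, where the names list and the second list of values are assumptions for illustration:

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3, 4, 5], "col2": [1, 2, 3, 4, 5]})
results = [[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]]

# Hypothetical column names, one per output list in `results`.
names = ["col3", "col4"]

# Pair each name with its list of values and add them all in one call.
# assign returns a new DataFrame, so rebind df to keep the new columns.
df = df.assign(**dict(zip(names, results)))
print(df)
```

This keeps the model output as a single object and only requires that names and results have the same length.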