Adding multiple rows to newly created columns in a pandas dataframe




I'm using pandas to store the results of a machine learning model, and I have a dataframe that stores the input data. I want to extend that dataframe with the two outputs that the model returns, but I don't know how to do it.

I've tried doing something like this:

import pandas as pd
df = pd.DataFrame({'col1':[1,2,3,4,5], 'col2':[1,2,3,4,5]})
df[['col3', 'col4']] = [[1,2,3,4,5],[1,2,3,4,5]]

But it throws an error:

Exception has occurred: ValueError
Columns must be same length as key
  File "D:\InSilicoOP-FUAM\In Silico OP\src\pruebas.py", line 20, in <module>
    df[['col3', 'col4']] = [[1,2,3,4,5],[1,2,3,4,5]]
ValueError: Columns must be same length as key

I've also tried

df['col3', 'col4'] = [1,2,3,4,5],[1,2,3,4,5]

and

df['col3', 'col4'] = [[1,2,3,4,5],[1,2,3,4,5]]

and both throw ValueError: Length of values (2) does not match length of index (5)

I know I can assign each column separately, like so:

df['col3'] = [1,2,3,4,5]

But then I'd have to separate the results from the model (which is a big problem on its own...)

Is there a way to assign multiple


Score: 0

Assuming your results are stored in a list of lists, you could convert that to a pandas DataFrame and join it to the original:

results = [[1,2,3,4,5],[1,2,3,4,5]]

df.join(pd.DataFrame(results, index=["col_3","col_4"]).T)

   col1  col2  col_3  col_4
0     1     1      1      1
1     2     2      2      2
2     3     3      3      3
3     4     4      4      4
4     5     5      5      5
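One detail worth spelling out: `DataFrame.join` returns a new DataFrame and leaves `df` unchanged, so the result has to be assigned back if the extra columns should persist. A minimal sketch, assuming the model returns one list per output (the values below are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3, 4, 5], "col2": [1, 2, 3, 4, 5]})

# Hypothetical model output: one inner list per output, five values each.
results = [[10, 20, 30, 40, 50], [6, 7, 8, 9, 10]]

# Each inner list is a row here, so transpose to turn outputs into columns.
outputs = pd.DataFrame(results, index=["col_3", "col_4"]).T

# join does not modify df in place; keep the returned frame.
df = df.join(outputs)
print(df)
```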


Score: 0

In your specific example you could use the loc property. Make sure you insert the array in the right shape though, with one inner list per row:

df.loc[:,["col3", "col4"]] = [[1,1],[2,2],[3,3],[4,4],[5,5]]

Alternatively, you can use numpy's transpose to create the right shape from the array that you had in your example:

import numpy as np
df.loc[:,["col3", "col4"]] = np.transpose([[1,2,3,4,5],[1,2,3,4,5]])
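The shape requirement is also what the original error is about: the value must have one row per DataFrame row and one column per new column name, i.e. shape (5, 2) here, while the list in the question has shape (2, 5). The sketch below checks the shapes and uses the plain `df[[...]] = ...` form from the question with the transposed array; the second output's values are made up so the two columns can be told apart.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3, 4, 5], "col2": [1, 2, 3, 4, 5]})

# One inner list per model output (illustrative values).
results = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]

arr = np.asarray(results)
print(arr.shape)    # 2 outputs x 5 rows: the wrong way round for assignment
print(arr.T.shape)  # 5 rows x 2 columns: matches len(df) and the two names

# With the transposed array, the two-column assignment goes through.
df[["col3", "col4"]] = arr.T
print(df)
```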


Score: 0


You could write the outputs for col3 and col4 to another dataframe and then join the two. I'm not sure if this still counts as separating your results out from the model?

df = pd.DataFrame({'col1':[1,2,3,4,5], 'col2':[1,2,3,4,5]})
df2 = pd.DataFrame({'col3':[1,2,3,4,5],'col4':[1,2,3,4,5]})
df3 = df.join(df2)

   col1  col2  col3  col4
0     1     1     1     1
1     2     2     2     2
2     3     3     3     3
3     4     4     4     4
4     5     5     5     5
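A caveat with this approach: `join` matches rows by index label, not by position. The example works because both frames use the default 0..4 index. If the input frame has a different index (for instance after filtering), the second frame should be built with the same index. A small sketch with a made-up non-default index:

```python
import pandas as pd

# Input frame with a non-default index (illustrative).
df = pd.DataFrame({"col1": [1, 2, 3]}, index=[10, 20, 30])
outputs = [[7, 8, 9], [4, 5, 6]]  # hypothetical model outputs

# Default index 0..2 shares no labels with df, so the join yields NaN.
df2_default = pd.DataFrame({"col3": outputs[0], "col4": outputs[1]})
misaligned = df.join(df2_default)

# Reusing df.index lines the outputs up with the input rows.
df2 = pd.DataFrame({"col3": outputs[0], "col4": outputs[1]}, index=df.index)
aligned = df.join(df2)

print(misaligned)
print(aligned)
```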


Score: 0

You can unpack the values from the results list and assign them to the columns col3 and col4:

import pandas as pd

df = pd.DataFrame({'col1':[1,2,3,4,5], 'col2':[1,2,3,4,5]})

results = [[1,2,3,4,5],[1,2,3,4,5]]

df['col3'], df['col4'] = results
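Tuple unpacking needs exactly one target per output, so the line grows with the number of outputs. A variation that may scale a little better, sketched here with a hypothetical `names` list, pairs the column names with the outputs via `zip` and passes them to `DataFrame.assign`, which returns a new frame:

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3, 4, 5], "col2": [1, 2, 3, 4, 5]})

# Illustrative outputs; one inner list per new column.
results = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
names = ["col3", "col4"]  # hypothetical names, one per output

# zip pairs each name with its list; assign adds them as new columns.
df = df.assign(**dict(zip(names, results)))
print(df)
```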

  • Published on July 12, 2023, 23:01:47
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76671987.html



