Error while creating a pandas dataframe out of numpy ndarrays

Question

I'm trying to concatenate NumPy ndarrays of shape (19200,) into a data frame. Each 1D array should be one row of the data frame. My code looks like this:

new_array_1 = pd.DataFrame(new_array_1, index=['new_array_1'])
new_array_2 = pd.DataFrame(new_array_2, index=['new_array_2'])
new_array_3 = pd.DataFrame(new_array_3, index=['new_array_3'])

df = pd.concat([df, new_array_1, new_array_2, new_array_3])

This raises the error:

Shape of passed values is (19200, 1), indices imply (1, 1)

I then wrapped each array in square brackets, like this:

new_array_1 = pd.DataFrame([new_array_1], index=['new_array_1'])
df = pd.concat([df, new_array_1])

which raises a different error:

Must pass 2-d input. shape=(1, 1, 19200)

How can I solve this?
Please note that I cannot add all my arrays at the same time; I update the data frame by adding a row whenever new data arrives.

Answer 1

Score: 1

Try creating each array as a Series instead of a DataFrame, since each one is still a separate 1D array at this point:

s1 = pd.Series(np.array([1,2,3,4]), name='Array_1')
s2 = pd.Series(np.array([5,6,7,8]), name='Array_2')

Then you can use pd.concat to combine them into a new DataFrame, with each Series becoming a column:

df = pd.concat([s1, s2], axis=1)
# Output
#    Array_1  Array_2
# 0        1        5
# 1        2        6
# 2        3        7
# 3        4        8
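
The example above places each array in a column, whereas the question asks for one row per array. The first error in the question is consistent with pandas treating a 1D array of length n as n rows and one column, which cannot match an index of length 1. A minimal sketch of a row-wise variant follows, assuming each array is 1D; the helper name `as_row` is hypothetical and the short arrays stand in for the (19200,) arrays:

```python
import numpy as np
import pandas as pd


def as_row(arr, name):
    """Wrap a 1D array as a one-row DataFrame labelled `name` (hypothetical helper)."""
    # reshape(1, -1) turns shape (n,) into (1, n), so an index of length 1 fits
    return pd.DataFrame(np.asarray(arr).reshape(1, -1), index=[name])


# Small stand-ins for the (19200,) arrays in the question
new_array_1 = np.array([1, 2, 3, 4])
new_array_2 = np.array([5, 6, 7, 8])

# Start from the first row, then append further rows as they arrive
df = as_row(new_array_1, "new_array_1")
df = pd.concat([df, as_row(new_array_2, "new_array_2")])
```

Here `pd.concat` uses its default `axis=0`, so each one-row frame is stacked under the previous ones and keeps its own index label.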

If all of your arrays are of the same length, you don't need to use concat at all:

array_1 = np.array([1,2,3,4])
array_2 = np.array([5,6,7,8])

df = pd.DataFrame({'Array_1': array_1, 'Array_2': array_2})
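
Because the question says the arrays arrive one at a time, another option is to collect them first and build the DataFrame only when it is needed, since every `pd.concat` call copies the existing data. This is a sketch under the assumption that all arrays have the same length; the `rows` dict and the `add_row` helper are illustrative names, not part of pandas:

```python
import numpy as np
import pandas as pd

rows = {}  # label -> 1D array, filled in as data arrives (illustrative store)


def add_row(name, arr):
    """Remember one array under its row label (hypothetical helper)."""
    rows[name] = np.asarray(arr)


# Arrays arriving at different times; short stand-ins for the (19200,) data
add_row("new_array_1", np.array([1, 2, 3, 4]))
add_row("new_array_2", np.array([5, 6, 7, 8]))

# Stack the collected arrays into a 2D array: one row per stored array
df = pd.DataFrame(np.vstack(list(rows.values())), index=list(rows.keys()))
```

Rebuilding `df` from `rows` whenever a fresh view is needed keeps the row labels in insertion order, because Python dicts preserve it.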

huangapple
  • Posted on June 5, 2023, 23:24:05
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76407887.html