Error while creating a pandas dataframe out of numpy ndarrays
Question
I'm trying to concatenate NumPy ndarrays of shape (19200,) into a DataFrame. Each 1D array should become one row of the DataFrame. My code looks like this:
new_array_1 = pd.DataFrame(new_array_1, index=['new_array_1'])
new_array_2 = pd.DataFrame(new_array_2, index=['new_array_2'])
new_array_3 = pd.DataFrame(new_array_3, index=['new_array_3'])
df = pd.concat([df, new_array_1, new_array_2, new_array_3])
But I got the error:
Shape of passed values is (19200, 1), indices imply (1, 1)
I then added square brackets around the arrays, like this:
new_array_1 = pd.DataFrame([new_array_1], index=['new_array_1'])
df = pd.concat([df, new_array_1])
This time I got the error:
Must pass 2-d input. shape=(1, 1, 19200)
How can I solve this problem?
Please note that I cannot add all my arrays at the same time. I update the DataFrame by adding a row whenever new data arrives.
Answer 1
Score: 1
Try creating them as a Series instead of a DataFrame, since each one is a separate 1D array for now:
import numpy as np
import pandas as pd
s1 = pd.Series(np.array([1,2,3,4]), name='Array_1')
s2 = pd.Series(np.array([5,6,7,8]), name='Array_2')
Then you can use pandas to concatenate them into a new DataFrame:
df = pd.concat([s1, s2], axis=1)
# Output
#    Array_1  Array_2
# 0        1        5
# 1        2        6
# 2        3        7
# 3        4        8
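Note that axis=1 places each Series in its own column. If each array should instead be a row, as the question asks, one option is to turn each Series into a one-row frame with to_frame().T and concatenate along the default axis. This is a minimal sketch, not the only way to do it: the small arrays stand in for the (19200,) ones, and as_row is a hypothetical helper name introduced only for illustration.

```python
import numpy as np
import pandas as pd

# Small stand-ins for the (19200,) arrays in the question (assumed data).
new_array_1 = np.array([1, 2, 3, 4])
new_array_2 = np.array([5, 6, 7, 8])

def as_row(arr, label):
    """Hypothetical helper: turn a 1D array into a one-row DataFrame indexed by `label`."""
    # The Series name becomes the column label; transposing makes it the row label.
    return pd.Series(arr, name=label).to_frame().T

# Start the frame from the first array.
df = as_row(new_array_1, 'new_array_1')

# Later, when another array arrives, append it as a new row.
df = pd.concat([df, as_row(new_array_2, 'new_array_2')])

print(df.shape)
```

Each call to pd.concat copies the existing frame, so for a large number of incoming arrays it may be cheaper to collect them first and build the frame less often.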
If all of your arrays are the same length, you don't need to use concat at all:
array_1 = np.array([1,2,3,4])
array_2 = np.array([5,6,7,8])
df = pd.DataFrame({'Array_1': array_1, 'Array_2': array_2})
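The dict-based constructor above also yields one column per array. For the row-per-array layout with data arriving over time, a possible variant is to accumulate the arrays in a plain dict and build the frame with pd.DataFrame.from_dict(..., orient='index') whenever it is needed. The sketch below assumes all arrays have the same length; the rows dict and the sample values are illustrative, not from the question.

```python
import numpy as np
import pandas as pd

# Accumulator: row label -> 1D array, filled in as data arrives (illustrative name).
rows = {}

rows['new_array_1'] = np.array([1, 2, 3, 4])
# ... some time later, another array arrives ...
rows['new_array_2'] = np.array([5, 6, 7, 8])

# orient='index' treats each dict key as a row label instead of a column label.
df = pd.DataFrame.from_dict(rows, orient='index')

print(df.shape)
```

Appending to a dict is cheap, so this avoids rebuilding the DataFrame on every new array; the frame is only constructed when it is actually needed.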