When concatenating two DataFrames, an extra row is added

Question

I am trying to concatenate two pandas DataFrames, but it is not working as expected. Here is the code:

train_df = pd.concat([x_train, y_train], axis=1)

print(train_df)

y_train and x_train have the same length and the correct size and row indexes. I just want to concatenate them side by side, like joining two matrices column-wise.

My current output is the following:

       Age  Sex  HighChol   BMI  ...  PhysHlth  DiffWalk  HighBP  Diabetes
0     10.0  1.0       1.0  33.0  ...      30.0       0.0     1.0       NaN
1     10.0  1.0       0.0  21.0  ...      30.0       1.0     1.0       1.0
2      4.0  0.0       0.0  32.0  ...       7.0       0.0     0.0       1.0
3     11.0  1.0       1.0  35.0  ...      10.0       1.0     1.0       0.0
4     10.0  0.0       1.0  27.0  ...       0.0       0.0     1.0       1.0
...    ...  ...       ...   ...  ...       ...       ...     ...       ...
996    3.0  0.0       1.0  33.0  ...       0.0       0.0     0.0       0.0
997    9.0  0.0       1.0  41.0  ...      30.0       1.0     1.0       0.0
998   12.0  0.0       1.0  34.0  ...       0.0       0.0     1.0       1.0
999    6.0  0.0       0.0  31.0  ...       0.0       0.0     0.0       0.0
1000   NaN  NaN       NaN   NaN  ...       NaN       NaN     NaN       1.0
[1001 rows x 15 columns]

For some reason, this adds a row of NaN values.

Edit: apparently y_train is a Series.

Answer 1

Score: 2

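The indexes of x_train and y_train are shifted relative to each other: the x_train index runs from 0 to 999, while the y_train index runs from 1 to 1000.

pd.concat aligns rows on the index. With axis=1 and the default outer join, the result contains the union of both indexes (0 to 1000, hence 1001 rows), and any label missing from one side is filled with NaN. A workaround is: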

train_df = x_train.copy()
train_df['Diabetes'] = y_train.values

# Or
train_df = pd.concat([x_train, y_train.reset_index(drop=True)], axis=1)
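To see the alignment behaviour in isolation, here is a minimal sketch with toy data. The values and the four-row size are made up for illustration; only the index shift mirrors the situation in the question.

```python
import pandas as pd

# Toy stand-ins for x_train / y_train (values are made up for illustration).
x_train = pd.DataFrame({"Age": [10.0, 10.0, 4.0, 11.0], "Sex": [1.0, 1.0, 0.0, 1.0]})
# Same length as x_train, but the index is shifted by one: 1..4 instead of 0..3.
y_train = pd.Series([1.0, 1.0, 0.0, 1.0], index=[1, 2, 3, 4], name="Diabetes")

# Index-aligned concat: the union of both indexes (0..4) gives 5 rows with NaN gaps.
misaligned = pd.concat([x_train, y_train], axis=1)
print(misaligned.shape)  # (5, 3)

# Positional fix: drop y_train's index so both objects are labelled 0..3.
aligned = pd.concat([x_train, y_train.reset_index(drop=True)], axis=1)
print(aligned.shape)  # (4, 3)
```

The first result has one extra row and NaN at both ends, just like the output in the question. The second pairs rows purely by position.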

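But take care: both workarounds pair rows by position and ignore the index, so you should still find out why this shift exists. If the labels really belong to different rows, the workaround would silently attach them to the wrong samples.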
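One way to start investigating is to compare the two indexes directly. The sketch below uses a hypothetical helper, describe_index_mismatch, and toy data. The question does not say what caused the shift, so this only shows how to inspect it, not how it arose.

```python
import pandas as pd


def describe_index_mismatch(left, right):
    """Report labels that appear in only one of the two indexes (hypothetical helper)."""
    return {
        "identical": left.index.equals(right.index),
        "only_in_left": left.index.difference(right.index).tolist(),
        "only_in_right": right.index.difference(left.index).tolist(),
    }


# Toy data: x_train is labelled 0..2, y_train is labelled 1..3.
x_train = pd.DataFrame({"Age": [10.0, 4.0, 11.0]})
y_train = pd.Series([1.0, 0.0, 1.0], index=[1, 2, 3], name="Diabetes")

report = describe_index_mismatch(x_train, y_train)
print(report)
```

Labels listed under only_in_left or only_in_right are the ones that turn into NaN rows after an index-aligned concat.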

Note: y_train is a Series named Diabetes, which is why the last column of train_df is called Diabetes.
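The effect of the Series name can be checked with a short sketch. The data is made up, and "Outcome" is an arbitrary name chosen only for illustration.

```python
import pandas as pd

x_train = pd.DataFrame({"Age": [10.0, 4.0]})
y_train = pd.Series([1.0, 0.0], name="Diabetes")

# The Series name is used as the column label in the concatenated frame.
default_cols = pd.concat([x_train, y_train], axis=1).columns.tolist()
print(default_cols)  # ['Age', 'Diabetes']

# Renaming the Series changes the resulting column label.
renamed_cols = pd.concat([x_train, y_train.rename("Outcome")], axis=1).columns.tolist()
print(renamed_cols)  # ['Age', 'Outcome']
```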

huangapple
  • Posted on March 21, 2023, 01:21:18
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75793406.html