Shifting column values in Pandas Dataframe causes missing values

Question

I want to shift the column values one position to the left. I don't need to keep the original values of the column 'average_rating'.

I used the shift command:

data3 = data3.shift(-1, axis=1)

But the output has missing values in two columns: 'num_pages' and 'text_reviews_count'.

Answer 1

Score: 1

This happens because the data types of the source and target columns do not match. Try converting each column to its target data type after shift(), for example with .fillna(0).astype(int).
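To check whether a frame mixes data types in the first place, its dtypes can be inspected before shifting. The sketch below uses a hypothetical toy frame (the values are invented, not the asker's data) and the name `mixed` is just an illustrative helper variable.

```python
import pandas as pd

# Hypothetical toy frame; the values are invented for illustration.
df = pd.DataFrame({
    "average_rating": ["x", "y"],
    "num_pages": [3.57, 3.60],
    "ratings_count": [236, 400],
})

# One dtype per column.
print(df.dtypes)

# More than one distinct dtype means the columns do not share a type.
mixed = len(set(df.dtypes)) > 1
print(mixed)
```

If `mixed` is true, a column-wise shift moves values between columns of different types, which is the situation the answer describes.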

Alternatively, you can convert all the data in the data frame to strings and then perform the shift. Afterwards you will probably want to convert the columns back to their original data types.

from io import StringIO
import pandas as pd

df = df.astype(str)  # convert all data to str so every column has the same dtype
df_shifted = df.shift(-1, axis=1)  # shift the values one column to the left
df_string = df_shifted.to_csv()  # serialize the shifted frame to a CSV string
new_df = pd.read_csv(StringIO(df_string), index_col=0)  # re-read it so pandas infers the dtypes again

Output:

   average_rating        isbn  isbn13 language_code  num_pages  ratings_count  text_reviews_count  extra
0            3.57  0674842111  978067         en-US        236             55                 6.0    NaN
1            3.60  1593600119  978067           eng        400             25                 4.0    NaN
2            3.63  156384155X  978067           eng        342             38                 4.0    NaN
3            3.98  1857237250  978067           eng        383           2197                17.0    NaN
4            0.00  0851742718  978067           eng         49              0                 0.0    NaN
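A variation on the string round trip, sketched under assumptions: cast to object instead of str so every column shares one dtype, shift, then restore the numeric dtypes with pd.to_numeric. The toy frame and its values are hypothetical and are not the asker's data; the emptied last column is dropped here only to keep the example short.

```python
import pandas as pd

# Hypothetical toy data: every value sits one column too far to the right.
df = pd.DataFrame({
    "average_rating": ["x", "y"],
    "num_pages": [3.57, 3.60],
    "ratings_count": [236, 400],
    "extra": [55, 25],
})

# Cast everything to object so all columns share one dtype, then shift left.
shifted = df.astype(object).shift(-1, axis=1)

# "extra" holds only missing values after the shift, so drop it,
# then let pandas infer a numeric dtype for each remaining column.
restored = shifted.drop(columns="extra").apply(pd.to_numeric)

print(restored)
print(restored.dtypes)
```

This avoids serializing to CSV, but pd.to_numeric only suits columns that are meant to be numeric; text columns such as an ISBN with a trailing "X" would need to be left as they are.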

huangapple
  • Posted on January 6, 2020, 20:04:13
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59611789.html