MultiIndex names disappear when using pd.concat

Question

Consider the following dataframes df1 and df2:

    df1:
    sim_names      Model 1
    signal_names     my_y1     my_y2
    units               °C       kPa
    (Time, s)
    0.0           0.738280  1.478617
    0.1           1.078653  0.486527
    0.2           0.794123  0.604792
    0.3           0.392690  1.072772
    df2:
    Empty DataFrame
    Columns: []
    Index: [0.0, 0.1, 0.2, 0.3]
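
For anyone who wants to reproduce this, the two frames can be rebuilt roughly as follows. This is only a sketch: the post shows printed output rather than construction code, so the use of `pd.MultiIndex.from_tuples` and the plain string index name `"(Time, s)"` are assumptions made for illustration.

```python
import pandas as pd

# Assumed reconstruction: three column levels with the names shown in the post.
columns = pd.MultiIndex.from_tuples(
    [("Model 1", "my_y1", "°C"), ("Model 1", "my_y2", "kPa")],
    names=["sim_names", "signal_names", "units"],
)

# Assumed: the index name is stored as a plain string.
index = pd.Index([0.0, 0.1, 0.2, 0.3], name="(Time, s)")

df1 = pd.DataFrame(
    [
        [0.738280, 1.478617],
        [1.078653, 0.486527],
        [0.794123, 0.604792],
        [0.392690, 1.072772],
    ],
    index=index,
    columns=columns,
)

# A frame with the same index but no columns, like df2 in the post.
df2 = pd.DataFrame(index=index)

print(df1)
print(df2)
```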

As you can see, the columns of df1 have three levels named "sim_names", "signal_names" and "units".

Next, I want to concatenate the two dataframes, and therefore I run the following command:

    df2 = pd.concat(
        [df1, df2],
        axis="columns",
    )

but what I get is the following:

    df2:
                Model 1
                  my_y1     my_y2
                     °C       kPa
    (Time, s)
    0.0        0.738280  1.478617
    0.1        1.078653  0.486527
    0.2        0.794123  0.604792
    0.3        0.392690  1.072772

As you can see, the level names are gone.

What should I do to keep the level names of df1 in the resulting df2?

The desired df2 should look like this:

    df2:
    sim_names      Model 1
    signal_names     my_y1     my_y2
    units               °C       kPa
    (Time, s)
    0.0           0.738280  1.478617
    0.1           1.078653  0.486527
    0.2           0.794123  0.604792
    0.3           0.392690  1.072772

I tried passing names=["sim_names", "signal_names", "units"] as an argument to pd.concat, but I got the same wrong result as above.

Answer 1

Score: 1

I'm not sure, but this seems to be the normal behaviour (see GH13475).

As a workaround, you can restore the names with rename_axis (or by assigning to columns.names):

    out = pd.concat(
        [df1, df2],
        axis="columns",
    ).rename_axis(df1.columns.names, axis=1)  # <- added chain


Output:

    print(out)
    sim_names     Model 1
    signal_names    my_y1  my_y2
    units              °C    kPa
    (Time, s)
    0.00             0.74   1.48
    0.10             1.08   0.49
    0.20             0.79   0.60
    0.30             0.39   1.07
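
To try the workaround end to end, here is a self-contained sketch. The frame construction and the helper name `concat_keep_column_names` are hypothetical and not part of the original answer. Whether the names are actually lost before the `rename_axis` call may depend on the pandas version, so the sketch reapplies them unconditionally.

```python
import pandas as pd

# Assumed reconstruction of the frames from the question.
columns = pd.MultiIndex.from_tuples(
    [("Model 1", "my_y1", "°C"), ("Model 1", "my_y2", "kPa")],
    names=["sim_names", "signal_names", "units"],
)
index = pd.Index([0.0, 0.1, 0.2, 0.3], name="(Time, s)")
df1 = pd.DataFrame(
    [
        [0.738280, 1.478617],
        [1.078653, 0.486527],
        [0.794123, 0.604792],
        [0.392690, 1.072772],
    ],
    index=index,
    columns=columns,
)
df2 = pd.DataFrame(index=index)  # same index, no columns


def concat_keep_column_names(frames, names):
    """Concatenate along the columns, then reapply the column level names.

    Hypothetical helper: it simply wraps the rename_axis chain from the answer.
    """
    combined = pd.concat(frames, axis="columns")
    return combined.rename_axis(list(names), axis="columns")


out = concat_keep_column_names([df1, df2], df1.columns.names)
print(out.columns.names)
print(out)
```

Passing the names explicitly, rather than always reading them from the first frame, keeps the helper usable when the first frame in the list is the empty one.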

huangapple
  • Published on 2023-05-11 03:16:27
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76221895.html