Add multiple columns to MultiIndex dataframe from multiple scalar values

Question

Given the following DataFrame df with MultiIndex columns:

| foo     |        |
| one     | two    |
| ------- | ------ |
| "12345" | "1235" |
| "12345" | "1345" |

I would like to append more columns, each filled with a single constant value, where the value differs from column to column. The values are stored in a pandas Series se with a MultiIndex:

| bar | 0   | 2   |
|     | 1   | 3   |
| ... | ... | ... |
|     | 99  | 7   |

The result would look like this:

| foo      |          | bar |     | ... |     |
| one      | two      | 0   | 1   | ... | 99  |
| -------- | -------- | --- | --- | ... | --- |
| "12345"  | "1235"   | 2   | 3   | ... | 7   |
| "12345"  | "1345"   | 2   | 3   | ... | 7   |

I have found this very ugly solution to my problem...

for i in range(len(se)):
    df["bar", i] = se[i]

... that also gives me a warning:

PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`

I have been trying to find a solution to this for a while now. Thanks in advance for any useful answers!

Answer 1

Score: 0

IIUC, you can use:

tmp = se.to_frame().T

out = df.join(tmp.loc[tmp.index.repeat(len(df))].reset_index(drop=True))

Output:

print(out)

     foo       bar      
     one   two   0  1 99
0  12345  1235   2  3  7
1  12345  1345   2  3  7
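
Note that the join above aligns on the row index, and reset_index(drop=True) gives the repeated frame a 0..n-1 index, so it relies on df also having a default index. As a possible alternative, here is a minimal self-contained sketch that follows the warning's suggestion and uses pd.concat(axis=1). The sample df and se are a hypothetical reconstruction of the question's data (three values instead of one hundred), and the names const and out are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the question's data (3 values, not 100).
df = pd.DataFrame(
    [["12345", "1235"], ["12345", "1345"]],
    columns=pd.MultiIndex.from_tuples([("foo", "one"), ("foo", "two")]),
)
se = pd.Series(
    [2, 3, 7],
    index=pd.MultiIndex.from_tuples([("bar", 0), ("bar", 1), ("bar", 99)]),
)

# Repeat the Series values once per row of df. Reusing df.index means the
# new frame aligns with df even if df does not have a default RangeIndex.
const = pd.DataFrame(
    np.tile(se.to_numpy(), (len(df), 1)),
    index=df.index,
    columns=se.index,
)

# Attach all new columns in a single operation instead of one at a time.
out = pd.concat([df, const], axis=1)
print(out)
```

Because every new column is attached in one concat call, this should avoid the repeated column insertions that trigger the PerformanceWarning in the loop from the question.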

huangapple
  • Published on May 18, 2023, 00:41:35
  • Please retain the link to this article when reposting: https://go.coder-hub.com/76274377.html