Add multiple columns to MultiIndex dataframe from multiple scalar values
Question
Given the following DataFrame df with MultiIndex columns:
| foo | |
| one | two |
| ------- | ------ |
| "12345" | "1235" |
| "12345" | "1345" |
I would like to append more columns, each filled entirely with a single value, where that value differs from column to column. The values are stored in a pandas Series se with a MultiIndex, as follows:
| bar | 0   | 2   |
|     | 1   | 3   |
| ... | ... | ... |
|     | 99  | 7   |
The result would look like this:
| foo | | bar | | ... | |
| one | two | 0 | 1 | ... | 99 |
| -------- | -------- | --- | --- | ... | --- |
| "12345" | "1235" | 2 | 3 | ... | 7 |
| "12345" | "1345" | 2 | 3 | ... | 7 |
I have found this very ugly solution to my problem...
    for i in range(len(se)):
        df["bar", i] = se[i]
... that also gives me a warning:
PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
I have been trying to find a solution to this for a while now. Thanks in advance for any useful answers!
Answer 1
Score: 0
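IIUC, you can turn se into a one-row frame, repeat that row once per row of df, and join the result (this assumes df has the default 0..n-1 index):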
tmp = se.to_frame().T
out = df.join(tmp.loc[tmp.index.repeat(len(df))].reset_index(drop=True))
Output:
print(out)
     foo        bar
     one   two    0  1 99
0  12345  1235    2  3  7
1  12345  1345    2  3  7
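The warning quoted in the question itself points to pd.concat(axis=1), so an alternative along those lines may also be worth trying. The following is only a sketch under stated assumptions: the sample df and se are reconstructed from the tables in the question (with just three "bar" entries), and new_cols is a name introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Sample inputs mirroring the question; only three "bar" entries are
# included here, and the values are illustrative.
df = pd.DataFrame(
    [["12345", "1235"], ["12345", "1345"]],
    columns=pd.MultiIndex.from_tuples([("foo", "one"), ("foo", "two")]),
)
se = pd.Series(
    [2, 3, 7],
    index=pd.MultiIndex.from_tuples([("bar", 0), ("bar", 1), ("bar", 99)]),
)

# Repeat the Series values once per row of df, reusing df's index and
# the Series' MultiIndex as the new column labels.
new_cols = pd.DataFrame(
    np.tile(se.to_numpy(), (len(df), 1)),
    index=df.index,
    columns=se.index,
)

# Attach all new columns in a single operation instead of one by one.
out = pd.concat([df, new_cols], axis=1)
print(out)
```

Because new_cols is built on df.index, this variant should not depend on df having a default integer index, and adding every column in one concat call should avoid the fragmentation that the per-column loop triggers.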