Dataframe transform in Python
Question
I am trying to transform a dataframe in Python. I am happy to use Pandas or NumPy if either will do the job.
The original dataframe looks like this:
A B D E
foo-1 bar-6 C1 11
foo-2 bar-5 C2 12
foo-3 bar-4 C1 13
foo-4 bar-3 C1 14
foo-5 bar-2 C2 15
foo-6 bar-1 C2 16
And I am trying to transform it into this
A B C1 C2
foo-1 bar-6 11 NAN
foo-2 bar-5 NAN 12
foo-3 bar-4 13 NAN
foo-4 bar-3 14 NAN
foo-5 bar-2 NAN 15
foo-6 bar-1 NAN 16
Or into this, after which I will drop columns D and E:
A B D E C1 C2
foo-1 bar-6 C1 11 11 NAN
foo-2 bar-5 C2 12 NAN 12
foo-3 bar-4 C1 13 13 NAN
foo-4 bar-3 C1 14 14 NAN
foo-5 bar-2 C2 15 NAN 15
foo-6 bar-1 C2 16 NAN 16
I have tried this
for row in dataframe.index:
    dataframe[dataframe[D]] = dataframe[E]
but I get the wrong results
Answer 1
Score: 1
Try pivoting the D and E columns of the original dataframe, then joining the result back on the row index:
out = df.join(pd.pivot(df[['D', 'E']], columns='D', values='E'))
print(out)
Prints:
A B D E C1 C2
0 foo-1 bar-6 C1 11 11.0 NaN
1 foo-2 bar-5 C2 12 NaN 12.0
2 foo-3 bar-4 C1 13 13.0 NaN
3 foo-4 bar-3 C1 14 14.0 NaN
4 foo-5 bar-2 C2 15 NaN 15.0
5 foo-6 bar-1 C2 16 NaN 16.0
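To get from this joined frame to the first layout in the question, the helper columns can be dropped afterwards. The following is a minimal sketch, assuming the sample data from the question; the intermediate name `wide` and the optional cast to the nullable `Int64` dtype are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "A": ["foo-1", "foo-2", "foo-3", "foo-4", "foo-5", "foo-6"],
    "B": ["bar-6", "bar-5", "bar-4", "bar-3", "bar-2", "bar-1"],
    "D": ["C1", "C2", "C1", "C1", "C2", "C2"],
    "E": [11, 12, 13, 14, 15, 16],
})

# Pivot only the D/E pair. The original row index is kept,
# so join() can align the new columns with the original rows.
wide = df[["D", "E"]].pivot(columns="D", values="E")

# Join back, then drop the columns that are no longer needed.
out = df.join(wide).drop(columns=["D", "E"])

# Optional: pivoting introduces NaN, which turns the values into floats.
# The nullable Int64 dtype keeps them as integers with missing values.
out = out.astype({"C1": "Int64", "C2": "Int64"})
print(out)
```

Without the final cast the values print as 11.0, 12.0 and so on, because a plain integer column cannot hold NaN.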
Answer 2
Score: 1
This question comes up a lot because it is not obvious what this operation is called. It is usually called pivoting (reshaping from long to wide format), and it is also the inverse of the stack operation. So you can use unstack, as in the answer by @ScottBenson, or the DataFrame.pivot method:
df.pivot(index=['A', 'B'], columns='D', values='E')
Output
D C1 C2
A B
foo-1 bar-6 11.0 NaN
foo-2 bar-5 NaN 12.0
foo-3 bar-4 13.0 NaN
foo-4 bar-3 14.0 NaN
foo-5 bar-2 NaN 15.0
foo-6 bar-1 NaN 16.0
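For comparison, here is a rough sketch of the unstack route mentioned above, again assuming the sample data from the question. It also flattens the result so that A and B become ordinary columns again; the variable names `wide` and `flat` are made up for this example.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "A": ["foo-1", "foo-2", "foo-3", "foo-4", "foo-5", "foo-6"],
    "B": ["bar-6", "bar-5", "bar-4", "bar-3", "bar-2", "bar-1"],
    "D": ["C1", "C2", "C1", "C1", "C2", "C2"],
    "E": [11, 12, 13, 14, 15, 16],
})

# Move A, B and D into the index, then unstack the D level into columns.
wide = df.set_index(["A", "B", "D"])["E"].unstack("D")

# Turn A and B back into regular columns and remove the leftover
# "D" label on the column axis.
flat = wide.reset_index().rename_axis(columns=None)
print(flat)
```

The same `reset_index().rename_axis(columns=None)` step can be applied to the result of `df.pivot(...)` if a flat frame is preferred over one indexed by A and B.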
Answer 3
Score: 0
Here is a fairly simple solution. Let me know if you have any questions (:
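import numpy as np
import pandas as pd

data = {
    'A': ['foo-1', 'foo-2', 'foo-3', 'foo-4', 'foo-5', 'foo-6'],
    'B': ['bar-6', 'bar-5', 'bar-4', 'bar-3', 'bar-2', 'bar-1'],
    'D': ['C1', 'C2', 'C1', 'C1', 'C2', 'C2'],
    'E': [11, 12, 13, 14, 15, 16]
}
df = pd.DataFrame(data)

# Use object dtype so each column can hold both numbers and NaN.
df = df.astype({'D': object, 'E': object})

# For each row, move the value from E into the column named by D.
for i in df.index:
    value = df.at[i, 'E']
    if df.at[i, 'D'] == 'C1':
        df.at[i, 'D'] = value
        df.at[i, 'E'] = np.nan
    else:
        df.at[i, 'D'] = np.nan
        df.at[i, 'E'] = value

# D now holds the C1 values and E the C2 values, so rename accordingly.
df.columns = ['A', 'B', 'C1', 'C2']
print(df)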
  | A | B | C1 | C2 |
---|---|---|---|---|
0 | foo-1 | bar-6 | 11 | nan |
1 | foo-2 | bar-5 | nan | 12 |
2 | foo-3 | bar-4 | 13 | nan |
3 | foo-4 | bar-3 | 14 | nan |
4 | foo-5 | bar-2 | nan | 15 |
5 | foo-6 | bar-1 | nan | 16 |
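The row-by-row logic above can also be written without iterating over rows. The following is only a sketch, assuming the sample data from the question: it loops over the distinct labels in D instead, and uses Series.where to keep E where the label matches and NaN elsewhere. The name `out` is illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "A": ["foo-1", "foo-2", "foo-3", "foo-4", "foo-5", "foo-6"],
    "B": ["bar-6", "bar-5", "bar-4", "bar-3", "bar-2", "bar-1"],
    "D": ["C1", "C2", "C1", "C1", "C2", "C2"],
    "E": [11, 12, 13, 14, 15, 16],
})

# Start from the columns that stay unchanged.
out = df[["A", "B"]].copy()

# One new column per distinct label in D: keep E where the label
# matches, and NaN everywhere else.
for label in df["D"].unique():
    out[label] = df["E"].where(df["D"] == label)

print(out)
```

This version does not hard-code C1 and C2, so it should also cope with additional labels in D, at the cost of one pass over the data per label.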