Python / Pandas: Shift entities of a row to the right (end)
Question
I have the following data frame (the number of "Date" columns can vary):
  Customer  Date1  Date2  Date3  Date4
0        A     10   40.0    NaN   60.0
1        B     20   50.0    NaN    NaN
2        C     30    NaN    NaN    NaN
If a row has "NaN" in the last column (as said, the number of columns can vary), I want to shift the values of that row to the right, to the end of the data frame, so that it then looks like this:
  Customer  Date1  Date2  Date3  Date4
0        A     10   40.0    NaN   60.0
1        B    NaN    NaN     20   50.0
2        C    NaN    NaN    NaN     30
All the values which remain empty can be set to NaN.
How can I do that in Python?
I tried this code, but it didn't work:
import numpy as np
import pandas as pd

data = {
    'Customer': ['A', 'B', 'C'],
    'Date1': [10, 20, 30],
    'Date2': [40, 50, np.nan],
    'Date3': [np.nan, np.nan, np.nan],
    'Date4': [60, np.nan, np.nan]
}
df = pd.DataFrame(data)

for i in range(1, len(df.columns)):
    df.iloc[:, i] = df.iloc[:, i-1].shift(fill_value=np.nan)

print(df)
Answer 1
Score: 1
If you don't have a row with only NaN values, you could use:
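for i in range(len(df)):
    # Shift row i one step to the right until its last value is not NaN
    # (this never terminates for a row that contains only NaN).
    while np.isnan(df.iloc[i, -1]):
        df.iloc[i, 1:] = df.iloc[i, 1:].shift(periods=1, fill_value=np.nan)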
Output:
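For the sample frame from the question, the loop should end up with the layout the asker describes. Below is a self-contained sketch of the same idea that can be run to check this. The function name `shift_rows_right`, the `other_cols` parameter, the up-front cast to float, and the guard for all-NaN rows are my own assumptions and not part of the answer; the sketch also assumes the index is unique.

```python
import numpy as np
import pandas as pd


def shift_rows_right(df, other_cols=("Customer",)):
    """Loop-based sketch: right-align rows whose last column is NaN."""
    other_cols = list(other_cols)
    # Work on a float copy of the date columns so NaN fits in every column.
    dates = df.drop(columns=other_cols).astype(float)
    for idx in dates.index:
        row = dates.loc[idx]
        if row.isna().all():
            continue  # guard (an addition): nothing to shift in an all-NaN row
        while np.isnan(row.iloc[-1]):
            row = row.shift(1)  # move one step right; NaN enters on the left
        dates.loc[idx] = row
    # Re-attach the untouched columns and restore the original column order.
    return df[other_cols].join(dates)[df.columns]


df = pd.DataFrame({
    'Customer': ['A', 'B', 'C'],
    'Date1': [10, 20, 30],
    'Date2': [40, 50, np.nan],
    'Date3': [np.nan, np.nan, np.nan],
    'Date4': [60, np.nan, np.nan],
})
out = shift_rows_right(df)
print(out)
```

Working on a separate float copy avoids writing NaN into the integer `Date1` column in place, which recent pandas versions warn about.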
Answer 2
Score: 0
You can temporarily set the non-target columns as the index (or drop them), push the non-NaN values to the right by sorting, and update only the rows that match a specific mask (here, NaN in the last column):
out = (df
   .set_index('Customer', append=True)
   .pipe(lambda d: d.mask(d.iloc[:, -1].isna(),
                          d.transform(lambda x: sorted(x, key=pd.notnull), axis=1)
                         )
        )
   .reset_index('Customer')
)
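The sorting step may look surprising at first, so here is a minimal sketch of what it does to a single row. `pd.notnull` returns False for NaN and True otherwise; since False sorts before True and Python's `sorted` is stable, the NaNs move to the front while the remaining values keep their original order. The variable names `row` and `pushed` are only illustrative.

```python
import numpy as np
import pandas as pd

# One row of date values with the NaNs at the end.
row = [20.0, 50.0, np.nan, np.nan]

# The sort key is False for NaN and True for real values, so NaNs come first
# and the stable sort keeps 20.0 before 50.0.
pushed = sorted(row, key=pd.notnull)
print(pushed)  # [nan, nan, 20.0, 50.0]
```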
Alternative:
other_cols = ['Customer']
out = df.drop(columns=other_cols)
m = out.iloc[:, -1].isna()
out.loc[m, :] = out.loc[m, :].transform(lambda x: sorted(x, key=pd.notnull), axis=1)
out = df[other_cols].join(out)[df.columns]
NB. There are several ways to shift the non-NaN values; this is one of them. Non-sorting methods are also possible if this step becomes a bottleneck.
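As one possible non-sorting variant, the sketch below right-aligns the values with NumPy boolean masks instead of sorting each row. It is an illustration under stated assumptions rather than the answer's own method: the function name `push_right_numpy` and the `other_cols` parameter are hypothetical, all target columns are assumed to be numeric, and the index is assumed to be unique.

```python
import numpy as np
import pandas as pd


def push_right_numpy(df, other_cols=("Customer",)):
    """Mask-based sketch: right-align rows whose last column is NaN."""
    other_cols = list(other_cols)
    date_cols = [c for c in df.columns if c not in other_cols]
    values = df[date_cols].to_numpy(dtype=float)
    valid = ~np.isnan(values)
    n_cols = values.shape[1]
    counts = valid.sum(axis=1)
    # Row i keeps counts[i] values, so they belong in its last counts[i] slots.
    target = np.arange(n_cols) >= (n_cols - counts)[:, None]
    pushed = np.full_like(values, np.nan)
    # Boolean indexing walks both masks row by row, left to right, so the
    # order of the values within each row is preserved.
    pushed[target] = values[valid]
    # Only rows whose last column is NaN are replaced.
    needs_shift = np.isnan(values[:, -1])
    result = np.where(needs_shift[:, None], pushed, values)
    shifted = pd.DataFrame(result, index=df.index, columns=date_cols)
    return df[other_cols].join(shifted)[df.columns]


df = pd.DataFrame({
    'Customer': ['A', 'B', 'C'],
    'Date1': [10, 20, 30],
    'Date2': [40, 50, np.nan],
    'Date3': [np.nan, np.nan, np.nan],
    'Date4': [60, np.nan, np.nan],
})
out = push_right_numpy(df)
print(out)
```

Because the work is done on whole arrays rather than row by row, this should scale better to large frames, though I have not benchmarked it against the sorting version.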
Output:
  Customer  Date1  Date2  Date3  Date4
0        A   10.0   40.0    NaN   60.0
1        B    NaN    NaN   20.0   50.0
2        C    NaN    NaN    NaN   30.0