Python / Pandas: Shift entities of a row to the right (end)

Question

I have the following data frame (number of "Date" columns can vary):

  Customer  Date1  Date2  Date3  Date4
0        A     10   40.0    NaN   60.0
1        B     20   50.0    NaN    NaN
2        C     30    NaN    NaN    NaN

If a row has NaN in the last column (again, the number of columns can vary), I want to shift that row's values to the right, toward the end of the data frame, so that the result looks like this:

  Customer  Date1  Date2  Date3  Date4
0        A     10   40.0    NaN   60.0
1        B    NaN    NaN     20   50.0
2        C    NaN    NaN    NaN     30

All the values which remain empty can be set to NaN.

How can I do that in Python?

I tried this code, but it didn't work:

import numpy as np
import pandas as pd

data = {
    'Customer': ['A', 'B', 'C'],
    'Date1': [10, 20, 30],
    'Date2': [40, 50, np.nan],
    'Date3': [np.nan, np.nan, np.nan],
    'Date4': [60, np.nan, np.nan]
}

df = pd.DataFrame(data)


for i in range(1, len(df.columns)):
    df.iloc[:, i] = df.iloc[:, i-1].shift(fill_value=np.nan)

print(df)

Answer 1

Score: 1

If you don't have a row with only NaN values, you could use:

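for i in range(len(df)):
    # Shift row i's Date columns right by one until the last column is no longer NaN.
    # This never terminates for a row that contains only NaN values.
    while np.isnan(df.iloc[i, -1]):
        df.iloc[i, 1:] = df.iloc[i, 1:].shift(periods=1, fill_value=np.nan)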

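The loop above relies on every row having at least one value. As a rough sketch of one way to drop that restriction, the variant below counts the trailing NaNs in each row and shifts once by that amount, skipping rows that are entirely NaN. The helper name `shift_rows_right`, its `id_cols` parameter, and the extra all-NaN row `D` are assumptions made for illustration; they are not part of the original answer.

```python
import numpy as np
import pandas as pd


def shift_rows_right(df, id_cols=("Customer",)):
    """Shift each row right by its number of trailing NaNs (hypothetical helper)."""
    value_cols = [c for c in df.columns if c not in id_cols]
    # Work on a float copy so that NaN can be stored in every value column.
    values = df[value_cols].astype(float)
    n_cols = len(value_cols)
    for idx in values.index:
        row = values.loc[idx]
        if row.isna().all():
            continue  # nothing to move; this is the case the while loop cannot handle
        # Position of the last non-NaN value in this row.
        last_valid = int(np.flatnonzero(row.notna().to_numpy())[-1])
        n_shift = n_cols - 1 - last_valid
        if n_shift:
            values.loc[idx] = row.shift(n_shift)
    out = df.copy()
    out[value_cols] = values
    return out


# Sample data from the question, plus an all-NaN row "D" added for illustration.
df = pd.DataFrame({
    'Customer': ['A', 'B', 'C', 'D'],
    'Date1': [10, 20, 30, np.nan],
    'Date2': [40, 50, np.nan, np.nan],
    'Date3': [np.nan, np.nan, np.nan, np.nan],
    'Date4': [60, np.nan, np.nan, np.nan],
})

out = shift_rows_right(df)
print(out)
```

Like the loop, this keeps NaNs that sit between two values in place relative to their neighbours; only the trailing NaNs are moved to the front of the row.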

Answer 2

Score: 0

You can temporarily set the non-target columns as index (or drop them), then push the non-NaN values to the right by sorting, and update only the rows that match a specific mask (here, NaN in the last column):

out = (df
   .set_index('Customer', append=True)
   .pipe(lambda d: d.mask(d.iloc[:, -1].isna(),
                          d.transform(lambda x : sorted(x, key=pd.notnull), axis=1)
                         )
        )
   .reset_index('Customer')
)

Alternative:

other_cols = ['Customer']
out = df.drop(columns=other_cols)
m = out.iloc[:, -1].isna()
out.loc[m, :] = out.loc[m, :].transform(lambda x : sorted(x, key=pd.notnull), axis=1)
out = df[other_cols].join(out)[df.columns]

NB. There are several ways to shift the non-NaN values; this is one of them. Methods that do not rely on sorting are possible if this becomes a bottleneck.

Output:

  Customer  Date1  Date2  Date3  Date4
0        A   10.0   40.0    NaN   60.0
1        B    NaN    NaN   20.0   50.0
2        C    NaN    NaN    NaN   30.0
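
As a sketch of the non-sorting route mentioned in the NB above, the values can be repacked with NumPy boolean masks: build a target mask that marks the rightmost positions of each row, one per non-NaN value, then copy the values across in row order. This is one possible approach, not code from the original answer, and the names `value_cols`, `justified`, and `result` are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Customer': ['A', 'B', 'C'],
    'Date1': [10, 20, 30],
    'Date2': [40, 50, np.nan],
    'Date3': [np.nan, np.nan, np.nan],
    'Date4': [60, np.nan, np.nan],
})

other_cols = ['Customer']
value_cols = [c for c in df.columns if c not in other_cols]

vals = df[value_cols].to_numpy(dtype=float)
valid = ~np.isnan(vals)
n_rows, n_cols = vals.shape

# Target layout: each row keeps its number of valid values, packed at the right edge.
counts = valid.sum(axis=1)
target = np.arange(n_cols) >= (n_cols - counts)[:, None]

# Boolean indexing reads and writes in row-major order, so the order within
# each row is preserved.
justified = np.full_like(vals, np.nan)
justified[target] = vals[valid]

# Only rows whose last column is NaN are replaced; other rows stay as they are.
m = np.isnan(vals[:, -1])
result = np.where(m[:, None], justified, vals)

out = df[other_cols].join(
    pd.DataFrame(result, columns=value_cols, index=df.index)
)[df.columns]
print(out)
```

As with the sorting version, every non-NaN value in a masked row is pushed to the right, so NaNs that sit between two values are also moved to the front of that row.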

huangapple
  • Posted on July 17, 2023, 19:43:34
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76704112.html