如何检查行趋势并将失败案例的差异和差异百分比分别添加到单独的列中

huangapple go评论53阅读模式
英文:

How to check the row trend and add the difference and difference percentage in separate columns for the failing cases

问题

以下是翻译好的代码部分:

import pandas as pd
import numpy as np

d = {'Cell': ['A', 'B', 'C', 'D', 'E'], 'D1': [5, 2, 2, 6, 6], 'D2': [np.nan, 5, 6, np.nan, 3], 'D3': [7, np.nan, 5, 5, np.nan], 'D6': [17, 3, np.nan, np.nan, 2]}
df = pd.DataFrame(d)

希望对你有所帮助。如需进一步的翻译或解释,请随时提出。

英文:

An extension to the problem statement
https://stackoverflow.com/questions/75412399/how-to-check-each-row-trend-with-some-tolerance-by-ignoring-the-np-nan-values-in

import pandas as pd
import numpy as np
    d = {'Cell':['A','B','C','D','E'],'D1':[5, 2, 2, 6,6], 'D2':[np.nan, 5, 6, np.nan,3], 'D3':[7,np.nan, 5, 5,np.nan], 'D6':[17, 3, np.nan,np.nan,2]}
    df = pd.DataFrame(d)


Cell  D1   D2   D3    D6
0    A   5  NaN  7.0  17.0
1    B   2  5.0  NaN   3.0
2    C   2  6.0  5.0   NaN
3    D   6  NaN  5.0   NaN
4    E   6  3.0  NaN   2.0

i want output like this with additional columns diff and diff% along with is_increasing and failing columns

  Cell  D1   D2   D3    D6  is_increasing?   failing  diff          diff%
0    A   5  NaN  7.0  17.0            True       NaN  NaN           NaN
1    B   2  5.0  NaN   3.0            False      [D6]  [-2]         [40%]
2    C   2  6.0  5.0   NaN           False      [D3]  [-1]        [16.6%]
3    D   6  NaN  5.0   NaN           False      [D3]  [-1]        [16.6%]
4    E   6  3.0  NaN   2.0           False  [D2, D6]  [-3,-1]   [50%,33%]

Explanation of the columns:

is_increasing --> whether the values are strictly increasing or not
failing --> columns whether strictly increasing is not followed when compared with previous value
diff --> difference of the values where there is failing cases
diff% --> difference in terms of percentages for the failing cases

between (6,5) numbers in the columns

diff column --> 5-6=-1
diff%--> 1-(5/6)=16.6%

Please let me the solution to this problem, i tried different ways but not able to come up with solution.

答案1

得分: 1

以下是您要翻译的内容:

# 仅筛选相关列
# 使用任何方法
df2 = df.drop(columns='Cell')

d = (df2.ffill(axis=1)
        .diff(axis=1)
     )

m = (d.where(df2.notna())
      .lt(0)
     )

df['is_increasing'] = ~m.any(axis=1)

df['failing'] = (
  m.mul(df2.columns)
   .where(m).stack()
   .groupby(level=0).agg(list)
)

df['diff'] = (d
   .where(m).stack()
   .groupby(level=0).agg(list)
 )

p = (df2.ffill(axis=1)
        .pct_change(axis=1)
        .mul(-100).round(2)
     )
df['diff%'] = (p
   .where(m).stack()
   .groupby(level=0).agg(list)
 )

print(df)

输出:

  Cell  D1   D2   D3    D6  is_increasing   failing          diff          diff%
0    A   5  NaN  7.0  17.0           True       NaN           NaN            NaN
1    B   2  5.0  NaN   3.0          False      [D6]        [-2.0]         [40.0]
2    C   2  6.0  5.0   NaN          False      [D3]        [-1.0]        [16.67]
3    D   6  NaN  5.0   NaN          False      [D3]        [-1.0]        [16.67]
4    E   6  3.0  NaN   2.0          False  [D2, D6]  [-3.0, -1.0]  [50.0, 33.33]

请注意,代码中的注释部分未被翻译。

英文:

You can use:

# filter only relevant columns
# use any method
df2 = df.drop(columns='Cell')

d = (df2.ffill(axis=1)
        .diff(axis=1)
     )

m = (d.where(df2.notna())
      .lt(0)
     )

df['is_increasing'] = ~m.any(axis=1)

df['failing'] = (
  m.mul(df2.columns)
   .where(m).stack()
   .groupby(level=0).agg(list)
)

df['diff'] = (d
   .where(m).stack()
   .groupby(level=0).agg(list)
 )

p = (df2.ffill(axis=1)
        .pct_change(axis=1)
        .mul(-100).round(2)
     )
df['diff%'] = (p
   .where(m).stack()
   .groupby(level=0).agg(list)
 )

print(df)

Output:

  Cell  D1   D2   D3    D6  is_increasing   failing          diff          diff%
0    A   5  NaN  7.0  17.0           True       NaN           NaN            NaN
1    B   2  5.0  NaN   3.0          False      [D6]        [-2.0]         [40.0]
2    C   2  6.0  5.0   NaN          False      [D3]        [-1.0]        [16.67]
3    D   6  NaN  5.0   NaN          False      [D3]        [-1.0]        [16.67]
4    E   6  3.0  NaN   2.0          False  [D2, D6]  [-3.0, -1.0]  [50.0, 33.33]

huangapple
  • 本文由 发表于 2023年2月14日 03:45:41
  • 转载请务必保留本文链接:https://go.coder-hub.com/75440543.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定