Adding AND logic over a list of columns to generate a feature

Question

I have a requirement to create a feature by checking fields across two different dataframes.
df1 is a dataframe loaded from a MySQL database. Its structure is static, but the values in its required column can change between TRUE and FALSE.

More details are below.

df1 =
flag   beta  required
alpha  A     TRUE
alpha  B     TRUE
alpha  C     FALSE
alpha  D     TRUE

df2 =
name  A  B  C  D  E  F
roy   1  1  0  1  0  0
john  0  1  1  1  0  0
sam   1  1  1  1  1  1

I am trying to create a feature alpha_flag from the df2 columns named in the "beta" column of df1, using only the rows where required is TRUE.

So, for the data above, the code for alpha_flag would look like this:

df2['alpha_flag'] = np.where((df2['A'] == 1) & (df2['B'] == 1) & (df2['D'] == 1), 1, 0)

The difficulty I am facing is selecting only the columns that df1 marks as required (TRUE) and building the alpha_flag condition from them dynamically.

Answer 1

Score: 1

IIUC, you can use boolean indexing to select the beta values where required is TRUE, then check whether those columns in df2 are all equal to 1:

df2['alpha_flag'] = df2[df1.loc[df1['required'].eq('TRUE'), 'beta'].tolist()].eq(1).all(axis=1).astype(int)
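As a quick check, the one-liner above can be run against hand-built copies of the sample frames from the question. This is only a minimal sketch: it assumes required holds the strings 'TRUE'/'FALSE' (the question does not say what dtype the MySQL load produces), and the name required_cols is introduced here purely for illustration.

```python
import pandas as pd

# Hand-built copies of the question's sample data.
# Assumption: 'required' is stored as strings, not booleans.
df1 = pd.DataFrame({
    "flag": ["alpha"] * 4,
    "beta": ["A", "B", "C", "D"],
    "required": ["TRUE", "TRUE", "FALSE", "TRUE"],
})
df2 = pd.DataFrame({
    "name": ["roy", "john", "sam"],
    "A": [1, 0, 1],
    "B": [1, 1, 1],
    "C": [0, 1, 1],
    "D": [1, 1, 1],
    "E": [0, 0, 1],
    "F": [0, 0, 1],
})

# Names of the df2 columns that df1 marks as required.
required_cols = df1.loc[df1["required"].eq("TRUE"), "beta"].tolist()

# A row gets 1 only if every required column equals 1.
df2["alpha_flag"] = df2[required_cols].eq(1).all(axis=1).astype(int)

print(required_cols)
print(df2[["name", "alpha_flag"]])
```

With this data, roy and sam have 1 in all of A, B and D, while john has A equal to 0, so only john ends up with alpha_flag 0.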

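Note that if every required value is FALSE, the column list is empty. all(axis=1) over an empty selection is vacuously True, so every row would get alpha_flag = 1, which is probably not what you want. A more robust version guards against that case:

cols = df1.loc[df1['required'].eq('TRUE'), 'beta'].tolist()
df2['alpha_flag'] = df2[cols].eq(1).all(axis=1).astype(int) if len(cols) else 0

Both snippets compare required against the string 'TRUE'. If the column is loaded as real booleans instead, use df1['required'] directly as the mask.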
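If the same rule has to be re-applied whenever df1 changes, the logic could be wrapped in a small function. The sketch below is one possible shape, not part of the original answer: the function name add_required_flag and its flag_name parameter are hypothetical, it assumes df1 has the flag, beta and required columns shown in the question, and it normalises required so that both 'TRUE'/'FALSE' strings and real booleans are accepted.

```python
import pandas as pd

def add_required_flag(df1, df2, flag_name="alpha"):
    """Hypothetical helper: return a copy of df2 with a '<flag_name>_flag' column.

    The flag is 1 where every df2 column that df1 marks as required
    for `flag_name` equals 1, and 0 otherwise.
    """
    # Accept both 'TRUE'/'FALSE' strings and booleans by comparing text forms.
    is_required = df1["required"].astype(str).str.upper().eq("TRUE")
    cols = df1.loc[is_required & df1["flag"].eq(flag_name), "beta"].tolist()

    out = df2.copy()
    if cols:
        out[f"{flag_name}_flag"] = out[cols].eq(1).all(axis=1).astype(int)
    else:
        # No required columns: fall back to 0 rather than the vacuous all-True result.
        out[f"{flag_name}_flag"] = 0
    return out
```

Calling add_required_flag(df1, df2) leaves the input df2 untouched and returns a new frame. Whether 0 is the right fallback when nothing is required depends on the use case, so treat that branch as a placeholder.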

huangapple
  • Published on May 13, 2023 at 14:40:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76241416.html