合并两个满足条件的数据框。

huangapple go评论83阅读模式
英文:

Merge two dataframes where requirements are met

问题

我有两个数据框,我想在满足特定条件的情况下将它们合并。

我想将DF2合并到DF1中,其中客户端MAC和AP MAC匹配,但同时"Description"包含"AP",且"Client VLAN"等于"1234"。然后,我想用相应的"Interface"填充"AP Interface"。

我的实际DF1有超过10,000行,而DF2通常只有300行或更少。

DF1:

Switch Interface Description Client VLAN Client MAC
SW1 Gi1/0/1 AP Port 1234 1234.1234.1234
SW1 Gi1/0/1 AP Port 2344 3210.3210.3210
SW1 Gi1/0/2 AP Port 1234 3456.3456.3456
SW2 Gi2/0/5 Computer 2343 6543.6543.6543
SW2 Gi2/0/8 Trunk 1234 2345.2345.2345
SW3 Te1/1/1 AP Port 1234 7890.7890.7890
SW4 Gi3/0/6 AP Port 1234 2345.2345.2345

DF2:

AP Name AP Intf AP MAC
AP1 1234.1234.1234
AP2 3456.3456.3456
AP3 2345.2345.2345
AP4 7890.7890.7890

期望的输出:

Switch Interface Description Client VLAN Client MAC AP Name AP Intf AP MAC
SW1 Gi1/0/1 AP Port 1234 1234.1234.1234 AP1 Gi1/0/1 1234.1234.1234
SW1 Gi1/0/1 AP Port 2344 3210.3210.3210
SW1 Gi1/0/2 AP Port 1234 3456.3456.3456 AP2 Gi1/0/2 3456.3456.3456
SW2 Gi2/0/5 Computer 2343 6543.6543.6543
SW2 Gi2/0/8 Trunk 1234 2345.2345.2345
SW3 Te1/1/1 AP Port 1234 7890.7890.7890 AP4 Te1/1/1 7890.7890.7890
SW4 Gi3/0/6 AP Port 1234 2345.2345.2345 AP3 Gi3/0/6 2345.2345.2345

我正在使用以下代码,但输出结果不正确:

def helper(data, col): return data.groupby(col).cumcount()

df1['Client MAC'] = df1['Client MAC'].fillna("")
df2['AP MAC'] = df2['AP MAC'].fillna("")

df = df1.merge(df2, left_on=['Client MAC', helper(df1, ['Client MAC'])], right_on=['AP MAC', helper(df2, \
                        ['AP MAC'])], how='left').drop(columns='key_1')

当前输出:

Switch Interface Description Client VLAN Client MAC AP Name AP Intf AP MAC
SW1 Gi1/0/1 AP Port 1234 1234.1234.1234 AP1 Gi1/0/1 1234.1234.1234
SW1 Gi1/0/1 AP Port 2344 3210.3210.3210
SW1 Gi1/0/2 AP Port 1234 3456.3456.3456 AP2 Gi1/0/2 3456.3456.3456
SW2 Gi2/0/5 Computer 2343 6543.6543.6543
SW2 Gi2/0/8 Trunk 1234 2345.2345.2345 AP3 Gi3/0/6 2345.2345.2345
SW3 Te1/1/1 AP Port 1234 7890.7890.7890 AP4 Te1/1/1 7890.7890.7890
SW4 Gi3/0/6 AP Port 1234 2345.2345.2345
英文:

I have two dataframes that I am trying to merge where certain criteria is met.

I want to merge DF2 into DF1 where the Client MAC and AP MAC match but also where the 'Description' contains 'AP' and the 'Client VLAN' is equal to '1234'. I then want to populate the 'AP Interface' with the corresponding 'Interface'.

My actual DF1 is over 10k rows and DF2 is typically 300 or less.

DF1:

Switch Interface Description Client VLAN Client MAC
SW1 Gi1/0/1 AP Port 1234 1234.1234.1234
SW1 Gi1/0/1 AP Port 2344 3210.3210.3210
SW1 Gi1/0/2 AP Port 1234 3456.3456.3456
SW2 Gi2/0/5 Computer 2343 6543.6543.6543
SW2 Gi2/0/8 Trunk 1234 2345.2345.2345
SW3 Te1/1/1 AP Port 1234 7890.7890.7890
SW4 Gi3/0/6 AP Port 1234 2345.2345.2345

DF2:

AP Name AP Intf AP MAC
AP1 1234.1234.1234
AP2 3456.3456.3456
AP3 2345.2345.2345
AP4 7890.7890.7890

Desired output:

Switch Interface Description Client VLAN Client MAC AP Name AP Intf AP MAC
SW1 Gi1/0/1 AP Port 1234 1234.1234.1234 AP1 Gi1/0/1 1234.1234.1234
SW1 Gi1/0/1 AP Port 2344 3210.3210.3210
SW1 Gi1/0/2 AP Port 1234 3456.3456.3456 AP2 Gi1/0/2 3456.3456.3456
SW2 Gi2/0/5 Computer 2343 6543.6543.6543
SW2 Gi2/0/8 Trunk 1234 2345.2345.2345
SW3 Te1/1/1 AP Port 1234 7890.7890.7890 AP4 Te1/1/1 7890.7890.7890
SW4 Gi3/0/6 AP Port 1234 2345.2345.2345 AP3 Gi3/0/6 2345.2345.2345

The code I'm using that is giving me the below output:

def helper(data, col): return data.groupby(col).cumcount()

df1['Client MAC'] = df1['Client MAC'].fillna("")
df2['AP MAC'] = df2['AP MAC'].fillna("")

df = df1.merge(df2, left_on=['Client MAC', helper(df1, ['Client MAC'])], right_on=['AP MAC', helper(df2, \
                        ['AP MAC'])], how='left').drop(columns='key_1')

Current output:

Switch Interface Description Client VLAN Client MAC AP Name AP Intf AP MAC
SW1 Gi1/0/1 AP Port 1234 1234.1234.1234 AP1 Gi1/0/1 1234.1234.1234
SW1 Gi1/0/1 AP Port 2344 3210.3210.3210
SW1 Gi1/0/2 AP Port 1234 3456.3456.3456 AP2 Gi1/0/2 3456.3456.3456
SW2 Gi2/0/5 Computer 2343 6543.6543.6543
SW2 Gi2/0/8 Trunk 1234 2345.2345.2345 AP3 Gi3/0/6 2345.2345.2345
SW3 Te1/1/1 AP Port 1234 7890.7890.7890 AP4 Te1/1/1 7890.7890.7890
SW4 Gi3/0/6 AP Port 1234 2345.2345.2345

答案1

得分: 3

使用merge + mask操作:

df = df1.merge(df2, left_on='Client MAC', right_on='AP MAC', how='left')
df['AP Intf'] = (df['AP Intf'].mask(df['Description'].str.contains('AP')
                                    & df['Client VLAN'].eq(1234), df['Interface']))

  Switch Interface Description  Client VLAN      Client MAC AP Name   AP Intf          AP MAC
0   SW1   Gi1/0/1     AP Port          1234  1234.1234.1234     AP1   Gi1/0/1   1234.1234.1234
1   SW1   Gi1/0/1     AP Port          2344  3210.3210.3210     NaN       NaN              NaN
2   SW1   Gi1/0/2     AP Port          1234  3456.3456.3456     AP2   Gi1/0/2   3456.3456.3456
3   SW2   Gi2/0/5    Computer          2343  6543.6543.6543     NaN       NaN              NaN
4   SW2   Gi2/0/8       Trunk          1234  2345.2345.2345     AP3       NaN   2345.2345.2345
5   SW3   Te1/1/1     AP Port          1234  7890.7890.7890     AP4   Te1/1/1   7890.7890.7890
6   SW4   Gi3/0/6     AP Port          1234  2345.2345.2345     AP3   Gi3/0/6   2345.2345.2345
英文:

With merge + mask operations:

df = df1.merge(df2, left_on='Client MAC', right_on='AP MAC', how='left')
df['AP Intf'] = (df['AP Intf'].mask(df['Description'].str.contains('AP')
                                    & df['Client VLAN'].eq(1234), df['Interface']))

  Switch Interface Description  Client VLAN      Client MAC AP Name   AP Intf          AP MAC
0   SW1   Gi1/0/1     AP Port          1234  1234.1234.1234     AP1   Gi1/0/1   1234.1234.1234
1   SW1   Gi1/0/1     AP Port          2344  3210.3210.3210     NaN       NaN              NaN
2   SW1   Gi1/0/2     AP Port          1234  3456.3456.3456     AP2   Gi1/0/2   3456.3456.3456
3   SW2   Gi2/0/5    Computer          2343  6543.6543.6543     NaN       NaN              NaN
4   SW2   Gi2/0/8       Trunk          1234  2345.2345.2345     AP3       NaN   2345.2345.2345
5   SW3   Te1/1/1     AP Port          1234  7890.7890.7890     AP4   Te1/1/1   7890.7890.7890
6   SW4   Gi3/0/6     AP Port          1234  2345.2345.2345     AP3   Gi3/0/6   2345.2345.2345

huangapple
  • 本文由 发表于 2023年8月8日 23:04:32
  • 转载请务必保留本文链接:https://go.coder-hub.com/76860835.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定