Keep a value in a column based on a condition
Question
import pandas as pd
data={'ip.src':['x.x.x.x','y.y.y.y','z.z.z.z'],
'ip.dst':['a.a.a.a','b.b.b.b','c.c.c.c'],
'src_country':['china','US','china'],
'dst_country':['pakistan','china','india']
}
Data=pd.DataFrame(data)
# Keep only the values in ip.src where src_country is 'china'
Data['ip.src'] = Data['ip.src'][Data['src_country'] == 'china']
# Keep only the values in ip.dst where dst_country is 'china'
Data['ip.dst'] = Data['ip.dst'][Data['dst_country'] == 'china']
# Drop rows where both ip.src and ip.dst are NaN
# (how='all' is required; the default how='any' drops a row if either value is NaN)
Data = Data.dropna(subset=['ip.src', 'ip.dst'], how='all')
# Reset the index
Data = Data.reset_index(drop=True)
Data
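This code keeps the values in the ip.src column only where src_country is 'china', and the values in the ip.dst column only where dst_country is 'china'. It then drops the rows where both ip.src and ip.dst are NaN and resets the index.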
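Whether a row is dropped when both columns are missing or when either one is missing depends on the `how` argument of `DataFrame.dropna`. A minimal sketch on a toy frame (the values are made up for illustration) comparing the two settings:

```python
import numpy as np
import pandas as pd

# Toy frame (hypothetical values): row 0 has one NaN, row 1 has two, row 2 has none.
df = pd.DataFrame({
    'ip.src': ['x.x.x.x', np.nan, 'z.z.z.z'],
    'ip.dst': [np.nan, np.nan, 'c.c.c.c'],
})

cols = ['ip.src', 'ip.dst']
any_dropped = df.dropna(subset=cols)             # default how='any': drop if either is NaN
all_dropped = df.dropna(subset=cols, how='all')  # drop only if both are NaN

print(any_dropped.index.tolist())  # [2]
print(all_dropped.index.tolist())  # [0, 2]
```

On the question's data every row ends up with exactly one NaN after masking, so the default `how='any'` would remove all rows, while `how='all'` keeps them.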
I have a DataFrame:
import pandas as pd
data = {'ip.src': ['x.x.x.x', 'y.y.y.y', 'z.z.z.z'],
        'ip.dst': ['a.a.a.a', 'b.b.b.b', 'c.c.c.c'],
        'src_country': ['china', 'US', 'china'],
        'dst_country': ['pakistan', 'china', 'india']
        }
Data = pd.DataFrame(data)
I want to keep a value in the ip.src and ip.dst columns only when the matching country column is china: if src_country is china, keep the value in ip.src, and if dst_country is china, keep the value in ip.dst. Is there any way to do this?
Answer 1
Score: 1
Something like this?
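import numpy as np
# Keep rows where either country is 'china'; .copy() avoids a SettingWithCopyWarning on the assignment below
Data = Data[(Data['src_country'] == 'china') | (Data['dst_country'] == 'china')].copy()
# Replace every country value other than 'china' with NaN
# (pandas 2.1+ deprecates applymap in favour of DataFrame.map)
Data[['src_country', 'dst_country']] = Data[['src_country', 'dst_country']].applymap(lambda x: np.nan if x != 'china' else x)
Data
    ip.src   ip.dst src_country dst_country
0  x.x.x.x  a.a.a.a       china         NaN
1  y.y.y.y  b.b.b.b         NaN       china
2  z.z.z.z  c.c.c.c       china         NaN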
Answer 2
Score: 1
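Use [`DataFrame.loc`](http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html) to modify the `ip.src`/`ip.dst` columns. The right-hand side contains only the matching rows, and the assignment aligns on the index, so the remaining rows become NaN:
Data['ip.src'] = Data.loc[Data['src_country'] == 'china', 'ip.src']
Data['ip.dst'] = Data.loc[Data['dst_country'] == 'china', 'ip.dst']
print(Data)
    ip.src   ip.dst src_country dst_country
0  x.x.x.x      NaN       china    pakistan
1      NaN  b.b.b.b          US       china
2  z.z.z.z      NaN       china       india
Or build a boolean mask from both country columns and apply it with `DataFrame.where`. Converting the mask with `to_numpy()` makes it apply by position, since its column labels differ from those of the IP columns:
m = Data[['src_country', 'dst_country']] == 'china'
Data[['ip.src', 'ip.dst']] = Data[['ip.src', 'ip.dst']].where(m.to_numpy())
print(Data)
    ip.src   ip.dst src_country dst_country
0  x.x.x.x      NaN       china    pakistan
1      NaN  b.b.b.b          US       china
2  z.z.z.z      NaN       china       india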
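The same masking can also be written one column at a time with `Series.where`, which may be easier to read when more column pairs are involved. A minimal sketch, assuming a hypothetical helper named `keep_where_country` (not part of pandas) that takes a mapping from each value column to its country column:

```python
import pandas as pd

def keep_where_country(df, pairs, country='china'):
    """Return a copy of df in which each value column is NaN
    unless its paired country column equals `country`."""
    out = df.copy()  # leave the caller's frame untouched
    for value_col, country_col in pairs.items():
        # Series.where keeps values where the condition is True, NaN elsewhere
        out[value_col] = out[value_col].where(out[country_col].eq(country))
    return out

Data = pd.DataFrame({
    'ip.src': ['x.x.x.x', 'y.y.y.y', 'z.z.z.z'],
    'ip.dst': ['a.a.a.a', 'b.b.b.b', 'c.c.c.c'],
    'src_country': ['china', 'US', 'china'],
    'dst_country': ['pakistan', 'china', 'india'],
})

result = keep_where_country(Data, {'ip.src': 'src_country', 'ip.dst': 'dst_country'})
print(result)
```

Because the helper works on a copy, the original `Data` keeps all of its IP values.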
Answer 3
Score: 1
Data['ip.src'] = Data['ip.src'][(Data['src_country'] == 'china')]
Data['ip.dst'] = Data['ip.dst'][(Data['dst_country'] == 'china')]
Output:
    ip.src   ip.dst src_country dst_country
0  x.x.x.x      NaN       china    pakistan
1      NaN  b.b.b.b          US       china
2  z.z.z.z      NaN       china       india
Answer 4
Score: 0
Another possible solution, using `np.where` to keep an IP where the matching country is china and write NaN elsewhere:
import numpy as np
Data[['ip.src', 'ip.dst']] = np.where(
    Data[['src_country', 'dst_country']].eq('china'),
    Data[['ip.src', 'ip.dst']], np.nan)
Output:
    ip.src   ip.dst src_country dst_country
0  x.x.x.x      NaN       china    pakistan
1      NaN  b.b.b.b          US       china
2  z.z.z.z      NaN       china       india
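With `np.where(condition, a, b)`, each cell takes its value from `a` where the condition is true and from `b` elsewhere, so the order of the last two arguments decides which IPs survive. A small sketch of both orders on the question's data; the variable names `is_china`, `ips`, `kept` and `blanked` are illustrative only:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ip.src': ['x.x.x.x', 'y.y.y.y', 'z.z.z.z'],
    'ip.dst': ['a.a.a.a', 'b.b.b.b', 'c.c.c.c'],
    'src_country': ['china', 'US', 'china'],
    'dst_country': ['pakistan', 'china', 'india'],
})

# Boolean array, True where the paired country column is 'china'
is_china = df[['src_country', 'dst_country']].eq('china').to_numpy()
# Object dtype so strings and NaN can share one array
ips = df[['ip.src', 'ip.dst']].to_numpy(dtype=object)

kept = np.where(is_china, ips, np.nan)     # IPs stay where the country is china
blanked = np.where(is_china, np.nan, ips)  # the same IPs are replaced with NaN instead

print(kept)
print(blanked)
```

Swapping the two value arguments inverts the result, which is worth checking against the behaviour the question asks for.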