How to speed up custom function
Question
How to speed up my custom function?
I have three lists of numbers:
list1
list2
list3
And a pandas DataFrame like this:
| id | inum | DESC_1 | recs |
|---|---|---|---|
| id1 | inum1 | 1 | recs1 |
| id2 | inum2 | 2 | recs2 |
| id3 | inum3 | 3 | recs3 |
And my custom function:
def keep_inum(row):
    if len(row) != 0:
        if int(row['inum']) in list1:
            if row['DESC_1'] == 1:
                return row['recs']
            else:
                return ''
        elif int(row['inum']) in list2:
            if row['DESC_1'] == 2:
                return row['recs']
            else:
                return ''
        elif int(row['inum']) in list3:
            if row['DESC_1'] == 3:
                return row['recs']
            else:
                return ''
        else:
            return row['recs']
    else:
        pass
I apply the function to the DataFrame like this:
df['recs'] = df.apply(keep_inum, axis=1)
Answer 1
Score: 1
By not using a custom function at all:
import pandas as pd

df = pd.DataFrame(
    {
        "id": ["id1", "id2", "id3", "id4"],
        "inum": ["111", "222", "333", "331"],
        "DESC_1": [1, 4, 3, 3],
        "recs": ["recs1", "recs2", "recs3", "yes"],
    }
)
print(df)
print("---")

list1 = [111]
list2 = [222]
list3 = [333, 331]

# Cast inum to int in one go
df["inum_int"] = df["inum"].astype(int)

# Empty the recs where inum doesn't match desc
df.loc[df["inum_int"].isin(list1) & ~(df["DESC_1"] == 1), "recs"] = ""
df.loc[df["inum_int"].isin(list2) & ~(df["DESC_1"] == 2), "recs"] = ""
df.loc[df["inum_int"].isin(list3) & ~(df["DESC_1"] == 3), "recs"] = ""

df.drop(columns=["inum_int"], inplace=True)
print(df)
This outputs:
    id inum  DESC_1   recs
0  id1  111       1  recs1
1  id2  222       4  recs2
2  id3  333       3  recs3
3  id4  331       3    yes
---
    id inum  DESC_1   recs
0  id1  111       1  recs1
1  id2  222       4
2  id3  333       3  recs3
3  id4  331       3    yes
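If the number of lists grows, the same vectorized idea can be expressed with a single lookup instead of one `.loc` line per list. The following is only a sketch, not part of the answer above: `expected_desc`, `expected`, and `mismatch` are hypothetical names introduced for illustration, and it assumes every `inum` value can be cast to `int`. The lists are written into the mapping in reverse order so that `list1` wins if a number appears in more than one list, mirroring the `if`/`elif` order of the original function.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "id": ["id1", "id2", "id3", "id4"],
        "inum": ["111", "222", "333", "331"],
        "DESC_1": [1, 4, 3, 3],
        "recs": ["recs1", "recs2", "recs3", "yes"],
    }
)

list1 = [111]
list2 = [222]
list3 = [333, 331]

# Hypothetical helper mapping: each inum -> the DESC_1 value it requires.
# Later lists are written first so that list1 takes precedence on overlap.
expected_desc = {}
for desc, inums in ((3, list3), (2, list2), (1, list1)):
    expected_desc.update(dict.fromkeys(inums, desc))

# inum values that are in none of the lists map to NaN
expected = df["inum"].astype(int).map(expected_desc)

# Blank recs only where an expectation exists and DESC_1 differs from it
mismatch = expected.notna() & (expected != df["DESC_1"])
df["recs"] = df["recs"].mask(mismatch, "")

print(df)
```

On the sample data this should blank only the `recs` value of `id2`, the same result as the three `.loc` assignments. Whether it is faster than those assignments depends on the data; the main benefit is that adding a fourth list only changes the mapping.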