Question
How to speed up my custom function?
I have three lists of numbers:
list1
list2
list3
And a pandas DataFrame like this:
| id | inum | DESC_1 | recs | 
|---|---|---|---|
| id1 | inum1 | 1 | recs1 | 
| id2 | inum2 | 2 | recs2 | 
| id3 | inum3 | 3 | recs3 | 
And my custom function:
def keep_inum(row):
    if len(row) != 0:
        if int(row['inum']) in list1:
            if row['DESC_1'] == 1:
                return row['recs']
            else:
                return ''
        elif int(row['inum']) in list2:
            if row['DESC_1'] == 2:
                return row['recs']
            else:
                return ''
        elif int(row['inum']) in list3:
            if row['DESC_1'] == 3:
                return row['recs']
            else:
                return ''
        else:
            return row['recs']
    else:
        pass
I apply the function to the DataFrame row by row:
df['recs'] = df.apply(keep_inum, axis=1)
Answer 1
Score: 1
By not using a custom function at all:
import pandas as pd
df = pd.DataFrame(
    {
        "id": ["id1", "id2", "id3", "id4"],
        "inum": ["111", "222", "333", "331"],
        "DESC_1": [1, 4, 3, 3],
        "recs": ["recs1", "recs2", "recs3", "yes"],
    }
)
print(df)
print("---")
list1 = [111]
list2 = [222]
list3 = [333, 331]
# Cast inum to int in one go
df["inum_int"] = df["inum"].astype(int)
# Empty the recs where inum doesn't match desc
df.loc[df["inum_int"].isin(list1) & ~(df["DESC_1"] == 1), "recs"] = ""
df.loc[df["inum_int"].isin(list2) & ~(df["DESC_1"] == 2), "recs"] = ""
df.loc[df["inum_int"].isin(list3) & ~(df["DESC_1"] == 3), "recs"] = ""
df.drop(columns=["inum_int"], inplace=True)
print(df)
This outputs (before and after the update):
    id inum  DESC_1   recs
0  id1  111       1  recs1
1  id2  222       4  recs2
2  id3  333       3  recs3
3  id4  331       3    yes
---
    id inum  DESC_1   recs
0  id1  111       1  recs1
1  id2  222       4       
2  id3  333       3  recs3
3  id4  331       3    yes
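If the three lists grow, the three separate `.loc` updates could probably be folded into a single lookup. The following is only a sketch under assumptions: `expected_desc` is a hypothetical helper dictionary (not part of the question or the answer above) that maps each `inum` to the `DESC_1` value it requires, and the sample data is the same illustrative frame used above. When an `inum` appears in more than one list, the dictionary is built so that `list1` wins over `list2` and `list2` over `list3`, mirroring the `if`/`elif` order of `keep_inum`.

```python
import pandas as pd

# Same illustrative data as in the answer above.
df = pd.DataFrame(
    {
        "id": ["id1", "id2", "id3", "id4"],
        "inum": ["111", "222", "333", "331"],
        "DESC_1": [1, 4, 3, 3],
        "recs": ["recs1", "recs2", "recs3", "yes"],
    }
)
list1 = [111]
list2 = [222]
list3 = [333, 331]

# Hypothetical helper: inum -> required DESC_1 value.
# Lower-priority lists are inserted first so that list1 overrides list2,
# and list2 overrides list3, like the if/elif chain in keep_inum.
expected_desc = {}
for desc, inums in ((3, list3), (2, list2), (1, list1)):
    expected_desc.update(dict.fromkeys(inums, desc))

# inum values that are in none of the lists map to NaN.
expected = df["inum"].astype(int).map(expected_desc)

# Blank recs only where a required DESC_1 exists and does not match.
mismatch = expected.notna() & (expected != df["DESC_1"])
df["recs"] = df["recs"].mask(mismatch, "")
print(df)
```

On the sample data this blanks only the `recs` value of `id2`, the same result as the three `.loc` assignments. Whether it is actually faster on real data would need to be measured.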