How to speed up a custom function

Question

How to speed up my custom function?

I have three lists of numbers:

list1
list2
list3

And a pandas DataFrame like this:

id inum DESC_1 recs
id1 inum1 1 recs1
id2 inum2 2 recs2
id3 inum3 3 recs3

And my custom function:

def keep_inum(row):
    if len(row) != 0:
        if int(row['inum']) in list1:
            if row['DESC_1'] == 1:
                return row['recs']
            else:
                return ''
        elif int(row['inum']) in list2:
            if row['DESC_1'] == 2:
                return row['recs']
            else:
                return ''
        elif int(row['inum']) in list3:
            if row['DESC_1'] == 3:
                return row['recs']
            else:
                return ''
        else:
            return row['recs']
    else:
        pass

I apply the function to the DataFrame:

df['recs'] = df.apply(keep_inum, axis=1)

Answer 1

Score: 1

By not using a custom function at all:

import pandas as pd

df = pd.DataFrame(
    {
        "id": ["id1", "id2", "id3", "id4"],
        "inum": ["111", "222", "333", "331"],
        "DESC_1": [1, 4, 3, 3],
        "recs": ["recs1", "recs2", "recs3", "yes"],
    }
)

print(df)
print("---")

list1 = [111]
list2 = [222]
list3 = [333, 331]

# Cast inum to int in one go
df["inum_int"] = df["inum"].astype(int)
# Empty the recs where inum doesn't match desc
df.loc[df["inum_int"].isin(list1) & ~(df["DESC_1"] == 1), "recs"] = ""
df.loc[df["inum_int"].isin(list2) & ~(df["DESC_1"] == 2), "recs"] = ""
df.loc[df["inum_int"].isin(list3) & ~(df["DESC_1"] == 3), "recs"] = ""
df.drop(columns=["inum_int"], inplace=True)
print(df)

This outputs:

    id inum  DESC_1   recs
0  id1  111       1  recs1
1  id2  222       4  recs2
2  id3  333       3  recs3
3  id4  331       3    yes
---
    id inum  DESC_1   recs
0  id1  111       1  recs1
1  id2  222       4       
2  id3  333       3  recs3
3  id4  331       3    yes
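
One caveat that may or may not matter for your data: the original function uses if/elif, so when an inum appears in more than one list, only the first matching list is checked. The three .loc assignments above apply each list independently, so overlapping lists could blank a row that the original function would keep. If that is a concern, a minimal sketch with numpy.select keeps the first-match behaviour. The sample data here is made up for illustration (111 is deliberately placed in both list1 and list2), and the names inum, desc, keep and result are just local helpers.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 111 is in both list1 and list2 on purpose.
list1 = [111]
list2 = [222, 111]
list3 = [333, 331]

df = pd.DataFrame(
    {
        "id": ["id1", "id2", "id3", "id4", "id5"],
        "inum": ["111", "222", "333", "331", "999"],
        "DESC_1": [1, 4, 3, 2, 7],
        "recs": ["recs1", "recs2", "recs3", "recs4", "recs5"],
    }
)

# Cast once for the whole column instead of once per row.
inum = df["inum"].astype(int)
desc = df["DESC_1"]

# np.select takes the first condition that is True for each row,
# which mirrors the if/elif chain of the original function.
conditions = [inum.isin(list1), inum.isin(list2), inum.isin(list3)]
# For each branch: True where DESC_1 matches the value that branch expects.
choices = [desc == 1, desc == 2, desc == 3]
# Rows whose inum is in none of the lists keep their recs (default=True).
keep = np.select(conditions, choices, default=True)

# Keep recs where `keep` is True, otherwise replace it with an empty string.
result = df["recs"].where(keep, "")
print(result.tolist())
```

Row id1 is kept here because list1 matches first and DESC_1 is 1, even though 111 is also in list2. Assigning the result back with df["recs"] = result gives the same effect as the apply call in the question, without the per-row Python overhead.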

huangapple
  • Posted on 2023-02-06 18:57:14
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75360409.html