How to speed up a custom function

Question

How to speed up my custom function?

I have three lists of numbers:

list1
list2
list3

And a pandas DataFrame like this:

    id   inum   DESC_1  recs
    id1  inum1  1       recs1
    id2  inum2  2       recs2
    id3  inum3  3       recs3

And my custom function:

    def keep_inum(row):
        if len(row) != 0:
            if int(row['inum']) in list1:
                if row['DESC_1'] == 1:
                    return row['recs']
                else:
                    return ''
            elif int(row['inum']) in list2:
                if row['DESC_1'] == 2:
                    return row['recs']
                else:
                    return ''
            elif int(row['inum']) in list3:
                if row['DESC_1'] == 3:
                    return row['recs']
                else:
                    return ''
            else:
                return row['recs']
        else:
            pass

Apply the function to the DataFrame:

    df['recs'] = df.apply(keep_inum, axis=1)

Answer 1

Score: 1

By not using a custom function at all:

    import pandas as pd

    df = pd.DataFrame(
        {
            "id": ["id1", "id2", "id3", "id4"],
            "inum": ["111", "222", "333", "331"],
            "DESC_1": [1, 4, 3, 3],
            "recs": ["recs1", "recs2", "recs3", "yes"],
        }
    )
    print(df)
    print("---")

    list1 = [111]
    list2 = [222]
    list3 = [333, 331]

    # Cast inum to int in one go
    df["inum_int"] = df["inum"].astype(int)

    # Empty the recs where inum doesn't match desc
    df.loc[df["inum_int"].isin(list1) & ~(df["DESC_1"] == 1), "recs"] = ""
    df.loc[df["inum_int"].isin(list2) & ~(df["DESC_1"] == 2), "recs"] = ""
    df.loc[df["inum_int"].isin(list3) & ~(df["DESC_1"] == 3), "recs"] = ""

    df.drop(columns=["inum_int"], inplace=True)
    print(df)

This outputs:

        id inum  DESC_1   recs
    0  id1  111       1  recs1
    1  id2  222       4  recs2
    2  id3  333       3  recs3
    3  id4  331       3    yes
    ---
        id inum  DESC_1   recs
    0  id1  111       1  recs1
    1  id2  222       4
    2  id3  333       3  recs3
    3  id4  331       3    yes
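
One detail worth checking before swapping approaches: the three `.loc` assignments apply every list independently, while the `if`/`elif` chain in the question stops at the first list that contains `inum`. The two agree as long as the lists do not overlap. If they might overlap, a sketch along the following lines keeps the first-match behaviour while staying vectorized. The function name `blank_mismatched_recs` and the `rules` argument are hypothetical names introduced here for illustration; they are not part of the question or the answer.

```python
import numpy as np
import pandas as pd


def blank_mismatched_recs(df, rules):
    """Return a copy of df with 'recs' blanked where DESC_1 disagrees with inum's list.

    rules is an ordered sequence of (values, expected_desc) pairs
    (hypothetical structure). The first list containing a row's inum
    decides the outcome, mirroring the if/elif chain in the question.
    """
    inum = df["inum"].astype(int)      # cast once, not once per row
    desc = df["DESC_1"].to_numpy()

    claimed = np.zeros(len(df), dtype=bool)   # rows already matched by an earlier list
    mismatch = np.zeros(len(df), dtype=bool)  # rows whose recs must be emptied

    for values, expected_desc in rules:
        # Rows in this list that no earlier list has claimed
        hit = inum.isin(values).to_numpy() & ~claimed
        mismatch |= hit & (desc != expected_desc)
        claimed |= hit

    out = df.copy()                    # leave the caller's frame untouched
    out.loc[mismatch, "recs"] = ""
    return out


if __name__ == "__main__":
    df = pd.DataFrame(
        {
            "id": ["id1", "id2", "id3", "id4"],
            "inum": ["111", "222", "333", "331"],
            "DESC_1": [1, 4, 3, 3],
            "recs": ["recs1", "recs2", "recs3", "yes"],
        }
    )
    rules = [([111], 1), ([222], 2), ([333, 331], 3)]
    print(blank_mismatched_recs(df, rules))
```

Rows whose `inum` appears in none of the lists keep their `recs`, as in the final `else` branch of the original function. The loop runs once per list rather than once per row, so the cost stays close to that of the three `.loc` assignments.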

huangapple
  • Posted on February 6, 2023 at 18:57:14
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75360409.html