python3的itertools.filterfalse非常慢。有哪些替代方法?

huangapple go评论53阅读模式
英文:

python3 itertools.filterfalse is very slow. What are the alternatives?

问题

我想要获取两个字典变量列表之间的差异。
这两个列表每个包含 100,000 个项目。
当我编写下面的代码时,它花费了超过 2 分钟的时间(我的旧笔记本电脑配备了 Intel® Core™ i7-4700HQ CPU @ 2.40GHz × 8):

added_items = list(itertools.filterfalse(lambda x: x in records_old, records_new))

这是一个示例记录:

[{"email": "user-01@test.com", "login": "user-01", "type": "member"},

我还需要比较键和值两者之间的差异。
您能否建议其他更快的方法?

英文:

I want to get difference between two list of dict variables.
Those list contains 100,000 items each.
When I write below code it took more than 2 minutes. (My old laptop with Intel® Core™ i7-4700HQ CPU @ 2.40GHz × 8)

added_items = list(itertools.filterfalse(lambda x: x in records_old, records_new))

This is sample record;

[{"email": "user-01@test.com", "login": "user-01", "type": "member"},

Also I need to compare both keys and values.
Can you please suggest some other faster way for this?

答案1

得分: 3

你的代码具有*O(n^2)的时间复杂度和O(1)*的空间复杂度。

在编程中,我们经常可以通过牺牲时间来换取空间。这种方法的时间复杂度应该是O(n),空间复杂度也是O(n),并且在运行时间上有所改进:

records_old_set = {frozenset(r.items()) for r in records_old}
added_items = [r for r in records_new if frozenset(r.items()) not in records_old_set]
英文:

Your code is O(n^2) time complexity and O(1) space.

In programming, we can often trade time for space. This approach should be O(n) time and O(n) space, with an improvement in runtime:

records_old_set = {frozenset(r.items()) for r in records_old}
added_items = [r for r in records_new if frozenset(r.items()) not in records_old_set]

huangapple
  • 本文由 发表于 2020年1月4日 12:34:48
  • 转载请务必保留本文链接:https://go.coder-hub.com/59587913.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定