英文:
python3 itertools.filterfalse is very slow. What are the alternatives?
问题
我想要获取两个字典变量列表之间的差异。
这两个列表每个包含 100,000 个项目。
当我编写下面的代码时,它花费了超过 2 分钟的时间(我的旧笔记本电脑配备了 Intel® Core™ i7-4700HQ CPU @ 2.40GHz × 8):
added_items = list(itertools.filterfalse(lambda x: x in records_old, records_new))
这是一个示例记录:
[{"email": "user-01@test.com", "login": "user-01", "type": "member"},
我还需要比较键和值两者之间的差异。
您能否建议其他更快的方法?
英文:
I want to get difference between two list of dict variables.
Those list contains 100,000 items each.
When I write below code it took more than 2 minutes. (My old laptop with Intel® Core™ i7-4700HQ CPU @ 2.40GHz × 8)
added_items = list(itertools.filterfalse(lambda x: x in records_old, records_new))
This is sample record;
[{"email": "user-01@test.com", "login": "user-01", "type": "member"},
Also I need to compare both keys and values.
Can you please suggest some other faster way for this?
答案1
得分: 3
你的代码具有*O(n^2)的时间复杂度和O(1)*的空间复杂度。
在编程中,我们经常可以通过牺牲时间来换取空间。这种方法的时间复杂度应该是O(n),空间复杂度也是O(n),并且在运行时间上有所改进:
records_old_set = {frozenset(r.items()) for r in records_old}
added_items = [r for r in records_new if frozenset(r.items()) not in records_old_set]
英文:
Your code is O(n^2) time complexity and O(1) space.
In programming, we can often trade time for space. This approach should be O(n) time and O(n) space, with an improvement in runtime:
records_old_set = {frozenset(r.items()) for r in records_old}
added_items = [r for r in records_new if frozenset(r.items()) not in records_old_set]
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论