2023年6月15日 07:25:32go评论105阅读模式

英文:

Efficient caching strategy for a symmetric function in Python

问题

Here's the translation of the code portion:

我有以下形式的代码（这个示例要简单得多，但可以传达要点）：
```python
i = range(10)   # 实际上要大得多
def func(a, b): # 我的实际函数比这复杂得多，但重要的是 func(a, b) = func(b, a)
    return a + b
total = 0
for i_prime in i:
    for i_prime_prime in i:
        if i_prime != i_prime_prime:
            total += func(i_prime, i_prime_prime)

如何减少调用 func 的次数，利用了 func(a, b) = func(b, a) 这一事实。我想到了以下方法：

cache = {}
total = 0
for i_prime in i:
    for i_prime_prime in i:
        if i_prime != i_prime_prime:
            key = str(list({i_prime, i_prime_prime}))
            if key in cache.keys():
                total += cache[key]
            else:
                temp       = func(i_prime, i_prime_prime)
                cache[key] = temp
                total     += temp

但我觉得执行 key = str(list({i_prime, i_prime_prime})) 不是最高效的方法。


<details>
<summary>英文:</summary>
I have code of the following form (this example is much simpler but it gets the point across):
```python
i = range(10)   # in reality much larger
def func(a, b): # my actual function is much more complicated than this, but the
                # important thing is that func(a, b) = func(b, a)
    return a + b
total = 0
for i_prime in i:
    for i_prime_prime in i:
        if i_prime != i_prime_prime:
            total += func(i_prime, i_prime_prime)

How can I reduce the number of times I call func, taking advantage of the fact that func(a, b) = func(b, a). I came up with the following:

cache = {}
total = 0
for i_prime in i:
    for i_prime_prime in i:
        if i_prime != i_prime_prime:
            key = str(list({i_prime, i_prime_prime}))
            if key in cache.keys():
                total += cache[key]
            else:
                temp       = func(i_prime, i_prime_prime)
                cache[key] = temp
                total     += temp

But I feel like doing key = str(list({i_prime, i_prime_prime})) is not the most efficient way to do it.

答案1

得分: 1

我很懒。我只是使用Python已经存在的@cache，并接受轻微的效率损失：

@cache
def f(a, b):
    if a > b:
        return f(b, a)
    .... 计算f(a, b) ...

如果这变得非常时间关键，我会更加努力。但是三行代码几乎满足了我所需的一切。

英文:

I'm lazy. I'd just used Python's already existing @cache and live with the minor inefficiency:

@cache
def f(a, b):
    if a &gt; b:
        return f(b, a)
    .... calculate f(a, b) ...

If this turned out to be utterly time critical, I'd try harder. But three lines of code gives me almost everything I need.

答案2

得分: 1

If you call the function "irregularly," then caching/flipping like Frank showed is good. Or if you want to roll your own, then:

key = str(sorted((a, b)))
if key in cache:

key = tuple(sorted((a, b)))
if key in cache:

key = (a, b) if a < b else (b, a)
if key in cache:

would likely be better than your:

key = str(list({a, b}))
if key in cache.keys():

because:

Sorting a pair really does make (a, b) and (b, a) equal. Unlike your set, as for example list({0, 8}) and list({8, 0}) become the different keys [0, 8] and [8, 0] (implementation detail, but has been like this for many years) while sorting leads to [0, 8] for both.
Creating a keys view for each membership check is pointless and wasteful.
Tuple might be faster than string, at least it should be at creation, although it might be slower at hashing. Would have to measure, with the actual data.

But if you do something with all pairs, each pair once, like your demo shows and like your comments suggest, then it might be better to simply only produce one of (a, b) and (b, a), compute its function value, and incorporate it into your result twice:

from itertools import combinations
total = sum(
    2 * func(a, b)
    for a, b in combinations(i, 2)
)

Or even:

from itertools import combinations, starmap
total = 2 * sum(starmap(func, combinations(i, 2)))

Attempt This Online!

英文:

If you call the function "irregularly", then caching/flipping like Frank showed is good. Or if you want to roll your own, then

key = str(sorted((a, b)))
if key in cache:

key = tuple(sorted((a, b)))
if key in cache:

key = (a, b) if a &lt; b else (b, a)
if key in cache:

would likely be better than your

key = str(list({a, b}))
if key in cache.keys():

because:

Sorting a pair really does make (a, b) and (b, a) equal. Unlike your set, as for example list({0, 8}) and list({8, 0}) become the different keys [0, 8] and [8, 0] (implementation detail, but has been like this for many years) while sorting leads to [0, 8] for both.
Creating a keys view for each membership check is pointless and wasteful.
Tuple might be faster than string, at least it should be at creation, although it might be slower at hashing. Would have to measure, with the actual data.

from itertools import combinations
total = sum(
    2 * func(a, b)
    for a, b in combinations(i, 2)
)

Or even:

from itertools import combinations, starmap
total = 2 * sum(starmap(func, combinations(i, 2)))

Attempt This Online!

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

Efficient caching strategy for a symmetric function in Python

问题

答案1

答案2

如何改进嵌套理解的性能？

“Wrap each item in my dataset with double quotes not reading to csv”

在Kivy中四舍五入交替的边缘

CUDA在笔记本中可用，但在VS Code终端中不可用 – 使用相同的conda环境。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。