2023年4月20日 05:17:29go评论77阅读模式

英文:

No general solution to k-combination of multiset?

问题

以下是您要翻译的内容：

The problem to solve

解决可能包含重复项的 k-组合问题，并仅返回唯一的组合。

例如，输入为 [0,1,2,2,4]，k = 2，结果为：

{(0, 1), (2, 4), (1, 2), (0, 4), (1, 4), (0, 2), (2, 2)}

The solution I can think of

我能想到的唯一通用解决方案是使用 set 或 map 来提取唯一结果。

例如，在Python中：

from itertools import combinations
print(set(combinations([0, 1, 2, 2, 4], 2)))

Questions

是否有一种通用解决方案可以在不使用 set/map 进行过滤的情况下完成此操作。

我可以生成按顺序的唯一项组合（参考：https://stackoverflow.com/a/76048486），但如果存在重复项，则会有重复的组合。

我阅读了这个：https://math.stackexchange.com/a/554237，似乎它说没有一种通用解决方案可以获取所有唯一的组合，尽管有一种获得唯一组合数量的公式。

但是，我不确定。

@Update - Summary & tips

根据答案，可以通过迭代和递归两种方式来完成此操作。
从python翻译为go时：
- 在Python中的yield需要在Go中使用chan，或者需要返回所有组合（对于大量输入可能很大）。
- Python中的else用于for和while循环，当使用break时在Go中会省略。
- Python中的列表切片会创建一个副本（或引用？），而在Go中切片一个切片将重用相同的底层数组（因此在Go中，如果要复制的切片稍后将被修改，而且不希望在副本中出现该更改，则应将其复制/附加到新切片中）。

英文:

The problem to solve

Solve k-combination whose input may contain repeated items, and return only unique combinations.

e.g input is [0,1,2,2,4], k = 2, the result is:
> {(0, 1), (2, 4), (1, 2), (0, 4), (1, 4), (0, 2), (2, 2)}

The solution I can think of

The only general solution I can think is to use a set or map to extract unique results.

e.g with python

from itertools import combinations
print (set(combinations([0, 1, 2, 2, 4], 2)))

Questions

Is there a general solution to do this, without using a set/map for filtering.

I can generate combinations of unique items, in order (refer: https://stackoverflow.com/a/76048486), but if there are repeated items, then there will be repeated combinations.

I've read this: https://math.stackexchange.com/a/554237 , seems it says there is no general solution to get all unique combinations, though there do have a formula to get count of unique combinations.

But, I'm not sure.

@Update - Summary & tips

According to the answers, there are ways both via iteration & recursion to do this.
When translating from python to go:
- yield in python, need chan in go, or u need return all combinations (which can be huge for large input).
- python's else for for and while is omitted when break.
- list slicing in python will make a copy (of reference ?), while slicing a go slice will reuse the same underlying array, (so in go, if the slice being copied will be modified later, and u don't want that change in the copy, then u should copy/append it into a new slice).

答案1

得分: 3

以下是提供的Python代码的中文翻译：

这是一个按词典顺序优化简单性而不是速度的非递归组合枚举器。Knuth的《计算机程序设计艺术》几乎肯定有一个速度优化的版本。

def combinations(a, k):
    a = sorted(a)
    while True:
        yield a[:k]
        for i in range(k - 1, -1, -1):
            # a[i + 1 :] 已排序。对 a[i:] 进行排序。
            x = a[i]
            j = i + 1
            while j < len(a):
                if x < a[j]:
                    break
                a[j - 1] = a[j]
                j += 1
            a[j - 1] = x
            # a[j] 是 x 之后的下一个最大元素。如果后面有足够多的元素，将它们旋转到适当位置。
            if j + (k - i) <= len(a):
                rotate(a, i, j, j + (k - i))
                break
        else:
            break
def rotate(a, i, j, k):
    reverse(a, i, j)
    reverse(a, j, k)
    reverse(a, i, k)
def reverse(a, i, k):
    for j in range((k - i) // 2):
        a[i + j], a[(k - 1) - j] = a[(k - 1) - j], a[i + j]
print(*combinations([0, 1, 2, 2, 4], 2))
print(*combinations([0, 1, 2, 2, 4], 3))
print(*combinations([0, 0, 0, 1, 1, 1], 3))

输出：

[0, 1] [0, 2] [0, 4] [1, 2] [1, 4] [2, 2] [2, 4]
[0, 1, 2] [0, 1, 4] [0, 2, 2] [0, 2, 4] [1, 2, 2] [1, 2, 4] [2, 2, 4]
[0, 0, 0] [0, 0, 1] [0, 1, 1] [1, 1, 1]

英文:

Here’s a non-recursive enumerator for combinations in lexicographic order, optimized for simplicity over speed. Knuth’s The Art of Computer Programming almost certainly has a speed-optimized version.

def combinations(a, k):
    a = sorted(a)
    while True:
        yield a[:k]
        for i in range(k - 1, -1, -1):
            # a[i + 1 :] is sorted. Sort a[i:].
            x = a[i]
            j = i + 1
            while j &lt; len(a):
                if x &lt; a[j]:
                    break
                a[j - 1] = a[j]
                j += 1
            a[j - 1] = x
            # a[j] is the next greatest element after x. If there are enough
            # elements after it, rotate them into position.
            if j + (k - i) &lt;= len(a):
                rotate(a, i, j, j + (k - i))
                break
        else:
            break
def rotate(a, i, j, k):
    reverse(a, i, j)
    reverse(a, j, k)
    reverse(a, i, k)
def reverse(a, i, k):
    for j in range((k - i) // 2):
        a[i + j], a[(k - 1) - j] = a[(k - 1) - j], a[i + j]
print(*combinations([0, 1, 2, 2, 4], 2))
print(*combinations([0, 1, 2, 2, 4], 3))
print(*combinations([0, 0, 0, 1, 1, 1], 3))

Output:

[0, 1] [0, 2] [0, 4] [1, 2] [1, 4] [2, 2] [2, 4]
[0, 1, 2] [0, 1, 4] [0, 2, 2] [0, 2, 4] [1, 2, 2] [1, 2, 4] [2, 2, 4]
[0, 0, 0] [0, 0, 1] [0, 1, 1] [1, 1, 1]

答案2

得分: 1

以下是使用Python生成这些内容的代码。

我正在使用迭代器。将其转化为另一种语言可能是翻译这个过程中最困难的部分。

def unique_multisets(multiset, k):
    # 查找每个项目的重复次数。
    count = {}
    for x in multiset:
        count[x] = 1 + count.get(x, 0)
    # 将其转化为（值，重复次数）对。reverse是因为我们稍后会在生成多重集时将其反转。
    pairs = list(reversed(count.items()))
    # 计算累积总数。
    total = 0
    totals = []
    for p in pairs:
        total += p[1]
        totals.append(total)
    def recurse(partial, idx):
        if k == len(partial):
            yield partial
        elif idx < 0 or totals[idx] < k - len(partial):
            pass  # 无法生成组合。
        else:
            (value, repeat) = pairs[idx]
            max_used = min(repeat, k - len(partial))
            for i in range(1 + max_used):
                partial.append(value)
            for i in range(1 + max_used):
                partial.pop()
                yield from recurse(partial, idx - 1)
    yield from recurse([], len(pairs) - 1)
for x in unique_multisets([0, 1, 2, 2, 4], 2):
    print(x)

希望这可以帮助您理解这段代码。

英文:

Here is code that generates these one at a time in Python.

I'm using iterators. Translating away from that would be the hardest part of translating this to another language.

def unique_multisets (multiset, k):
    # Find the repetition of each item.
    count = {}
    for x in multiset:
        count[x] = 1 + count.get(x, 0)
    # Turn this into (value, count) pairs.  The reversed is
    # because we will reverse it later in generating the multisets.
    pairs = list(reversed(count.items()))
    # Calculate a running total.
    total = 0
    totals = []
    for p in pairs:
        total += p[1]
        totals.append(total)
    def recurse (partial, idx):
        if k == len(partial):
            yield partial
        elif idx &lt; 0 or totals[idx] &lt; k - len(partial):
            pass # Can&#39;t make combinations.
        else:
            (value, repeat) = pairs[idx]
            max_used = min(repeat, k - len(partial))
            for i in range(1 + max_used):
                partial.append(value)
            for i in range(1 + max_used):
                partial.pop()
                yield from recurse(partial, idx - 1)
    yield from recurse([], len(pairs)-1)
for x in unique_multisets([0, 1, 2, 2, 4], 2):
    print(x)

答案3

得分: 1

以下是我为这个任务开发的一个算法：

def multiset_combinations(v, m):
    
    v = sorted(v)
    f = [0]
    j = 0
    # 填充f，一个扩展索引的列表。例如，
    #
    #    v = [2, 2, 33, 33, 33, 40]
    #
    # f将是：
    #
    #    [0, 0, 1, 1, 1, 2]
    
    for i in range(1, len(v)):
        if v[i] != v[i - 1]:
            j += 1
        f.append(j)
    v = list(set(v))
    r = [0] * m
    n = len(v)
    
    z   = f[:m]
    idx = [0] * n
    p   = len(f) - m
    
    last = f[-m:]
    last[-1] += 1
    
    # 找到每个索引的第一个出现。例如，对于上面的示例，idx将是：
    #
    #   [0, 2, 5]
    
    for i in range(n):
        idx[i] = f.index(i)
    
    # 主要算法
    while True:
        while z[-1] < n:
            for j in range(m):
                r[j] = v[z[j]]
            z[-1] += 1
            yield r.copy()
        if z == last:
            break
        
        for i in range(m - 2, -1, -1):
            if z[i] != f:
                z[i] += 1
                
                for j, k in zip(range(i + 1, m),
                                range(idx[z[i]] + 1, idx[z[i]] + m)):
                    z[j] = f[k]
                
                break

调用它：

print(*multiset_combination([0, 1, 2, 2, 4], 2))
#> [0, 1] [0, 2] [0, 4] [1, 2] [1, 4] [2, 2] [2, 4]
print(*multiset_combination([0, 1, 2, 2, 4], 3))
#> [0, 1, 2] [0, 1, 4] [0, 2, 2] [0, 2, 4] [1, 2, 2] [1, 2, 4] [2, 2, 4]
print(*multiset_combination([0, 0, 0, 1, 1, 1], 3))
#> [0, 0, 0] [0, 0, 1] [0, 1, 1] [1, 1, 1]

它也相当高效。它可以在不到一秒的时间内生成753564：

v1 = [1,
      2, 2,
      3, 3, 3,
      4, 4, 4, 4,
      5, 5, 5, 5, 5,
      6,
      7, 7,
      8, 8, 8,
      9, 9, 9, 9,
      10, 10, 10, 10, 10,
      11,
      12, 12,
      13, 13, 13,
      14, 14, 14, 14,
      15, 15, 15, 15, 15]
      
def get_time(v, m):
    t1 = time.time()
    list(multiset_combinations(v, m))
    t2 = time.time()
    return t2 - t1
    
get_time(v1, 10)
#> 0.8702290058135986

英文:

Below is an algorithm I developed some time back for this task:

def multiset_combinations(v, m):
    
    v = sorted(v)
    f = [0]
    j = 0
    # Populating f, a list of expanded indices. E.g. For
    #
    #    v = [2, 2, 33, 33, 33, 40]
    #
    # f would be:
    #
    #    [0, 0, 1, 1, 1, 2]
    
    for i in range(1, len(v)):
        if v[i] != v[i - 1]:
            j += 1
        f.append(j)
    v = list(set(v))
    r = [0] * m
    n = len(v)
    
    z   = f[:m]
    idx = [0] * n
    p   = len(f) - m
    
    last = f[-m:]
    last[-1] += 1
    
    # Find the first occurence of each index. E.g. For the
    # same example above, idx would be:
    #
    #   [0, 2, 5]
    
    for i in range(n):
        idx[i] = f.index(i)
    
    # The main algorithm
    while True:
        while z[-1] &lt; n:
            for j in range(m):
                r[j] = v[z[j]]
            z[-1] += 1
            yield r.copy()
        if z == last:
            break
        
        for i in range(m - 2, -1, -1):
            if z[i] != f:
                z[i] += 1
                
                for j, k in zip(range(i + 1, m),
                                range(idx[z[i]] + 1, idx[z[i]] + m)):
                    z[j] = f[k]
                
                break

Calling it:

print(*multiset_combination([0, 1, 2, 2, 4], 2))
#&gt; [0, 1] [0, 2] [0, 4] [1, 2] [1, 4] [2, 2] [2, 4]
print(*multiset_combination([0, 1, 2, 2, 4], 3))
#&gt; [0, 1, 2] [0, 1, 4] [0, 2, 2] [0, 2, 4] [1, 2, 2] [1, 2, 4] [2, 2, 4]
print(*multiset_combination([0, 0, 0, 1, 1, 1], 3))
#&gt; [0, 0, 0] [0, 0, 1] [0, 1, 1] [1, 1, 1]

It is pretty efficient as well. It can generate 753564 in just under a second:

v1 = [1,
      2, 2,
      3, 3, 3,
      4, 4, 4, 4,
      5, 5, 5, 5, 5,
      6,
      7, 7,
      8, 8, 8,
      9, 9, 9, 9,
      10, 10, 10, 10, 10,
      11,
      12, 12,
      13, 13, 13,
      14, 14, 14, 14,
      15, 15, 15, 15, 15]
      
def get_time(v, m):
    t1 = time.time()
    list(multiset_combinations(v, m))
    t2 = time.time()
    return t2 - t1
    
get_time(v1, 10)
#&gt; 0.8702290058135986

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

没有多重集合的k-组合的通用解决方案？

问题

The problem to solve

The solution I can think of

Questions

@Update - Summary & tips

The problem to solve

The solution I can think of

Questions

@Update - Summary & tips

答案1

答案2

答案3

选择不同对的算法

在将节点按字母顺序插入双向链表时出现的问题

从a到b的数字相乘

析构函数未被调用。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

发表评论