英文:
No general solution to k-combination of multiset?
问题
以下是您要翻译的内容:
The problem to solve
解决可能包含重复项的 k-组合问题,并仅返回唯一的组合。
例如,输入为 [0,1,2,2,4]
,k = 2
,结果为:
{(0, 1), (2, 4), (1, 2), (0, 4), (1, 4), (0, 2), (2, 2)}
The solution I can think of
我能想到的唯一通用解决方案是使用 set
或 map
来提取唯一结果。
例如,在Python中:
from itertools import combinations
print(set(combinations([0, 1, 2, 2, 4], 2)))
Questions
- 是否有一种通用解决方案可以在不使用 set/map 进行过滤的情况下完成此操作。
我可以生成按顺序的唯一项组合(参考:https://stackoverflow.com/a/76048486),但如果存在重复项,则会有重复的组合。
我阅读了这个:https://math.stackexchange.com/a/554237,似乎它说没有一种通用解决方案可以获取所有唯一的组合,尽管有一种获得唯一组合数量的公式。
但是,我不确定。
@Update - Summary & tips
- 根据答案,可以通过迭代和递归两种方式来完成此操作。
- 从
python
翻译为go
时:- 在Python中的
yield
需要在Go中使用chan
,或者需要返回所有组合(对于大量输入可能很大)。 - Python中的
else
用于for
和while
循环,当使用break
时在Go中会省略。 - Python中的列表
切片
会创建一个副本(或引用?),而在Go中切片一个切片将重用相同的底层数组(因此在Go中,如果要复制的切片稍后将被修改,而且不希望在副本中出现该更改,则应将其复制/附加到新切片中)。
- 在Python中的
英文:
The problem to solve
Solve k-combination whose input may contain repeated items, and return only unique combinations.
e.g input is [0,1,2,2,4]
, k = 2
, the result is:
> {(0, 1), (2, 4), (1, 2), (0, 4), (1, 4), (0, 2), (2, 2)}
The solution I can think of
The only general solution I can think is to use a set
or map
to extract unique results.
e.g with python
<!-- language: python -->
from itertools import combinations
print (set(combinations([0, 1, 2, 2, 4], 2)))
Questions
- Is there a general solution to do this, without using a set/map for filtering.
I can generate combinations of unique items, in order (refer: https://stackoverflow.com/a/76048486), but if there are repeated items, then there will be repeated combinations.
I've read this: https://math.stackexchange.com/a/554237 , seems it says there is no general solution to get all unique combinations, though there do have a formula to get count of unique combinations.
But, I'm not sure.
@Update - Summary & tips
- According to the answers, there are ways both via iteration & recursion to do this.
- When translating from
python
togo
:yield
in python, needchan
in go, or u need return all combinations (which can be huge for large input).- python's
else
forfor
andwhile
is omitted whenbreak
. - list
slicing
in python will make a copy (of reference ?), while slicing a go slice will reuse the same underlying array, (so in go, if the slice being copied will be modified later, and u don't want that change in the copy, then u should copy/append it into a new slice).
答案1
得分: 3
以下是提供的Python代码的中文翻译:
这是一个按词典顺序优化简单性而不是速度的非递归组合枚举器。Knuth的《计算机程序设计艺术》几乎肯定有一个速度优化的版本。
def combinations(a, k):
a = sorted(a)
while True:
yield a[:k]
for i in range(k - 1, -1, -1):
# a[i + 1 :] 已排序。对 a[i:] 进行排序。
x = a[i]
j = i + 1
while j < len(a):
if x < a[j]:
break
a[j - 1] = a[j]
j += 1
a[j - 1] = x
# a[j] 是 x 之后的下一个最大元素。如果后面有足够多的元素,将它们旋转到适当位置。
if j + (k - i) <= len(a):
rotate(a, i, j, j + (k - i))
break
else:
break
def rotate(a, i, j, k):
reverse(a, i, j)
reverse(a, j, k)
reverse(a, i, k)
def reverse(a, i, k):
for j in range((k - i) // 2):
a[i + j], a[(k - 1) - j] = a[(k - 1) - j], a[i + j]
print(*combinations([0, 1, 2, 2, 4], 2))
print(*combinations([0, 1, 2, 2, 4], 3))
print(*combinations([0, 0, 0, 1, 1, 1], 3))
输出:
[0, 1] [0, 2] [0, 4] [1, 2] [1, 4] [2, 2] [2, 4]
[0, 1, 2] [0, 1, 4] [0, 2, 2] [0, 2, 4] [1, 2, 2] [1, 2, 4] [2, 2, 4]
[0, 0, 0] [0, 0, 1] [0, 1, 1] [1, 1, 1]
英文:
Here’s a non-recursive enumerator for combinations in lexicographic order, optimized for simplicity over speed. Knuth’s The Art of Computer Programming almost certainly has a speed-optimized version.
def combinations(a, k):
a = sorted(a)
while True:
yield a[:k]
for i in range(k - 1, -1, -1):
# a[i + 1 :] is sorted. Sort a[i:].
x = a[i]
j = i + 1
while j < len(a):
if x < a[j]:
break
a[j - 1] = a[j]
j += 1
a[j - 1] = x
# a[j] is the next greatest element after x. If there are enough
# elements after it, rotate them into position.
if j + (k - i) <= len(a):
rotate(a, i, j, j + (k - i))
break
else:
break
def rotate(a, i, j, k):
reverse(a, i, j)
reverse(a, j, k)
reverse(a, i, k)
def reverse(a, i, k):
for j in range((k - i) // 2):
a[i + j], a[(k - 1) - j] = a[(k - 1) - j], a[i + j]
print(*combinations([0, 1, 2, 2, 4], 2))
print(*combinations([0, 1, 2, 2, 4], 3))
print(*combinations([0, 0, 0, 1, 1, 1], 3))
Output:
[0, 1] [0, 2] [0, 4] [1, 2] [1, 4] [2, 2] [2, 4]
[0, 1, 2] [0, 1, 4] [0, 2, 2] [0, 2, 4] [1, 2, 2] [1, 2, 4] [2, 2, 4]
[0, 0, 0] [0, 0, 1] [0, 1, 1] [1, 1, 1]
答案2
得分: 1
以下是使用Python生成这些内容的代码。
我正在使用迭代器。将其转化为另一种语言可能是翻译这个过程中最困难的部分。
def unique_multisets(multiset, k):
# 查找每个项目的重复次数。
count = {}
for x in multiset:
count[x] = 1 + count.get(x, 0)
# 将其转化为(值,重复次数)对。reverse是因为我们稍后会在生成多重集时将其反转。
pairs = list(reversed(count.items()))
# 计算累积总数。
total = 0
totals = []
for p in pairs:
total += p[1]
totals.append(total)
def recurse(partial, idx):
if k == len(partial):
yield partial
elif idx < 0 or totals[idx] < k - len(partial):
pass # 无法生成组合。
else:
(value, repeat) = pairs[idx]
max_used = min(repeat, k - len(partial))
for i in range(1 + max_used):
partial.append(value)
for i in range(1 + max_used):
partial.pop()
yield from recurse(partial, idx - 1)
yield from recurse([], len(pairs) - 1)
for x in unique_multisets([0, 1, 2, 2, 4], 2):
print(x)
希望这可以帮助您理解这段代码。
英文:
Here is code that generates these one at a time in Python.
I'm using iterators. Translating away from that would be the hardest part of translating this to another language.
def unique_multisets (multiset, k):
# Find the repetition of each item.
count = {}
for x in multiset:
count[x] = 1 + count.get(x, 0)
# Turn this into (value, count) pairs. The reversed is
# because we will reverse it later in generating the multisets.
pairs = list(reversed(count.items()))
# Calculate a running total.
total = 0
totals = []
for p in pairs:
total += p[1]
totals.append(total)
def recurse (partial, idx):
if k == len(partial):
yield partial
elif idx < 0 or totals[idx] < k - len(partial):
pass # Can't make combinations.
else:
(value, repeat) = pairs[idx]
max_used = min(repeat, k - len(partial))
for i in range(1 + max_used):
partial.append(value)
for i in range(1 + max_used):
partial.pop()
yield from recurse(partial, idx - 1)
yield from recurse([], len(pairs)-1)
for x in unique_multisets([0, 1, 2, 2, 4], 2):
print(x)
答案3
得分: 1
以下是我为这个任务开发的一个算法:
def multiset_combinations(v, m):
v = sorted(v)
f = [0]
j = 0
# 填充f,一个扩展索引的列表。例如,
#
# v = [2, 2, 33, 33, 33, 40]
#
# f将是:
#
# [0, 0, 1, 1, 1, 2]
for i in range(1, len(v)):
if v[i] != v[i - 1]:
j += 1
f.append(j)
v = list(set(v))
r = [0] * m
n = len(v)
z = f[:m]
idx = [0] * n
p = len(f) - m
last = f[-m:]
last[-1] += 1
# 找到每个索引的第一个出现。例如,对于上面的示例,idx将是:
#
# [0, 2, 5]
for i in range(n):
idx[i] = f.index(i)
# 主要算法
while True:
while z[-1] < n:
for j in range(m):
r[j] = v[z[j]]
z[-1] += 1
yield r.copy()
if z == last:
break
for i in range(m - 2, -1, -1):
if z[i] != f:
z[i] += 1
for j, k in zip(range(i + 1, m),
range(idx[z[i]] + 1, idx[z[i]] + m)):
z[j] = f[k]
break
调用它:
print(*multiset_combination([0, 1, 2, 2, 4], 2))
#> [0, 1] [0, 2] [0, 4] [1, 2] [1, 4] [2, 2] [2, 4]
print(*multiset_combination([0, 1, 2, 2, 4], 3))
#> [0, 1, 2] [0, 1, 4] [0, 2, 2] [0, 2, 4] [1, 2, 2] [1, 2, 4] [2, 2, 4]
print(*multiset_combination([0, 0, 0, 1, 1, 1], 3))
#> [0, 0, 0] [0, 0, 1] [0, 1, 1] [1, 1, 1]
它也相当高效。它可以在不到一秒的时间内生成753564
:
v1 = [1,
2, 2,
3, 3, 3,
4, 4, 4, 4,
5, 5, 5, 5, 5,
6,
7, 7,
8, 8, 8,
9, 9, 9, 9,
10, 10, 10, 10, 10,
11,
12, 12,
13, 13, 13,
14, 14, 14, 14,
15, 15, 15, 15, 15]
def get_time(v, m):
t1 = time.time()
list(multiset_combinations(v, m))
t2 = time.time()
return t2 - t1
get_time(v1, 10)
#> 0.8702290058135986
英文:
Below is an algorithm I developed some time back for this task:
def multiset_combinations(v, m):
v = sorted(v)
f = [0]
j = 0
# Populating f, a list of expanded indices. E.g. For
#
# v = [2, 2, 33, 33, 33, 40]
#
# f would be:
#
# [0, 0, 1, 1, 1, 2]
for i in range(1, len(v)):
if v[i] != v[i - 1]:
j += 1
f.append(j)
v = list(set(v))
r = [0] * m
n = len(v)
z = f[:m]
idx = [0] * n
p = len(f) - m
last = f[-m:]
last[-1] += 1
# Find the first occurence of each index. E.g. For the
# same example above, idx would be:
#
# [0, 2, 5]
for i in range(n):
idx[i] = f.index(i)
# The main algorithm
while True:
while z[-1] < n:
for j in range(m):
r[j] = v[z[j]]
z[-1] += 1
yield r.copy()
if z == last:
break
for i in range(m - 2, -1, -1):
if z[i] != f:
z[i] += 1
for j, k in zip(range(i + 1, m),
range(idx[z[i]] + 1, idx[z[i]] + m)):
z[j] = f[k]
break
Calling it:
print(*multiset_combination([0, 1, 2, 2, 4], 2))
#> [0, 1] [0, 2] [0, 4] [1, 2] [1, 4] [2, 2] [2, 4]
print(*multiset_combination([0, 1, 2, 2, 4], 3))
#> [0, 1, 2] [0, 1, 4] [0, 2, 2] [0, 2, 4] [1, 2, 2] [1, 2, 4] [2, 2, 4]
print(*multiset_combination([0, 0, 0, 1, 1, 1], 3))
#> [0, 0, 0] [0, 0, 1] [0, 1, 1] [1, 1, 1]
It is pretty efficient as well. It can generate 753564
in just under a second:
v1 = [1,
2, 2,
3, 3, 3,
4, 4, 4, 4,
5, 5, 5, 5, 5,
6,
7, 7,
8, 8, 8,
9, 9, 9, 9,
10, 10, 10, 10, 10,
11,
12, 12,
13, 13, 13,
14, 14, 14, 14,
15, 15, 15, 15, 15]
def get_time(v, m):
t1 = time.time()
list(multiset_combinations(v, m))
t2 = time.time()
return t2 - t1
get_time(v1, 10)
#> 0.8702290058135986
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论