找出数据集中的最大组合数。

huangapple go评论100阅读模式
英文:

Find the maximum number of combinations from a data set

问题

需要编写一个函数,该函数从数据集中找到最大组合数,其中每个子集组合的总和>=目标值。一旦从数据集中取出数字并放入组合中,它们就不能在另一个组合中重复使用。例如,如果data_set = [1,2,3,4,5,6,7,8,9,10]target_value = 5,则组合将是:

  1. 1: [10] 2: [9] 3: [8] 4: [7] 5: [6] 6: [5] 7: [4,1] 8: [3,2]

不好的分组示例将是:

  1. 1: [1,2,3] 2: [4,5] 3: [6] 4: [7] 5: [8] 6: [9]

  1. 1: [1,2,3] 2: [4,5] 3: [6,7,8,9]

我认为一个约束是子集的数量不能大于sum(data_set)/target_value,即如果data_set = [5,5,5,5,5]target_value = 5,则结果应为[5],[5],[5],[5],[5]

为了提供背景信息,我有一大堆物品需要从商店购买,商店在购买金额超过$150时会提供一张优惠券,如果你一次购买所有物品,你将只收到一张优惠券,但如果你将物品拆分成尽可能接近$150的小笔购买,你将获得尽可能多的优惠券。

以下是当前的代码,但显然不是最优化的,我在寻找更好匹配时遇到了困难:

  1. from numpy import random
  2. def get_groups(item_list=[], target=0):
  3. groups = []
  4. def recurse(current_group, remaining_item_list):
  5. for index in range(len(remaining_item_list)):
  6. if sum(current_group) + remaining_item_list[index] < target:
  7. current_group.append(remaining_item_list[index])
  8. if index + 1 == len(remaining_item_list):
  9. groups.append(current_group)
  10. else:
  11. current_group.append(remaining_item_list[index])
  12. groups.append(current_group)
  13. recurse([], remaining_item_list[index + 1:])
  14. break
  15. item_list.sort(reverse=True)
  16. recurse([], item_list)
  17. return groups
  18. items = [random.randint(50) for i in range(21)]
  19. target = 150
  20. groups = get_groups(items, target)
  21. print("Items: {}".format(items))
  22. for index, group in enumerate(groups, start=1):
  23. print("Group {}: {}, total: {}, length: {}".format(index, group, sum(group), len(group)))

编辑:以下是一些更优化的代码,我相信它不是100%正确,但它更好:

  1. from numpy import random
  2. def get_groups(item_list=[], target=0):
  3. groups = []
  4. def recurse(current_group, remaining_item_list):
  5. for index in range(len(remaining_item_list)):
  6. remaining_item_list.sort(reverse=True)
  7. if sum(current_group) + remaining_item_list[index] < target:
  8. current_group.append(remaining_item_list[index])
  9. if index + 1 == len(remaining_item_list):
  10. groups.append(current_group)
  11. elif sum(current_group) + remaining_item_list[index] > target and current_group:
  12. reverse_search(current_group, remaining_item_list)
  13. remaining_item_list.sort(reverse=True)
  14. recurse([], remaining_item_list[index:])
  15. break
  16. else:
  17. current_group.append(remaining_item_list[index])
  18. groups.append(current_group)
  19. recurse([], remaining_item_list[index + 1:])
  20. break
  21. def reverse_search(current_group, remaining_item_list):
  22. for index in range(len(remaining_item_list)):
  23. remaining_item_list.sort()
  24. if sum(current_group) + remaining_item_list[index] < target:
  25. current_group.append(remaining_item_list.pop(index))
  26. if index + 1 == len(remaining_item_list):
  27. groups.append(current_group)
  28. else:
  29. current_group.append(remaining_item_list.pop(index))
  30. groups.append(current_group)
  31. current_group = []
  32. break
  33. recurse([], item_list)
  34. return groups
  35. items = [random.randint(50) for i in range(20)]
  36. target = 150
  37. items.sort(reverse=True)
  38. print("Items: {}".format(items))
  39. groups = get_groups(items, target)
  40. for index, group in enumerate(groups, start=1):
  41. print("Group {}: {}, total: {}, length: {}".format(index, group, sum(group), len(group)))
英文:

I need to write a function that finds the maximum number of combinations from a data set, where the sum of each of the subset combinations are &gt;= a target value. Once numbers are taken from the data_set and put into a combination they cannot be reused in another combination For example
data_set = [1,2,3,4,5,6,7,8,9,10] target_value = 5
the combinations would be
1: [10] 2: [9] 3: [8] 4: [7] 5: [6] 6: [5] 7: [4,1] 8: [3,2]
an example of bad grouping would be
1: [1,2,3] 2: [4,5] 3: [6] 4: [7] 5: [8] 6: [9]
or 1: [1,2,3] 2: [4,5] 3: [6,7,8,9].
I believe a constraint to be the number of subsets cannot be greater than sum(data_set)/target_value i.e. data_set = [5,5,5,5,5] target_value = 5 yields [5],[5],[5],[5],[5].

For context I have a large set of items I need to buy from a store, the store gives a coupon every time your purchase is over $150, if you buy everything at once you receive one coupon but if you split your items into small purchases as close to $150 as possible you will receive the maximum coupons possible.

This is the current code but it is obviously not optimized I'm having trouble scanning ahead for better matches

  1. from numpy import random
  2. def get_groups(item_list=[], target=0):
  3. groups = []
  4. def recurse(current_group, remaining_item_list):
  5. for index in range(len(remaining_item_list)):
  6. if sum(current_group) + remaining_item_list[index] &lt; target:
  7. current_group.append(remaining_item_list[index])
  8. if index+1 == len(remaining_item_list):
  9. groups.append(current_group)
  10. else:
  11. current_group.append(remaining_item_list[index])
  12. groups.append(current_group)
  13. recurse([], remaining_item_list[index+1:])
  14. break
  15. item_list.sort(reverse=True)
  16. recurse([], item_list)
  17. return groups
  18. items = [ random.randint(50) for i in range(21)]
  19. target = 150
  20. groups = get_groups(items, target)
  21. print(&quot;Items: {}&quot;.format(items))
  22. for index, group in enumerate(groups, start=1):
  23. print(&quot;Group {}: {}, total: {}, length: {}&quot;.format(index, group, sum(group), len(group)))

EDIT: here is some much more optimized code I am sure it is not 100% correct but its better

  1. from numpy import random
  2. def get_groups(item_list=[], target=0):
  3. groups = []
  4. def recurse(current_group, remaining_item_list):
  5. for index in range(len(remaining_item_list)):
  6. remaining_item_list.sort(reverse=True)
  7. if sum(current_group) + remaining_item_list[index] &lt; target:
  8. current_group.append(remaining_item_list[index])
  9. if index+1 == len(remaining_item_list):
  10. groups.append(current_group)
  11. elif sum(current_group) + remaining_item_list[index] &gt; target and current_group:
  12. reverse_search(current_group, remaining_item_list)
  13. remaining_item_list.sort(reverse=True)
  14. recurse([], remaining_item_list[index:])
  15. break
  16. else:
  17. current_group.append(remaining_item_list[index])
  18. groups.append(current_group)
  19. recurse([], remaining_item_list[index+1:])
  20. break
  21. def reverse_search(current_group, remaining_item_list):
  22. for index in range(len(remaining_item_list)):
  23. remaining_item_list.sort()
  24. if sum(current_group) + remaining_item_list[index] &lt; target:
  25. current_group.append(remaining_item_list.pop(index))
  26. if index+1 == len(remaining_item_list):
  27. groups.append(current_group)
  28. else:
  29. current_group.append(remaining_item_list.pop(index))
  30. groups.append(current_group)
  31. current_group = []
  32. break
  33. recurse([], item_list)
  34. return groups
  35. items = [ random.randint(50) for i in range(20)]
  36. target = 150
  37. items.sort(reverse=True)
  38. print(&quot;Items: {}&quot;.format(items))
  39. groups = get_groups(items, target)
  40. for index, group in enumerate(groups, start=1):
  41. print(&quot;Group {}: {}, total: {}, length: {}&quot;.format(index, group, sum(group), len(group)))```
  42. </details>
  43. # 答案1
  44. **得分**: 1
  45. 以下是翻译好的代码部分:
  46. ```python
  47. 以下是翻译好的代码部分:
  48. from bisect import bisect_right
  49. from itertools import combinations
  50. def bonusGroups(numbers, target):
  51. numbers = sorted(numbers)
  52. nbGroups = sum(numbers) // target
  53. if not nbGroups:
  54. return [numbers]
  55. groups = [[] for _ in range(nbGroups)]
  56. # build initial groups
  57. for ig, g in enumerate(groups[:-1]):
  58. gap = target - sum(g)
  59. while numbers and gap > 0:
  60. i = max(0, bisect_right(numbers, gap) - 1)
  61. g.append(numbers.pop(i))
  62. gap -= g[-1]
  63. while g and min(g) <= target - sum(g):
  64. i = g.index(min(g))
  65. groups[ig + 1].append(g.pop(i))
  66. groups[-1].extend(numbers)
  67. # last group reaches target, no spreading required
  68. if sum(groups[-1]) >= target:
  69. return groups
  70. return spreadGroups([*filter(None, groups)], target)
  71. # The optimization function will try to remove items from the first groups and replace them with items of subsequent groups (starting from last). The items are selected and matched up in a way that increase the subsequent group's sum without making the first group go below the target. To keep this efficient, the logic starts by removing as few items as possible and increases the removal combination size progressively until the new spread makes all groups reach the target (or removal size exceeds the largest group's size):
  72. def spreadGroups(groups, target):
  73. groups.sort(key=sum, reverse=True)
  74. spreading = True
  75. remSize = 1
  76. while spreading and any(sum(g) < target for g in groups):
  77. spreading = False
  78. for ig, g in enumerate(groups[:-1]):
  79. if remSize >= len(g):
  80. continue
  81. extra = sum(g) - target
  82. if not extra:
  83. continue
  84. remSums = {sum(g[i] for i in idx): idx
  85. for idx in combinations(range(len(g)), remSize)}
  86. for subSum, indexes in remSums.items():
  87. for tg in groups[-1:ig:-1]:
  88. idx = matchSum(tg, max(0, subSum - extra), subSum - 1)
  89. if not idx and subSum > extra:
  90. continue
  91. g.extend(tg.pop(i) for i in sorted(idx, reverse=True))
  92. tg.extend(g.pop(i) for i in sorted(indexes, reverse=True))
  93. if remSize > 1:
  94. ig, remSize = -1, 1
  95. groups[ig + 1:] = sorted(groups[ig + 1:], key=sum, reverse=True)
  96. spreading = True
  97. break
  98. if spreading:
  99. break
  100. if spreading:
  101. break
  102. if not spreading and remSize < max(map(len, groups)) - 1:
  103. remSize += 1
  104. spreading = True
  105. groups.sort(key=sum, reverse=True)
  106. return groups
  107. # In order to optimally match the total value of removed items, the target "swapping" group needs a combination of item (indexes) that adds up to the smallest total that is within the specified range. By getting the combination with the smallest eligible total, the reduction of the source group's total is maximized and excess item values are progressively pushed down to the last groups. This function produces a list of indexes that meet that criteria:
  108. # find group indexes producing smallest sum in range
  109. def matchSum(g, minSum, maxSum):
  110. gsums = {0: []}
  111. for i, n in enumerate(g):
  112. newSums = {s + n: [*ss, i] for s, ss in gsums.items() if s + n <= maxSum}
  113. gsums.update(newSums)
  114. sumVal = min(gsums, key=lambda s: s if s >= minSum else maxSum + 1)
  115. return gsums.get(sumVal, [])
  116. # Testing:
  117. items = [random.randint(50) for i in range(21)]
  118. target = 150
  119. groups = bonusGroups(items, target)
  120. print("Items: {}".format(items))
  121. for index, group in enumerate(groups, start=1):
  122. print("Group {}: {}, total: {}, length: {}".format(index, group, sum(group), len(group)))

希望这能帮助到你!如果需要进一步的翻译或解释,请告诉我。

英文:

The following approach does not guarantee an optimal solution but it will produce one in almost all cases.

The main function starts by determining the maximum number of groups that can be formed and builds the groups by adding the largest item value that fits within the target boundary. This will produce groups that only exceed the target value by a portion of their smallest item. If the last group formed reaches the target value, then, all preceding groups also reach it and there is no additional optimization needed. If the last group does not reach the target, then an optimization function is called to swap items between groups to spread some of the extras down to the last group(s).

  1. from bisect import bisect_right
  2. from itertools import combinations
  3. def bonusGroups(numbers,target):
  4. numbers = sorted(numbers)
  5. nbGroups = sum(numbers)//target
  6. if not nbGroups: return [numbers]
  7. groups = [[] for _ in range(nbGroups)]
  8. # build initial groups
  9. for ig,g in enumerate(groups[:-1]):
  10. gap = target - sum(g)
  11. while numbers and gap &gt; 0:
  12. i = max(0,bisect_right(numbers,gap) - 1)
  13. g.append(numbers.pop(i))
  14. gap -= g[-1]
  15. while g and min(g) &lt;= target-sum(g):
  16. i = g.index(min(g))
  17. groups[ig+1].append(g.pop(i))
  18. groups[-1].extend(numbers)
  19. # last group reaches target, no spreading required
  20. if sum(groups[-1]) &gt;= target:
  21. return groups
  22. return spreadGroups([*filter(None,groups)],target)

The optimization function will try to remove items from the first groups and replace them with items of subsequent groups (starting from last). The items are selected and matched up in a way that increase the subsequent group's sum without making the first group go below the target. To keep this efficient, the logic starts by removing as few items as possible and increases the removal combination size progressively until the new spread makes all groups reach the target (or removal size exceeds the largest group's size):

  1. def spreadGroups(groups,target):
  2. groups.sort(key=sum,reverse=True)
  3. spreading = True
  4. remSize = 1
  5. while spreading and any(sum(g) &lt; target for g in groups):
  6. spreading = False
  7. for ig,g in enumerate(groups[:-1]):
  8. if remSize &gt;= len(g): continue
  9. extra = sum(g)-target
  10. if not extra: continue
  11. remSums = { sum(g[i] for i in idx):idx
  12. for idx in combinations(range(len(g)),remSize) }
  13. for subSum,indexes in remSums.items():
  14. for tg in groups[-1:ig:-1]:
  15. idx = matchSum(tg,max(0,subSum-extra),subSum-1)
  16. if not idx and subSum&gt;extra: continue
  17. g.extend(tg.pop(i) for i in sorted(idx,reverse=True))
  18. tg.extend(g.pop(i) for i in sorted(indexes,reverse=True))
  19. if remSize&gt;1: ig,remSize = -1,1
  20. groups[ig+1:] = sorted(groups[ig+1:],key=sum,reverse=True)
  21. spreading = True
  22. break
  23. if spreading: break
  24. if spreading: break
  25. if not spreading and remSize &lt; max(map(len,groups))-1:
  26. remSize += 1
  27. spreading = True
  28. groups.sort(key=sum,reverse=True)
  29. return groups

In order to optimally match the total value of removed items, the target "swapping" group needs a combination of item (indexes) that adds up to the smallest total that is within the specified range. By getting the combination with the smallest eligible total, the reduction of the source group's total is maximized and excess item values are progressively pushed down to the last groups. This function produces a list of indexes that meet that criteria:

  1. # find group indexes producing smallest sum in range
  2. def matchSum(g,minSum,maxSum):
  3. gsums = {0:[]}
  4. for i,n in enumerate(g):
  5. newSums = { s+n:[*ss,i] for s,ss in gsums.items() if s+n&lt;=maxSum }
  6. gsums.update(newSums)
  7. sumVal = min(gsums,key=lambda s:s if s&gt;=minSum else maxSum+1)
  8. return gsums.get(sumVal,[])

Testing:

  1. items = [ random.randint(50) for i in range(21)]
  2. target = 150
  3. groups = bonusGroups(items, target)
  4. print(&quot;Items: {}&quot;.format(items))
  5. for index, group in enumerate(groups, start=1):
  6. print(&quot;Group {}: {}, total: {}, length: {}&quot;.format(index, group, sum(group), len(group)))

Test outputs:

  1. Items: [38, 38, 31, 9, 42, 25, 22, 42, 44, 46, 43, 47, 14, 20, 38, 34, 35, 3, 49, 36, 30]
  2. Group 1: [49, 47, 46, 3, 9], total: 154, length: 5
  3. Group 2: [44, 43, 42, 20, 14], total: 163, length: 5
  4. Group 3: [42, 38, 38, 31, 22], total: 171, length: 5
  5. Group 4: [25, 30, 34, 35, 36, 38], total: 198, length: 6
  6. Items: [34, 5, 13, 30, 27, 15, 4, 44, 33, 13, 25, 41, 33, 23, 14, 8, 39, 9, 33, 8, 4]
  7. Group 1: [44, 41, 39, 25, 4], total: 153, length: 5
  8. Group 2: [34, 33, 33, 33, 15, 4], total: 152, length: 6
  9. Group 3: [5, 8, 8, 9, 13, 13, 14, 23, 27, 30], total: 150, length: 10
  10. Items: [25, 22, 44, 42, 43, 22, 25, 16, 42, 24, 21, 5, 4, 16, 3, 22, 46, 31, 11, 24, 14]
  11. Group 1: [46, 44, 43, 16, 3], total: 152, length: 5
  12. Group 2: [42, 42, 31, 25, 5, 4, 11], total: 160, length: 7
  13. Group 3: [14, 16, 21, 22, 22, 22, 24, 24, 25], total: 190, length: 9

答案2

得分: 0

以下是优化后的方法,它按降序对数组进行排序,然后尝试构建尽可能大的子集,然后再移动到下一个子集。如果下一个项目太大,它会继续下一个。它还包括动态规划,减少了所进行的检查数量。

  1. from typing import List
  2. def get_groups(item_list: List[int], target: int) -> List[List[int]]:
  3. groups = []
  4. remainder = []
  5. # 降序排序列表
  6. item_list.sort(reverse=True)
  7. while item_list:
  8. group = []
  9. for item in item_list[:]:
  10. if sum(group) + item <= target:
  11. group.append(item)
  12. item_list.remove(item)
  13. if group:
  14. groups.append(group)
  15. else:
  16. remainder = item_list
  17. break
  18. groups.append(remainder)
  19. return groups
  20. # 随机生成器已被静态列表替代,以保证可重现性
  21. items = [1,2,3,4,5,6,7,8,9,10]
  22. target = 5
  23. groups = get_groups(items, target)
  24. print("Items: {}".format(items))
  25. for index, group in enumerate(groups, start=1):
  26. print("Group {}: {}, total: {}, length: {}".format(index, group, sum(group), len(group)))

注意:上述代码已经翻译成中文,只包括代码部分。

英文:

Here's an optimized approach which sorts the array in descending order, then tries to build the maximum subset possible before moving to the next subset. If the next item is too large, it moves on to the next one. It also incorporates dynamic programming, reducing the number of checks made.

  1. from typing import List
  2. def get_groups(item_list: List[int], target: int) -&gt; List[List[int]]:
  3. groups = []
  4. remainder = []
  5. # sort list in descending order
  6. item_list.sort(reverse=True)
  7. while item_list:
  8. group = []
  9. for item in item_list[:]:
  10. if sum(group) + item &lt;= target:
  11. group.append(item)
  12. item_list.remove(item)
  13. if group:
  14. groups.append(group)
  15. else:
  16. remainder = item_list
  17. break
  18. groups.append(remainder)
  19. return groups
  20. # random generator is replaced with a static list for reproducibility
  21. items = [1,2,3,4,5,6,7,8,9,10]
  22. target = 5
  23. groups = get_groups(items, target)
  24. print(&quot;Items: {}&quot;.format(items))
  25. for index, group in enumerate(groups, start=1):
  26. print(&quot;Group {}: {}, total: {}, length: {}&quot;.format(index, group, sum(group), len(group)))

huangapple
  • 本文由 发表于 2023年6月29日 03:12:45
  • 转载请务必保留本文链接:https://go.coder-hub.com/76576089.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定