Pulp匹配算法替换贪婪算法

huangapple go评论123阅读模式
英文:

Pulp Matching algorithm to replace greedy algo

问题

你的代码存在一些问题。我会为你指出这些问题:

  1. 在你的第一个尝试中,你创建了一个LP问题,但没有明确定义你的目标函数。你应该定义一个目标函数来最大化或最小化某些值。在这种情况下,你似乎想要最大化某些值,但你的目标函数是 prob += lpSum(variables[i] for i in range(len(dataset))),这只是将所有变量的总和添加到目标函数中。你需要考虑如何构建目标函数来实现你的匹配目标。

  2. 你的约束也有问题。例如,你的 "Total weight constraint" 是 prob += lpSum(variables[i] * dataset.loc[i, 'weight'] for i in range(len(dataset))) <= user_vars[user_id]['total_weight'],但这个约束不会正确地将权重与用户匹配。你应该将变量与数据集中的用户相关联,以便在匹配时考虑权重。

  3. 在第二个尝试中,你的目标函数仍然不明确,而且在约束中仍然存在问题。

为了正确解决这个问题,你需要重新考虑如何定义目标函数和约束,以便实现你的匹配目标。你可能需要重新设计你的模型,以确保它正确地表示了你的问题。这可能需要一些重新思考和重写代码。如果你需要更多的帮助,可以提供更多详细信息,我会尽力提供更多指导。

英文:

I am trying to create a matching algorithm using pulp but the results for the sample data I'm getting are wrong as I think the function is flawed.

Sample data:

  1. users = {
  2. 1: (5.0, 4.0, 1.0, 2, 1, 1),
  3. 2: (8.0, 6.0, 2.0, 3, 2, 1)
  4. }
  5. dataset = pd.DataFrame([
  6. {&#39;id&#39;: 1, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 1},
  7. {&#39;id&#39;: 2, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 2},
  8. {&#39;id&#39;: 3, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 3},
  9. {&#39;id&#39;: 4, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 3},
  10. {&#39;id&#39;: 5, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 4},
  11. {&#39;id&#39;: 6, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 6},
  12. {&#39;id&#39;: 7, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 7},
  13. {&#39;id&#39;: 8, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 8},
  14. {&#39;id&#39;: 9, &#39;group&#39;: &#39;B&#39;, &#39;weight&#39;: 2},
  15. {&#39;d&#39;: 10, &#39;group&#39;: &#39;B&#39;, &#39;weight&#39;: 1}
  16. ])

I would like to match different ids to users (without repetition). For each user I have a total weight, group A weight, group B weight, unique id count, group A unique id count, group B unique id count.

For the sample above the correct answer should be:

  1. {&#39;id&#39;: 5, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 4, &#39;user_id&#39;: 1}
  2. {&#39;id&#39;: 10, &#39;group&#39;: &#39;B&#39;, &#39;weight&#39;: 1, &#39;user_id&#39;: 1}
  3. {&#39;id&#39;: 3, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 3, &#39;user_id&#39;: 2}
  4. {&#39;id&#39;: 4, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 3, &#39;user_id&#39;: 2}
  5. {&#39;id&#39;: 9, &#39;group&#39;: &#39;B&#39;, &#39;weight&#39;: 2, &#39;user_id&#39;: 2}

My first attempt:

  1. from pulp import *
  2. import pandas as pd
  3. from itertools import product
  4. def match_weights(users, dataset):
  5. matched_rows = []
  6. variables = LpVariable.dicts(&quot;Item&quot;, range(len(dataset)), lowBound=0, cat=&#39;Binary&#39;)
  7. user_vars = {}
  8. for user_id, (total_weight, group_a_weight, group_b_weight, total_unique_users, group_a_unique_users, group_b_unique_users) in users.items():
  9. user_vars[user_id] = {}
  10. user_vars[user_id][&#39;total_weight&#39;] = LpVariable(&quot;TotalWeight_{}&quot;.format(user_id), lowBound=0, upBound=total_weight)
  11. user_vars[user_id][&#39;group_a_weight&#39;] = LpVariable(&quot;GroupAWeight_{}&quot;.format(user_id), lowBound=0, upBound=group_a_weight)
  12. user_vars[user_id][&#39;group_b_weight&#39;] = LpVariable(&quot;GroupBWeight_{}&quot;.format(user_id), lowBound=0, upBound=group_b_weight)
  13. user_vars[user_id][&#39;total_unique_users&#39;] = LpVariable(&quot;TotalUniqueUsers_{}&quot;.format(user_id), lowBound=0, upBound=total_unique_users, cat=&#39;Integer&#39;)
  14. user_vars[user_id][&#39;group_a_unique_users&#39;] = LpVariable(&quot;GroupAUniqueUsers_{}&quot;.format(user_id), lowBound=0, upBound=group_a_unique_users, cat=&#39;Integer&#39;)
  15. user_vars[user_id][&#39;group_b_unique_users&#39;] = LpVariable(&quot;GroupBUniqueUsers_{}&quot;.format(user_id), lowBound=0, upBound=group_b_unique_users, cat=&#39;Integer&#39;)
  16. prob = LpProblem(&quot;MatchingProblem&quot;, LpMaximize)
  17. prob += lpSum(variables[i] for i in range(len(dataset)))
  18. for user_id, (total_weight, group_a_weight, group_b_weight, total_unique_users, group_a_unique_users, group_b_unique_users) in users.items():
  19. group_a_items = dataset[dataset[&#39;group&#39;] == &#39;A&#39;].index.tolist()
  20. group_b_items = dataset[dataset[&#39;group&#39;] == &#39;B&#39;].index.tolist()
  21. # Total weight constraint
  22. prob += lpSum(variables[i] * dataset.loc[i, &#39;weight&#39;] for i in range(len(dataset))) &lt;= user_vars[user_id][&#39;total_weight&#39;]
  23. # Group A weight constraint
  24. prob += lpSum(variables[i] * dataset.loc[i, &#39;weight&#39;] for i in group_a_items) &lt;= user_vars[user_id][&#39;group_a_weight&#39;]
  25. # Group B weight constraint
  26. prob += lpSum(variables[i] * dataset.loc[i, &#39;weight&#39;] for i in group_b_items) &lt;= user_vars[user_id][&#39;group_b_weight&#39;]
  27. # Total unique user constraint
  28. unique_users = set()
  29. for i in range(len(dataset)):
  30. if variables[i].value() == 1:
  31. unique_users.add(dataset.loc[i, &#39;id&#39;])
  32. prob += lpSum(1 for u in unique_users) &lt;= user_vars[user_id][&#39;total_unique_users&#39;]
  33. # Group A unique user constraint
  34. unique_users_a = set()
  35. for i in group_a_items:
  36. if variables[i].value() == 1:
  37. unique_users_a.add(dataset.loc[i, &#39;id&#39;])
  38. prob += lpSum(1 for u in unique_users_a) &lt;= user_vars[user_id][&#39;group_a_unique_users&#39;]
  39. # Group B unique user constraint
  40. unique_users_b = set()
  41. for i in group_b_items:
  42. if variables[i].value() == 1:
  43. unique_users_b.add(dataset.loc[i, &#39;id&#39;])
  44. prob += lpSum(1 for u in unique_users_b) &lt;= user_vars[user_id][&#39;group_b_unique_users&#39;]
  45. prob.solve()
  46. for user_id, (total_weight, group_a_weight, group_b_weight, total_unique_users, group_a_unique_users, group_b_unique_users) in users.items():
  47. matched_user_rows = []
  48. for i in range(len(dataset)):
  49. if variables[i].value() == 1:
  50. matched_row = dataset.loc[i].copy()
  51. matched_row[&#39;user_id&#39;] = user_id
  52. matched_user_rows.append(matched_row)
  53. matched_rows.extend(matched_user_rows)
  54. return matched_rows

However the results are:

  1. {1: {&#39;group_a&#39;: [2], &#39;group_b&#39;: [10]}, 2: {&#39;group_a&#39;: [2], &#39;group_b&#39;: [10]}}

Looks like my results might overwrite each other but also look wrong.

I tried to rewrite it and got similar incorrect results:

  1. def match_weights(users, dataset):
  2. model = LpProblem(&quot;MatchingProblem&quot;, LpMaximize)
  3. variables = LpVariable.dicts(&quot;Item&quot;, dataset.index, lowBound=0, cat=&#39;Binary&#39;)
  4. model += lpSum(variables[i] for i in dataset.index)
  5. # Add constraints for each user
  6. for user_id, (total_weight, group_a_weight, group_b_weight, _, _, _) in users.items():
  7. # Filter dataset based on user group
  8. group_a_indices = dataset[dataset[&#39;group&#39;] == &#39;A&#39;].index
  9. group_b_indices = dataset[dataset[&#39;group&#39;] == &#39;B&#39;].index
  10. # Total weight constraint
  11. model += lpSum(variables[i] * dataset.loc[i, &#39;weight&#39;] for i in dataset.index) &lt;= total_weight
  12. # Group A weight constraint
  13. model += lpSum(variables[i] * dataset.loc[i, &#39;weight&#39;] for i in group_a_indices) &lt;= group_a_weight
  14. # Group B weight constraint
  15. model += lpSum(variables[i] * dataset.loc[i, &#39;weight&#39;] for i in group_b_indices) &lt;= group_b_weight
  16. unique_user_set = set(dataset[&#39;respondent_id&#39;])
  17. for user_id, (total_weight, _, _, total_unique_users, group_a_unique_users, group_b_unique_users) in users.items():
  18. group_a_indices = dataset[dataset[&#39;group&#39;] == &#39;A&#39;].index
  19. group_b_indices = dataset[dataset[&#39;group&#39;] == &#39;B&#39;].index
  20. # Total unique users constraint
  21. model += lpSum(variables[i] for i in dataset.index if dataset.loc[i, &#39;respondent_id&#39;] in unique_user_set) \
  22. &lt;= total_unique_users
  23. # Group A unique users constraint
  24. model += lpSum(variables[i] for i in group_a_indices if dataset.loc[i, &#39;respondent_id&#39;] in unique_user_set) \
  25. &lt;= group_a_unique_users
  26. # Group B unique users constraint
  27. model += lpSum(variables[i] for i in group_b_indices if dataset.loc[i, &#39;respondent_id&#39;] in unique_user_set) \
  28. &lt;= group_b_unique_users
  29. model.solve()
  30. results = {}
  31. for user_id, (_, _, _, _, _, _) in users.items():
  32. group_a_indices = dataset[dataset[&#39;group&#39;] == &#39;A&#39;].index
  33. group_b_indices = dataset[dataset[&#39;group&#39;] == &#39;B&#39;].index
  34. matched_a = [dataset.loc[i, &#39;respondent_id&#39;] for i in group_a_indices if variables[i].value() == 1]
  35. matched_b = [dataset.loc[i, &#39;respondent_id&#39;] for i in group_b_indices if variables[i].value() == 1]
  36. results[user_id] = {&#39;group_a&#39;: matched_a, &#39;group_b&#39;: matched_b}
  37. return results

Where am I going wrong?

答案1

得分: 2

您已经设置了一个优化目标和优化条件,但我认为不应该设置目标(将条件保留为默认)。换句话说,一旦您完成了公式化,print(prob) 应该包括以下内容:

  1. MINIMIZE
  2. None

您不应该有关于总权重、总唯一用户等的变量:而应该只有一种决策变量类型,即判断是否将ID分配给用户的二进制分配。

"total" 约束是多余的,您可以处理单个组的值而不是总和。

由于您正在使用 Pandas,只要数据的形状正确,您不需要使用 lpSum;您可以直接操作框架和系列(.dot().sum())。

dataset 是一个数据框(很好!)需要将 id 转换为索引。users 应该从字典转换为数据框。

  1. import pandas as pd
  2. import pulp
  3. requirements = pd.DataFrame(
  4. data={
  5. 1: (5.0, 4.0, 1.0, 2, 1, 1),
  6. 2: (8.0, 6.0, 2.0, 3, 2, 1),
  7. },
  8. # row index, soon to turn into column names via .T
  9. index=pd.MultiIndex.from_product(
  10. iterables=(('weight', 'count'), ('total', 'A', 'B')),
  11. names=('requirement', 'group'),
  12. ),
  13. ).drop(labels='total', level='group').T # redundant
  14. requirements.index.name = 'user'
  15. user_values = requirements.index.to_series()
  16. requirements = requirements.stack(level='group')
  17. print(requirements, end='\n\n')
  18. dataset = pd.DataFrame([
  19. {'id': 1, 'group': 'A', 'weight': 1},
  20. {'id': 2, 'group': 'A', 'weight': 2},
  21. {'id': 3, 'group': 'A', 'weight': 3},
  22. {'id': 4, 'group': 'A', 'weight': 3},
  23. {'id': 5, 'group': 'A', 'weight': 4},
  24. {'id': 6, 'group': 'A', 'weight': 6},
  25. {'id': 7, 'group': 'A', 'weight': 7},
  26. {'id': 8, 'group': 'A', 'weight': 8},
  27. {'id': 9, 'group': 'B', 'weight': 2},
  28. {'id': 10, 'group': 'B', 'weight': 1},
  29. ]).set_index('id')
  30. print(dataset, end='\n\n')
  31. combos = pd.merge(left=user_values, right=dataset.index.to_series(), how='cross')
  32. asn_name = 'asn_u' + combos.user.astype(str) + '_i' + combos.id.astype(str)
  33. combos['asn'] = asn_name.apply(pulp.LpVariable, cat=pulp.LpBinary)
  34. combos = combos.set_index(['user', 'id']).asn.unstack(level='user')
  35. print(combos, end='\n\n')
  36. prob = pulp.LpProblem(name='user_assignment')
  37. # Every ID can be assigned to at most one user
  38. for user, total in combos.sum(axis=1).items():
  39. prob.addConstraint(name=f'excl_u{user}', constraint=total <= 1)
  40. for group, dataset_weights in dataset.groupby('group'):
  41. group_assigns = combos.loc[dataset_weights.index]
  42. for user, user_assigns in group_assigns.items():
  43. requirement = requirements.loc[(user, group), :]
  44. prob.addConstraint(
  45. name=f'count_u{user}_g{group}',
  46. constraint=user_assigns.sum() == requirement['count'],
  47. )
  48. prob.addConstraint(
  49. name=f'weight_u{user}_g{group}',
  50. constraint=user_assigns.dot(dataset_weights.weight) == requirement['weight'],
  51. )
  52. print(prob)
  53. prob.solve()
  54. assert prob.status == pulp.LpStatusOptimal
  55. combos = combos.applymap(pulp.LpVariable.value).astype(int)
  56. combos = combos[combos.any(axis=1)]
  57. print(combos)
  1. requirement count weight
  2. user group
  3. 1 A 1.0 4.0
  4. B 1.0 1.0
  5. 2 A 2.0 6.0
  6. B 1.0 2.0
  7. group weight
  8. id
  9. 1 A 1
  10. 2 A 2
  11. 3 A 3
  12. 4 A 3
  13. 5 A 4
  14. 6 A 6
  15. 7 A 7
  16. 8 A 8
  17. 9 B 2
  18. 10 B 1
  19. user 1 2
  20. id
  21. 1 asn_u1_i1 asn_u2_i1
  22. 2 asn_u1_i2 asn_u2_i2
  23. 3 asn_u1_i3 asn_u2_i3
  24. 4 asn_u1_i4 asn_u2_i4
  25. 5 asn_u1_i5 asn_u2_i5
  26. 6 asn_u1_i6 asn_u2_i6
  27. 7 asn_u1_i7 asn_u2_i7
  28. 8 asn_u1_i8 asn_u2_i8
  29. 9 asn_u1_i9 asn_u2_i9
  30. 10 asn_u1_i10 asn_u2_i10
  31. user_assignment:
  32. MINIMIZE
  33. None
  34. SUBJECT TO
  35. excl_u1: asn_u1_i1 + asn_u2_i1 <= 1
  36. excl_u2: asn_u1_i2 + asn_u2_i2 <= 1
  37. ...
  38. count_u1_gA: asn_u1_i1 + asn_u1_i2 + asn_u1_i3 + asn_u1_i4 + asn_u1_i5
  39. + asn_u1_i6 + asn_u1_i7 + asn_u1_i8 = 1
  40. weight_u1_gA: asn_u1_i1 + 2 asn_u1_i2 + 3 asn_u1_i3 + 3 asn_u1_i4
  41. + 4 asn_u1_i5 + 6 asn_u1_i6 + 7 asn_u1_i7 + 8 asn_u1_i8 = 4
  42. ...
  43. VARIABLES
  44. 0 <= asn_u1_i1 <= 1 Integer
  45. 0 <= asn_u1_i10 <= 1 Integer
  46. 0 <= asn_u1_i2 <= 1 Integer
  47. 0 <= asn_u1_i3 <= 1 Integer
  48. 0 <= asn_u1_i4 <= 1 Integer
  49. 0 <= asn_u1_i5 <= 1 Integer
  50. 0 <= asn_u1_i6 <= 1 Integer
  51. 0 <= asn_u1_i7 <= 1 Integer
  52. 0 <= asn_u1_i8 <= 1 Integer
  53. 0 <= asn
  54. <details>
  55. <summary>英文:</summary>
  56. You&#39;ve set an optimisation objective and optimisation sense when I believe there should be no objective (and the sense left as default). In other words, once you&#39;re done formulating, `print(prob)` should include
  57. ```none
  58. MINIMIZE
  59. None

You should not have variables for total weight, total unique users etc: instead, only have one decision variable kind, which is a binary assignment of whether an ID has been assigned to a user.

The "total" constraint is redundant and you can deal with the individual group values without the total.

Since you're in Pandas, so long as your data are shaped correctly, you do not need to use lpSum; you can operate on the frames and series directly (.dot(), .sum()).

dataset is a frame (good!) that needs id to move to an index. users should be converted from a dictionary to a frame.

  1. import pandas as pd
  2. import pulp
  3. requirements = pd.DataFrame(
  4. data={
  5. 1: (5.0, 4.0, 1.0, 2, 1, 1),
  6. 2: (8.0, 6.0, 2.0, 3, 2, 1),
  7. },
  8. # row index, soon to turn into column names via .T
  9. index=pd.MultiIndex.from_product(
  10. iterables=((&#39;weight&#39;, &#39;count&#39;), (&#39;total&#39;, &#39;A&#39;, &#39;B&#39;)),
  11. names=(&#39;requirement&#39;, &#39;group&#39;),
  12. ),
  13. ).drop(labels=&#39;total&#39;, level=&#39;group&#39;).T # redundant
  14. requirements.index.name = &#39;user&#39;
  15. user_values = requirements.index.to_series()
  16. requirements = requirements.stack(level=&#39;group&#39;)
  17. print(requirements, end=&#39;\n\n&#39;)
  18. dataset = pd.DataFrame([
  19. {&#39;id&#39;: 1, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 1},
  20. {&#39;id&#39;: 2, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 2},
  21. {&#39;id&#39;: 3, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 3},
  22. {&#39;id&#39;: 4, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 3},
  23. {&#39;id&#39;: 5, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 4},
  24. {&#39;id&#39;: 6, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 6},
  25. {&#39;id&#39;: 7, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 7},
  26. {&#39;id&#39;: 8, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 8},
  27. {&#39;id&#39;: 9, &#39;group&#39;: &#39;B&#39;, &#39;weight&#39;: 2},
  28. {&#39;id&#39;: 10, &#39;group&#39;: &#39;B&#39;, &#39;weight&#39;: 1},
  29. ]).set_index(&#39;id&#39;)
  30. print(dataset, end=&#39;\n\n&#39;)
  31. combos = pd.merge(left=user_values, right=dataset.index.to_series(), how=&#39;cross&#39;)
  32. asn_name = &#39;asn_u&#39; + combos.user.astype(str) + &#39;_i&#39; + combos.id.astype(str)
  33. combos[&#39;asn&#39;] = asn_name.apply(pulp.LpVariable, cat=pulp.LpBinary)
  34. combos = combos.set_index([&#39;user&#39;, &#39;id&#39;]).asn.unstack(level=&#39;user&#39;)
  35. print(combos, end=&#39;\n\n&#39;)
  36. prob = pulp.LpProblem(name=&#39;user_assignment&#39;)
  37. # Every ID can be assigned to at most one user
  38. for user, total in combos.sum(axis=1).items():
  39. prob.addConstraint(name=f&#39;excl_u{user}&#39;, constraint=total &lt;= 1)
  40. for group, dataset_weights in dataset.groupby(&#39;group&#39;):
  41. group_assigns = combos.loc[dataset_weights.index]
  42. for user, user_assigns in group_assigns.items():
  43. requirement = requirements.loc[(user, group), :]
  44. prob.addConstraint(
  45. name=f&#39;count_u{user}_g{group}&#39;,
  46. constraint=user_assigns.sum() == requirement[&#39;count&#39;],
  47. )
  48. prob.addConstraint(
  49. name=f&#39;weight_u{user}_g{group}&#39;,
  50. constraint=user_assigns.dot(dataset_weights.weight) == requirement[&#39;weight&#39;],
  51. )
  52. print(prob)
  53. prob.solve()
  54. assert prob.status == pulp.LpStatusOptimal
  55. combos = combos.applymap(pulp.LpVariable.value).astype(int)
  56. combos = combos[combos.any(axis=1)]
  57. print(combos)
  1. requirement count weight
  2. user group
  3. 1 A 1.0 4.0
  4. B 1.0 1.0
  5. 2 A 2.0 6.0
  6. B 1.0 2.0
  7. group weight
  8. id
  9. 1 A 1
  10. 2 A 2
  11. 3 A 3
  12. 4 A 3
  13. 5 A 4
  14. 6 A 6
  15. 7 A 7
  16. 8 A 8
  17. 9 B 2
  18. 10 B 1
  19. user 1 2
  20. id
  21. 1 asn_u1_i1 asn_u2_i1
  22. 2 asn_u1_i2 asn_u2_i2
  23. 3 asn_u1_i3 asn_u2_i3
  24. 4 asn_u1_i4 asn_u2_i4
  25. 5 asn_u1_i5 asn_u2_i5
  26. 6 asn_u1_i6 asn_u2_i6
  27. 7 asn_u1_i7 asn_u2_i7
  28. 8 asn_u1_i8 asn_u2_i8
  29. 9 asn_u1_i9 asn_u2_i9
  30. 10 asn_u1_i10 asn_u2_i10
  31. user_assignment:
  32. MINIMIZE
  33. None
  34. SUBJECT TO
  35. excl_u1: asn_u1_i1 + asn_u2_i1 &lt;= 1
  36. excl_u2: asn_u1_i2 + asn_u2_i2 &lt;= 1
  37. ...
  38. count_u1_gA: asn_u1_i1 + asn_u1_i2 + asn_u1_i3 + asn_u1_i4 + asn_u1_i5
  39. + asn_u1_i6 + asn_u1_i7 + asn_u1_i8 = 1
  40. weight_u1_gA: asn_u1_i1 + 2 asn_u1_i2 + 3 asn_u1_i3 + 3 asn_u1_i4
  41. + 4 asn_u1_i5 + 6 asn_u1_i6 + 7 asn_u1_i7 + 8 asn_u1_i8 = 4
  42. ...
  43. VARIABLES
  44. 0 &lt;= asn_u1_i1 &lt;= 1 Integer
  45. 0 &lt;= asn_u1_i10 &lt;= 1 Integer
  46. 0 &lt;= asn_u1_i2 &lt;= 1 Integer
  47. 0 &lt;= asn_u1_i3 &lt;= 1 Integer
  48. 0 &lt;= asn_u1_i4 &lt;= 1 Integer
  49. 0 &lt;= asn_u1_i5 &lt;= 1 Integer
  50. 0 &lt;= asn_u1_i6 &lt;= 1 Integer
  51. 0 &lt;= asn_u1_i7 &lt;= 1 Integer
  52. 0 &lt;= asn_u1_i8 &lt;= 1 Integer
  53. 0 &lt;= asn_u1_i9 &lt;= 1 Integer
  54. 0 &lt;= asn_u2_i1 &lt;= 1 Integer
  55. 0 &lt;= asn_u2_i10 &lt;= 1 Integer
  56. 0 &lt;= asn_u2_i2 &lt;= 1 Integer
  57. 0 &lt;= asn_u2_i3 &lt;= 1 Integer
  58. 0 &lt;= asn_u2_i4 &lt;= 1 Integer
  59. 0 &lt;= asn_u2_i5 &lt;= 1 Integer
  60. 0 &lt;= asn_u2_i6 &lt;= 1 Integer
  61. 0 &lt;= asn_u2_i7 &lt;= 1 Integer
  62. 0 &lt;= asn_u2_i8 &lt;= 1 Integer
  63. 0 &lt;= asn_u2_i9 &lt;= 1 Integer
  64. ...
  65. Result - Optimal solution found
  66. user 1 2
  67. id
  68. 3 0 1
  69. 4 0 1
  70. 5 1 0
  71. 9 0 1
  72. 10 1 0

答案2

得分: 1

以下是翻译好的部分:

你确实在正确的方向上... 有点

我在第一次尝试中无法真正理解逻辑。不清楚为什么要创建这么多变量。在核心上,这是一个分配问题,我们正在为user分配id。所以我们真正需要的唯一变量应该是类似于:

  1. assign[item, user] # 一个指示分配的二进制变量

然后,如果我们需要分配的总权重,我们可以按标准方式总结分配乘以权重。不需要更多的变量。

第二次尝试在更好的方向上,但是变量设置错误。您真的想要按照我下面展示的方式对其进行双索引,然后一切就会顺利进行。当您习惯在数学编程环境中思考这个问题时,事情会变得更容易。您需要正确的变量和一些问题中指示所有id集合以及组的子集。如果有许多组,我可能会重新组织数据以能够按组进行索引,但是对于2个组,这是可以的。

关于您的循环,您可以将我下面的内容组合成一个循环"对于每个用户"并在其中包含所有约束,但我认为如果您对每个约束进行一个循环更容易理解。最终这是一种偏好。

我认为您可能可以完成我下面已经开始的内容,以获取其余的约束并解决问题。如果卡住了,请回复评论。

代码:

  1. # 组分配
  2. import pandas as pd
  3. import pulp
  4. ### 数据
  5. users = {
  6. 1: (5.0, 4.0, 1.0, 2, 1, 1),
  7. 2: (8.0, 6.0, 2.0, 3, 2, 1)
  8. }
  9. dataset = pd.DataFrame([
  10. {'id': 1, 'group': 'A', 'weight': 1},
  11. {'id': 2, 'group': 'A', 'weight': 2},
  12. {'id': 3, 'group': 'A', 'weight': 3},
  13. {'id': 4, 'group': 'A', 'weight': 3},
  14. {'id': 5, 'group': 'A', 'weight': 4},
  15. {'id': 6, 'group': 'A', 'weight': 6},
  16. {'id': 7, 'group': 'A', 'weight': 7},
  17. {'id': 8, 'group': 'A', 'weight': 8},
  18. {'id': 9, 'group': 'B', 'weight': 2},
  19. {'id': 10, 'group': 'B', 'weight': 1} # <-- 请注意这一行中有一个拼写错误
  20. ]).set_index('id')
  21. ### 一些整理和方便的工作...
  22. for user in users:
  23. users[user] = dict(zip(('wt_limit', 'a_wt_limit', 'b_wt_limit', 'item_limit', 'a_item_limit', 'b_item_limit'), users[user]))
  24. ids = dataset.index
  25. print(ids)
  26. # 子集
  27. grp_a_idx = dataset[dataset['group'] == 'A'].index
  28. grp_b_idx = dataset[dataset['group'] == 'B'].index
  29. ### 问题设置
  30. prob = pulp.LpProblem('id_assigner', pulp.LpMaximize)
  31. ### 集合
  32. id_user = {(i, u) for i in dataset.index for u in users}
  33. ### 变量
  34. assign = pulp.LpVariable.dicts('assign', indices=id_user, cat=pulp.LpBinary)
  35. ### 目标:最大化分配
  36. prob += pulp.lpSum(assign)
  37. ### 约束
  38. # 1. 每个id只能分配一次(或者不分配)
  39. for i in ids:
  40. prob += sum(assign[i, user] for user in users) <= 1
  41. # 2. 不要超出每个用户的总权重
  42. for user in users:
  43. prob += sum(assign[i, user]*dataset.loc[i, 'weight'] for i in ids ) <= users[user]['wt_limit']
  44. # 3. 不要超出组A的权重,对于每个用户
  45. for user in users:
  46. prob += sum(assign[i, user]*dataset.loc[i, 'weight'] for i in grp_a_idx) <= users[user]['a_wt_limit']
  47. # ..... 与其他约束类似。根据需要使用子集进行组特定的操作,例如上面的#3。
  48. # 检查一下:
  49. print(prob)
  50. ### 解决
  51. ### 结果

希望这能帮助您理解代码的内容。如果有任何问题,请随时提问。

英文:

You are on the right track... kinda.

I couldn't really follow the logic in the first attempt. It isn't clear why you were creating so many variables. At the core, this is an assignment problem, where we are assigning ids to users. So the only variable we should really need is something like:

  1. assign[item, user] # a binary variable indicating assignment

Then, if we need the total weight assigned, we can just sum up the assignments times the weights in standard fashion. No need for more variables.

The second attempt is on a better track, but the variable is set up wrong, You really want to double-index it as I show below, then things fall into place. The more you get used to thinking about this in a math programming context, the easier things get. You need the correct variable and some subsets in your problem indicating the set of all ids and then the groups. If you had many groups, I'd probably reorganize the data to be able to index by group as well, but for 2 groups, this is fine.

Regarding the loops you have, you can combine what I did below into a loop "for each user" and include all the constraints there, but I think it is easier to digest if you do a loop per constraint. In the end it is a preference

I think you can probably finish what I've started below to scoop up the rest of the constraints and solve. Comment back if you are stuck.

Code:

  1. # group assignment
  2. import pandas as pd
  3. import pulp
  4. ### DATA
  5. users = {
  6. 1: (5.0, 4.0, 1.0, 2, 1, 1),
  7. 2: (8.0, 6.0, 2.0, 3, 2, 1)
  8. }
  9. dataset = pd.DataFrame([
  10. {&#39;id&#39;: 1, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 1},
  11. {&#39;id&#39;: 2, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 2},
  12. {&#39;id&#39;: 3, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 3},
  13. {&#39;id&#39;: 4, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 3},
  14. {&#39;id&#39;: 5, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 4},
  15. {&#39;id&#39;: 6, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 6},
  16. {&#39;id&#39;: 7, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 7},
  17. {&#39;id&#39;: 8, &#39;group&#39;: &#39;A&#39;, &#39;weight&#39;: 8},
  18. {&#39;id&#39;: 9, &#39;group&#39;: &#39;B&#39;, &#39;weight&#39;: 2},
  19. {&#39;id&#39;: 10, &#39;group&#39;: &#39;B&#39;, &#39;weight&#39;: 1} # &lt;-- note you had a typo in this row
  20. ]).set_index(&#39;id&#39;)
  21. ### a little tidying &amp; conveniences...
  22. for user in users:
  23. users[user] = dict(zip( (&#39;wt_limit&#39;, &#39;a_wt_limit&#39;, &#39;b_wt_limit&#39;, &#39;item_limit&#39;, &#39;a_item_limit&#39;, &#39;b_item_limit&#39;), users[user]))
  24. ids = dataset.index
  25. print(ids)
  26. # subsets
  27. grp_a_idx = dataset[dataset[&#39;group&#39;] == &#39;A&#39;].index
  28. grp_b_idx = dataset[dataset[&#39;group&#39;] == &#39;B&#39;].index
  29. ### PROBLEM SETUP
  30. prob = pulp.LpProblem(&#39;id_assigner&#39;, pulp.LpMaximize)
  31. ### SETS
  32. id_user = { (i, u) for i in dataset.index for u in users }
  33. ### VARS
  34. assign = pulp.LpVariable.dicts(&#39;assign&#39;, indices=id_user, cat=pulp.LpBinary)
  35. ### OBJ: Maximize assignments
  36. prob += pulp.lpSum(assign)
  37. ### CONSTRAINTS
  38. # 1. Can only assign each id only once (or not at all)
  39. for i in ids:
  40. prob += sum(assign[i, user] for user in users) &lt;= 1
  41. # 2. Don&#39;t overassign total weight, for each user
  42. for user in users:
  43. prob += sum(assign[i, user]*dataset.loc[i, &#39;weight&#39;] for i in ids ) &lt;= users[user][&#39;wt_limit&#39;]
  44. # 3. Don&#39;t overassign weights for grp A, for each user
  45. for user in users:
  46. prob += sum(assign[i, user]*dataset.loc[i, &#39;weight&#39;] for i in grp_a_idx) &lt;= users[user][&#39;a_wt_limit&#39;]
  47. # ..... similarly for other constraints. Use the subsets as needed for group-specific things, like #3 above.
  48. # check it:
  49. print(prob)
  50. ### SOLVE
  51. ### RESULTS

huangapple
  • 本文由 发表于 2023年7月10日 19:00:07
  • 转载请务必保留本文链接:https://go.coder-hub.com/76653057.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定