Occurrence of non-repeated strings in a list of lists of strings (update with more conditions)

Question

    from collections import Counter

    List = [['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang'], ['Rana', ['Z', 'Y']]]

    # Flatten the nested list and count occurrences.
    # Only strings are kept: the inner ['Z', 'Y'] list is unhashable and would make Counter raise a TypeError.
    flat_list = [item for sublist in List for item in sublist if isinstance(item, str)]
    word_counts = Counter(flat_list)

    # First solution: keep only the words that occur exactly once
    result1 = {word: count for word, count in word_counts.items() if count == 1}

    # Second solution: count every occurrence; a nested list contributes its first element only
    result2 = {}
    for sublist in List:
        for item in sublist:
            if isinstance(item, list):
                result2[item[0]] = result2.get(item[0], 0) + 1
            else:
                result2[item] = result2.get(item, 0) + 1

    # Print the results
    result1_text = '\n'.join([f'{key}:{value}' for key, value in result1.items()])
    result2_text = '\n'.join([f'{key}:{value}' for key, value in result2.items()])
    print(result1_text)
    print(result2_text)

Output for the first solution:

    Jhon:1
    Zhang:1

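Output for the second solution (every occurrence counted, plus the first element of the nested list):

    Jhon:1
    Rana:5
    Zhang:1
    Z:1

Note that in the second solution 'Rana' comes out as 5 because the loop counts every occurrence, including both entries of ['Rana', 'Rana']. Of the nested relationship ['Z', 'Y'], only its first element 'Z' is recorded.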


I have a list of lists like this one:

    List = [['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang'], ['Rana', ['Z', 'Y']]]

Update: this list also contains another structure in which the first element is a single name and the second element is a list, as in ['Rana', ['Z', 'Y']]. It means that Rana has a specific relationship with Z and Y, which is not the same kind of relationship as the one between Rana and Jhon.

I want to count the occurrences of each word in this list, and I need two kinds of output. In the first one, when a pair contains a duplicated (repeated) word, we ignore it. In the second one, when we detect a repeated word, we count it once, not twice.

Update: I want this type of relationship to be included in the second output.

For example, for the first solution the result will be:

    Rana:2
    Jhon:1
    Zhang:1

The second solution will be:

    Rana:3 (Update: will be 5 instead of 3, since we will consider Z and Y)
    Jhon:1
    Zhang:1

I have tried to write the following lines of code, but I didn't get any results:

    from collections import Counter

    List1 = [["Rana", "Jhon"], ["Rana", "Rana"], ["Jhon", "Rana"], ["Rana", "Alex"]]
    count = 0
    n = 0
    for j in range(0, len(List1) - 1):
        if (List1[j][0] == List1[j][1]) or (List1[j][0] != List1[j][1]):
            count += 1
    print(count)

Answer 1

Score: 1


For the first part, you can ignore the inner lists that contain repeated values:

    from collections import Counter as c

    Lst = [['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang']]
    # len(set(x)) == len(x) keeps only the inner lists whose elements are all unique
    l = [x for x in Lst if len(set(x)) == len(x)]
    # [['Jhon', 'Rana'], ['Rana', 'Zhang']]

    # Flatten l
    m = [item for sublist in l for item in sublist]
    # ['Jhon', 'Rana', 'Rana', 'Zhang']

    print(c(m))
    # Output:
    # Counter({'Rana': 2, 'Jhon': 1, 'Zhang': 1})

set(x) converts x into a set, which drops repeated values. If x contains repeated values, len(x) will NOT be equal to len(set(x)).
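To see the length comparison on its own, here is a minimal sketch. The helper name `has_duplicates` is made up for this illustration and does not appear in the answer.

```python
def has_duplicates(items):
    """Return True if any value appears more than once in `items`."""
    # A set keeps one copy of each value, so it is shorter than the
    # original list exactly when something was repeated.
    return len(set(items)) != len(items)


pairs = [['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang']]
flags = [has_duplicates(pair) for pair in pairs]
print(flags)  # [False, True, False]
```

This only works when the elements are hashable (strings are); a nested list inside a pair would make set() raise a TypeError.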

For the second part, if an inner list contains the same value twice, you can keep just one copy of it:

    from collections import Counter as c

    Lst = [['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang']]
    # An inner list with repeated values is replaced by a one-element list
    l = [x if len(set(x)) == len(x) else [x[0]] for x in Lst]
    # [['Jhon', 'Rana'], ['Rana'], ['Rana', 'Zhang']]

    # Flatten l
    n = [item for sublist in l for item in sublist]
    # ['Jhon', 'Rana', 'Rana', 'Rana', 'Zhang']

    print(c(n))
    # Output:
    # Counter({'Rana': 3, 'Jhon': 1, 'Zhang': 1})
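The two parts can also be computed in a single pass. The sketch below is one possible way to do that, not the answerer's code: the function name `count_names` and its return order are assumptions made for the example, and it expects every inner list to hold plain strings.

```python
from collections import Counter


def count_names(pairs):
    """Return (ignoring_repeats, repeats_once) Counters for lists of names."""
    ignoring_repeats = Counter()
    repeats_once = Counter()
    for pair in pairs:
        unique = set(pair)
        # A pair such as ['Rana', 'Rana'] collapses to a single name here,
        # so it adds 1 (not 2) to the second count.
        repeats_once.update(unique)
        if len(unique) == len(pair):
            # Only pairs without a repeated name feed the first count.
            ignoring_repeats.update(unique)
    return ignoring_repeats, repeats_once


Lst = [['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang']]
first, second = count_names(Lst)
print(dict(first))
print(dict(second))
```

For the sample list, 'Rana' ends up with 2 in the first counter and 3 in the second, while 'Jhon' and 'Zhang' get 1 in both. The key order of the printed dictionaries may vary because sets are unordered.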

Edit:

Now, if you have a list with reversed pairs like:

    Lst1 = [["Rana", "Jhon"], ["Rana", "Rana"], ["Jhon", "Rana"], ["Rana", "Alex"]]

You can remove ["Jhon", "Rana"] and ["Rana", "Rana"] by:

    from collections import Counter as c

    seen = set()
    # ["Rana", "Rana"] is its own reverse, so the len(set(x)) check drops it before it is recorded;
    # any other pair is dropped when its reverse has already been seen.
    new_Lst1 = [x for x in Lst1
                if len(set(x)) == len(x) and tuple(x[::-1]) not in seen and not seen.add(tuple(x))]
    print(new_Lst1)
    # [['Rana', 'Jhon'], ['Rana', 'Alex']]

    # Flatten new_Lst1
    p = [item for sublist in new_Lst1 for item in sublist]
    # ['Rana', 'Jhon', 'Rana', 'Alex']

    print(c(p))
    # Output:
    # Counter({'Rana': 2, 'Jhon': 1, 'Alex': 1})
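The one-line comprehension leans on seen.add() returning None, which is easy to misread. As an alternative, the same idea can be sketched with an explicit loop and a frozenset as an order-insensitive key. The name `drop_reversed_pairs` is invented for this example, and unlike the comprehension it also drops exact duplicates of a pair.

```python
def drop_reversed_pairs(pairs):
    """Keep the first occurrence of each unordered pair of distinct names."""
    seen = set()
    kept = []
    for pair in pairs:
        # frozenset ignores order, so ['Jhon', 'Rana'] and ['Rana', 'Jhon']
        # produce the same key.
        key = frozenset(pair)
        if len(key) < len(pair):
            continue  # a self-pair such as ['Rana', 'Rana']
        if key in seen:
            continue  # this pair, in either order, was already kept
        seen.add(key)
        kept.append(pair)
    return kept


Lst1 = [["Rana", "Jhon"], ["Rana", "Rana"], ["Jhon", "Rana"], ["Rana", "Alex"]]
print(drop_reversed_pairs(Lst1))  # [['Rana', 'Jhon'], ['Rana', 'Alex']]
```

The result can then be flattened and passed to Counter exactly as in the steps above.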

Answer 2

Score: 1


From what I can understand of your description, using https://stackoverflow.com/a/30357006:

    from collections import defaultdict

    list_1 = [['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang']]
    # list_1 = [['Rana', 'Jhon'], ['Jhon', 'Rana']]

    # Sort each pair so that reversed pairs compare equal, then keep each pair once
    seen = []
    for sub in list_1:
        sub = sorted(sub)
        if sub not in seen:
            seen.append(sub)

    res_1 = defaultdict(int)
    res_2 = defaultdict(int)
    for sub in seen:
        a, b = sub
        if a == b:
            # A repeated name counts once, and only in the second result
            res_2[a] += 1
        else:
            res_2[a] += 1
            res_2[b] += 1
            res_1[a] += 1
            res_1[b] += 1

    print(dict(res_1))  #=> {'Jhon': 1, 'Rana': 2, 'Zhang': 1}
    print(dict(res_2))  #=> {'Jhon': 1, 'Rana': 3, 'Zhang': 1}

Or add more cases, as already commented.
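One such extra case is the nested entry from the question's update, ['Rana', ['Z', 'Y']]. The sketch below is only one possible reading of that update: it assumes the first name gets one count per related name and that the related names themselves are not counted, which is what makes 'Rana' reach 5. The function name `count_with_relations` is hypothetical, and the reversed-pair deduplication is left out to keep the example short.

```python
from collections import Counter


def count_with_relations(entries):
    """Second-style count that also accepts [name, [related, ...]] entries."""
    counts = Counter()
    for first, second in entries:
        if isinstance(second, list):
            # Assumption: one count for `first` per related name;
            # the related names themselves are not counted.
            counts[first] += len(second)
        elif first == second:
            # A repeated name counts once.
            counts[first] += 1
        else:
            counts[first] += 1
            counts[second] += 1
    return counts


data = [['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang'], ['Rana', ['Z', 'Y']]]
result = count_with_relations(data)
print(dict(result))  # {'Jhon': 1, 'Rana': 5, 'Zhang': 1}
```

If Z and Y should also appear in the output, the nested branch would need to increment their counts as well; the question does not say which behaviour is wanted.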

huangapple
  • Published on June 5, 2023 at 22:09:32
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76407282.html