
Occurrence of non-repeated strings in a list of lists of strings (updated with more conditions)

Question

from collections import Counter

List = [['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang'], ['Rana', ['Z', 'Y']]]

# First solution: ignore pairs that repeat the same word, and ignore the
# special ['Rana', ['Z', 'Y']] relationship
plain_pairs = [pair for pair in List if not isinstance(pair[1], list)]
result1 = Counter(word for pair in plain_pairs if pair[0] != pair[1] for word in pair)

# Second solution: a repeated word counts once, and the special relationship
# adds one count per related name
result2 = {}
for first, second in List:
    if isinstance(second, list):
        result2[first] = result2.get(first, 0) + len(second)
    elif first == second:
        result2[first] = result2.get(first, 0) + 1
    else:
        for word in (first, second):
            result2[word] = result2.get(word, 0) + 1

# Print the results
result1_text = '\n'.join(f'{key}:{value}' for key, value in result1.items())
result2_text = '\n'.join(f'{key}:{value}' for key, value in result2.items())

print(result1_text)
print(result2_text)

Output for the first solution:

Jhon:1
Rana:2
Zhang:1

Output for the second solution (including the special relationship):

Jhon:1
Rana:5
Zhang:1

Please note that in the second solution, 'Rana' is counted as 5 because the relationship with 'Z' and 'Y' adds 2 to the 3 counts that come from the ordinary pairs.

Original question:

I have a list of lists like this one:

List=[['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang'], ['Rana', ['Z', 'Y']]]

Update: In this list, I also have another structure in which the first element is a single element and the second element is a list, like ['Rana', ['Z', 'Y']]. This means that Rana has a specific relationship with Z and Y, which is not the same kind of relationship as the one between Rana and Jhon.

I want to count the occurrences of each word in this list, and I need two kinds of output. In the first one, when an inner list contains a duplicated (repeated) word, we ignore it. In the second one, when we detect a repeated word, we count it once, not twice.

Update: I want this type of relation to be included in the second output.

For example, for the first solution the result will be:
Rana:2
Jhon:1
Zhang:1

For the second solution it will be:
Rana:3 (Update: this will be 5 instead of 3, since we also consider Z and Y)
Jhon:1
Zhang:1

I tried to write the following lines of code, but I did not get the expected results:

from collections import Counter
List1=[["Rana", "Jhon"], ["Rana", "Rana"], ["Jhon", "Rana"], ["Rana", "Alex"]]
count=0
n=0
for j in range (0, len(List1)-1):
  if (List1[j][0] == List1[j][1] ) or (List1[j][0] != List1[j][1] ):
    count += 1
print(count)

Answer 1

Score: 1


For the first part, you can ignore the inner lists that contain repeated values like this:

Lst=[['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang']]

l = [x for x in Lst if len(set(x))==len(x)]     # keep only inner lists whose elements are all distinct
#[['Jhon', 'Rana'], ['Rana', 'Zhang']]

#Flatten l by

m = [item for sublist in l for item in sublist]
#['Jhon', 'Rana', 'Rana', 'Zhang']

from collections import Counter as c
c(m)

#output
Counter({'Jhon': 1, 'Rana': 2, 'Zhang': 1})

set(x) converts x into a set, which drops duplicates. If x contains repeated values, len(x) will not be equal to len(set(x)).
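
As a quick illustration of that check, here is a minimal sketch. The helper name has_repeats is made up for this example and is not part of the answer.

```python
def has_repeats(items):
    """Return True if any value occurs more than once in items."""
    # A set keeps one copy of each value, so it is shorter than the
    # original list exactly when the list contains a repeat.
    return len(set(items)) != len(items)


pairs = [['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang']]
flags = [has_repeats(pair) for pair in pairs]
print(flags)  # [False, True, False]
```

Note that set() needs hashable items, so this check raises a TypeError on an entry such as ['Rana', ['Z', 'Y']] from the updated question.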

For the second part, if an inner list contains the same value twice, you can keep just one copy of it:

Lst=[['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang']]

l = [x if len(set(x))==len(x) else [x[0]] for x in Lst ]
#[['Jhon', 'Rana'], ['Rana'], ['Rana', 'Zhang']]

#Flatten l by:

n = [item for sublist in l for item in sublist]
#['Jhon', 'Rana', 'Rana', 'Rana', 'Zhang']

from collections import Counter as c
c(n)

#output
Counter({'Jhon': 1, 'Rana': 3, 'Zhang': 1})
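
Note that [x[0]] keeps only the first element, which is fine for two-item inner lists. If inner lists could be longer (an assumption that goes beyond the question), a possible sketch is to drop repeats within each inner list while keeping the order. The data below is hypothetical.

```python
from collections import Counter

# Hypothetical data: the last inner list is longer than in the question.
Lst = [['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang', 'Zhang']]

# dict.fromkeys keeps the first occurrence of each key, in order,
# so each word is counted at most once per inner list.
deduped = [list(dict.fromkeys(inner)) for inner in Lst]
print(deduped)  # [['Jhon', 'Rana'], ['Rana'], ['Rana', 'Zhang']]

counts = Counter(word for inner in deduped for word in inner)
print(dict(counts))  # {'Jhon': 1, 'Rana': 3, 'Zhang': 1}
```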

Edit:

Now, if you have a list with reversed pairs like:

Lst1=[["Rana", "Jhon"], ["Rana", "Rana"], ["Jhon", "Rana"], ["Rana", "Alex"]]

You can remove ["Jhon", "Rana"] and ["Rana", "Rana"] by:

# ["Rana", "Rana"] is its own reverse, but the `seen` check alone would keep it
# the first time it appears, so it is filtered out explicitly with x[0] != x[1].

seen = set()
new_Lst1 = [x for x in Lst1 if x[0] != x[1] and tuple(x[::-1]) not in seen and not seen.add(tuple(x))]

print(new_Lst1)
# [['Rana', 'Jhon'], ['Rana', 'Alex']]


#Flatten new_Lst1 by

p = [item for sublist in new_Lst1 for item in sublist]
#['Rana', 'Jhon', 'Rana', 'Alex']

from collections import Counter as c
c(p)

#output
Counter({'Rana': 2, 'Jhon': 1, 'Alex': 1})
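
An alternative sketch for the reversed-pair case, not taken from the answer: normalise each pair to a sorted tuple so that ["Jhon", "Rana"] and ["Rana", "Jhon"] share one key. The function name unique_unordered_pairs is made up for illustration, and the sketch assumes each inner list holds two strings.

```python
from collections import Counter


def unique_unordered_pairs(pairs):
    """Keep the first occurrence of each unordered pair, skipping self-pairs."""
    seen = set()
    kept = []
    for pair in pairs:
        if pair[0] == pair[1]:
            continue  # a pair such as ['Rana', 'Rana'] is dropped
        key = tuple(sorted(pair))  # same key for a pair and its reverse
        if key not in seen:
            seen.add(key)
            kept.append(pair)
    return kept


Lst1 = [["Rana", "Jhon"], ["Rana", "Rana"], ["Jhon", "Rana"], ["Rana", "Alex"]]
kept = unique_unordered_pairs(Lst1)
print(kept)  # [['Rana', 'Jhon'], ['Rana', 'Alex']]

counts = Counter(word for pair in kept for word in pair)
print(dict(counts))  # {'Rana': 2, 'Jhon': 1, 'Alex': 1}
```

Unlike a check against reversed tuples only, this version also drops exact duplicates of a pair that was already kept, which may or may not be what you want.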

Answer 2

Score: 1


From what I can understand of your description, using https://stackoverflow.com/a/30357006:

from collections import defaultdict

list_1 = [['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang']]
# list_1 = [['Rana', 'Jhon'], ['Jhon', 'Rana']]

seen = []

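# Sort each pair so that a pair and its reverse compare equal, and keep
# only the first occurrence.
for sub in list_1:
    sub = sorted(sub)
    if sub not in seen:
        seen.append(sub)

# res_1 ignores pairs with a repeated word; res_2 counts such a pair once.
res_1 = defaultdict(int)
res_2 = defaultdict(int)

for sub in seen:
    a, b = sub
    if a == b:
        res_2[a] += 1
    else:
        res_2[a] += 1
        res_2[b] += 1
        res_1[a] += 1
        res_1[b] += 1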


print(dict(res_1)) #=> {'Jhon': 1, 'Rana': 2, 'Zhang': 1}
print(dict(res_2)) #=> {'Jhon': 1, 'Rana': 3, 'Zhang': 1}

You can add more cases as needed, as already mentioned in the comments.
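
One such extra case is the ['Rana', ['Z', 'Y']] entry from the updated question. The sketch below rests on an assumption about the intended rule, inferred from the expected result Rana:5: a nested entry adds one count per related name to its first element, only in the second result, and the related names themselves are not counted. The function name count_with_relations is hypothetical.

```python
from collections import defaultdict


def count_with_relations(entries):
    """Return (res_1, res_2) as plain dicts under the assumed rule above."""
    res_1 = defaultdict(int)  # pairs with a repeated word are ignored
    res_2 = defaultdict(int)  # a repeated word counts once; relations count too
    seen = set()
    for first, second in entries:
        if isinstance(second, list):
            # Assumed rule: one count for `first` per related name.
            res_2[first] += len(second)
            continue
        key = tuple(sorted((first, second)))
        if key in seen:
            continue  # same pair seen before, possibly reversed
        seen.add(key)
        if first == second:
            res_2[first] += 1
        else:
            for word in key:
                res_1[word] += 1
                res_2[word] += 1
    return dict(res_1), dict(res_2)


data = [['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang'], ['Rana', ['Z', 'Y']]]
res_1, res_2 = count_with_relations(data)
print(res_1)  # {'Jhon': 1, 'Rana': 2, 'Zhang': 1}
print(res_2)  # {'Jhon': 1, 'Rana': 5, 'Zhang': 1}
```

If the related names should be counted as well, or if the relationship should also appear in the first result, only the isinstance branch would need to change.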

huangapple
  • Posted on June 5, 2023, 22:09:32
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76407282.html