Find an elegant solution for finding the same consecutive items in two lists

Question

I have a list of lists and I want to find out whether any two of them match at least partially, meaning they share a run of consecutive items.

Example:

list_of_lists = [["a", "b", "c"], ["z", "a", "b"], ["y", "b", "c"], ["z", "a"]]

The desired output should be:

["a", "b"] # because first and second list
["b", "c"] # because first and third list
["z", "a"] # because second and last list

❗ Order and duplicates matter, so I cannot use a set. A desired output such as ["a", "b", "a"] is possible.

I could probably loop over every item with three or more nested for loops, but that seems like total overkill, and with 100 lists in the list it would be really slow.

If there is a pandas or NumPy function, or anything else in Python, that does this more efficiently, I would appreciate it.

Answer 1

Score: 2

I suspect this isn't the highest-performance solution possible, but I found a method that appears to work. Check how it works with your lists and drop a comment if it needs tweaking.

It concatenates the items of each list, joins the lists into one comma-delimited string, then uses a regular expression to search for repeating sequences.

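import re

list_of_lists = [["a", "b", "c"], ["z", "a", "b"], ["y", "b", "c"], ["z", "a"]]

# Capture a run of 2+ letters, then require the same run to appear again later (\1).
# Assumes every item is a single lowercase letter.
search = re.compile(r'([a-z]{2,}).+?,?.+?\1')
# Concatenate each inner list, then join the lists with commas: "abc,zab,ybc,za"
text = ','.join(''.join(s) for s in list_of_lists)
i = 0         # offset of the current suffix within the full string
matches = []  # absolute spans that were already recorded
results = []
while text:
    if result := search.search(text):
        # Shift the span by the offset so the same match is not recorded twice
        if (span := tuple(x + i for x in result.span())) not in matches:
            matches.append(span)
            results.append(list(result.group(1)))
    # Drop the first character and search again from the next position
    text = text[1:]
    i += 1
print(results)  # [['a', 'b'], ['b', 'c'], ['z', 'a']]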
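
The regex approach above relies on every item being a single lowercase letter. For items of any hashable type, a rough alternative sketch is to compare each pair of lists with `difflib.SequenceMatcher` from the standard library and keep the matching blocks of two or more items. This is not the answer's method, and the names `shared_runs` and `min_len` are made up for illustration. It still visits every pair of lists, and `get_matching_blocks()` reports non-overlapping aligned blocks, so it may not list every shared run when two lists overlap in several places.

```python
from difflib import SequenceMatcher
from itertools import combinations


def shared_runs(list_of_lists, min_len=2):
    """Collect contiguous runs of at least min_len items shared by any two lists.

    Hypothetical helper: order and duplicates are preserved because the
    lists are compared as sequences, never converted to sets.
    """
    found = []
    for first, second in combinations(list_of_lists, 2):
        # autojunk=False turns off the popularity heuristic used for long sequences
        matcher = SequenceMatcher(None, first, second, autojunk=False)
        for block in matcher.get_matching_blocks():
            # The final block is a size-0 sentinel, so this filter skips it too
            if block.size >= min_len:
                found.append(first[block.a:block.a + block.size])
    return found


list_of_lists = [["a", "b", "c"], ["z", "a", "b"], ["y", "b", "c"], ["z", "a"]]
print(shared_runs(list_of_lists))  # [['a', 'b'], ['b', 'c'], ['z', 'a']]
```

Passing `min_len=1` would also report single shared items, and a call such as `shared_runs([["a", "b", "a", "x"], ["q", "a", "b", "a"]])` returns the run with the repeated item intact.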

huangapple
  • Published on May 10, 2023 at 19:52:05
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76218064.html