英文:
finding LCP of two strings in constant time and linear space
问题
这是问题:
<br> 假设S是一组字符串,我们知道S中所有字符串的总长度为n。我们必须找到一个数据结构,其空间复杂度为O(n),可以在O(1)时间内找到LCP(s,t),其中LCP是字符串s,t之间的最长公共前缀。
<br> 起初,我认为可以使用哈希,因为我们可以在常数时间内检查数字,并在常数时间内找到子字符串,如果我们预先对字符串进行哈希处理。但我认为这可能不起作用,因为它需要更多的空间,经过一些搜索,我发现解决方案可能在于使用Trie树、后缀数组,可能还包括LCA和RMQ。我认为我已经接近答案,但不知道这些概念如何结合在一起以构建一个快速提供LCP的数据结构!
<br> 谢谢阅读
英文:
This is the problem:
<br> Suppose S is a set of strings and we know the total length of all strings in S in n. We must find a data structure with space O(n) that finds LCP(s,t) in O(1) which LCP is the longest common prefix between strings s,t.
<br>At first I thought I could use hashing since we can check numbers in constant time and find sub-strings in constant time if we prehashed strings.But I don't think this would work since it needs more space and after a little searching I found out that the solution probably lies in using Trie's,Suffix arrays and maybe LCA and RMQ. I think I'm close to the answer but don't know how these concepts can work together to make a data structure that give LCP fast!
<br> Thanks for reading
答案1
得分: 1
我认为我知道他们想要的答案。
首先,构建一个用于所有字符串的前缀树。Trie中的每个节点都可以包括指向以该前缀开头的字符串的指针和长度。将每个字符串映射到其在Trie中最终节点上。
现在,当给定一对字符串(假设您被告知它们是字符串i
和字符串j
)时,返回字符串的问题是找到最近的共同祖先,然后返回一对(指向字符串开头的指针, 长度)
。
但是,Trie可以被表示为一棵树,然后可以使用Tarjan的离线最近公共祖先算法(参见https://www.geeksforgeeks.org/tarjans-off-line-lowest-common-ancestors-algorithm/)来预处理该树以快速回答LCA问题。
这在技术上并不是O(1)
。但是,对于适应可观察宇宙的任何计算机来说,它是O(inverse_ackermann(n))
,可以视为一个相当小的常数。
英文:
I think that I know the answer they are looking for.
First, construct a trie for all of the strings. Each node in the trie can include a pointer to a string starting with that prefix, and a length. Map each string to the final node in the trie that that string winds up on.
Now when given a pair of strings (which presumably you are told as string i
and string j
) the problem of returning a string is a question of finding the least common ancestor, then returning the pair (pointer_to_start_of_string, length)
.
But a trie can be written as a tree, and then Tarjan's off line lowest Common Ancestors Algorithm (see https://www.geeksforgeeks.org/tarjans-off-line-lowest-common-ancestors-algorithm/) can be used to preprocess that tree to answer LCA questions very quickly.
It is technically not O(1)
. However it is O(inverse_ackermann(n))
which can be treated as a fairly small constant for any computer that fits in the observable universe.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论