finding LCP of two strings in constant time and linear space

huangapple go评论79阅读模式
英文:

finding LCP of two strings in constant time and linear space

问题

这是问题:
<br> 假设S是一组字符串,我们知道S中所有字符串的总长度为n。我们必须找到一个数据结构,其空间复杂度为O(n),可以在O(1)时间内找到LCP(s,t),其中LCP是字符串s,t之间的最长公共前缀。

<br> 起初,我认为可以使用哈希,因为我们可以在常数时间内检查数字,并在常数时间内找到子字符串,如果我们预先对字符串进行哈希处理。但我认为这可能不起作用,因为它需要更多的空间,经过一些搜索,我发现解决方案可能在于使用Trie树、后缀数组,可能还包括LCA和RMQ。我认为我已经接近答案,但不知道这些概念如何结合在一起以构建一个快速提供LCP的数据结构!

<br> 谢谢阅读

英文:

This is the problem:
<br> Suppose S is a set of strings and we know the total length of all strings in S in n. We must find a data structure with space O(n) that finds LCP(s,t) in O(1) which LCP is the longest common prefix between strings s,t.

<br>At first I thought I could use hashing since we can check numbers in constant time and find sub-strings in constant time if we prehashed strings.But I don't think this would work since it needs more space and after a little searching I found out that the solution probably lies in using Trie's,Suffix arrays and maybe LCA and RMQ. I think I'm close to the answer but don't know how these concepts can work together to make a data structure that give LCP fast!

<br> Thanks for reading

答案1

得分: 1

我认为我知道他们想要的答案。

首先,构建一个用于所有字符串的前缀树。Trie中的每个节点都可以包括指向以该前缀开头的字符串的指针和长度。将每个字符串映射到其在Trie中最终节点上。

现在,当给定一对字符串(假设您被告知它们是字符串i和字符串j)时,返回字符串的问题是找到最近的共同祖先,然后返回一对(指向字符串开头的指针, 长度)

但是,Trie可以被表示为一棵树,然后可以使用Tarjan的离线最近公共祖先算法(参见https://www.geeksforgeeks.org/tarjans-off-line-lowest-common-ancestors-algorithm/)来预处理该树以快速回答LCA问题。

这在技术上并不是O(1)。但是,对于适应可观察宇宙的任何计算机来说,它是O(inverse_ackermann(n)),可以视为一个相当小的常数。

英文:

I think that I know the answer they are looking for.

First, construct a trie for all of the strings. Each node in the trie can include a pointer to a string starting with that prefix, and a length. Map each string to the final node in the trie that that string winds up on.

Now when given a pair of strings (which presumably you are told as string i and string j) the problem of returning a string is a question of finding the least common ancestor, then returning the pair (pointer_to_start_of_string, length).

But a trie can be written as a tree, and then Tarjan's off line lowest Common Ancestors Algorithm (see https://www.geeksforgeeks.org/tarjans-off-line-lowest-common-ancestors-algorithm/) can be used to preprocess that tree to answer LCA questions very quickly.

It is technically not O(1). However it is O(inverse_ackermann(n)) which can be treated as a fairly small constant for any computer that fits in the observable universe.

huangapple
  • 本文由 发表于 2020年1月4日 00:28:59
  • 转载请务必保留本文链接:https://go.coder-hub.com/59582024.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定