2023-05-22 08:36:10

Find the starting indexes of all occurrences of one string in another in O(N) time

Question

Given two strings, str1 and str2, find all the starting indexes of str1 in str2. Example:
Input: str1 = abc, str2 = abckdabcgfacabc
Output: [0, 5, 12]

public static List<Integer> firstMatchingIndexes(String str1, String str2) {
    List<Integer> indexes = new ArrayList<>();
    int end = 0, n = str2.length();

    for (; end < n; end++) {
        // indexOf returns -1 when there is no match, so test it before adding the offset
        int offset = str2.substring(end, n).indexOf(str1);
        if (offset == -1)
            break;
        int index = offset + end;
        indexes.add(index);
        // Resume after this match (the loop's end++ steps past its last character)
        end = index + str1.length() - 1;
    }

    return indexes;
}

But this approach uses indexOf(), which itself takes O(N) time per call. Can the KMP algorithm work here?
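For comparison, here is a minimal sketch of the same scan using the `String.indexOf(String, int)` overload, which avoids copying a substring on every iteration. The class and method names (`IndexOfScan`, `allIndexes`) are hypothetical, returning an empty list for an empty pattern is an assumption, and this version steps forward by one so that overlapping occurrences are also reported. It does not make the search O(N): in common JDK implementations `indexOf` is a straightforward scan, so the worst case stays proportional to the product of the two lengths.

```java
import java.util.ArrayList;
import java.util.List;

final class IndexOfScan {
    // Hypothetical helper: collects every start index of pattern in text,
    // including overlapping occurrences.
    static List<Integer> allIndexes(String pattern, String text) {
        List<Integer> indexes = new ArrayList<>();
        if (pattern.isEmpty()) {
            return indexes; // assumption: an empty pattern yields no matches
        }
        int from = 0;
        while (true) {
            // Search from position 'from' without creating a substring copy
            int index = text.indexOf(pattern, from);
            if (index == -1) {
                break;
            }
            indexes.add(index);
            from = index + 1; // step by one so overlapping matches are found
        }
        return indexes;
    }

    public static void main(String[] args) {
        System.out.println(allIndexes("abc", "abckdabcgfacabc")); // [0, 5, 12]
    }
}
```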

Answer 1

Score: 1

I came up with the following algorithm.

Correct me if I'm wrong, but I believe this runs in O(n).
As for the KMP algorithm, I'm unsure.

List<Integer> firstMatchingIndexes(String stringA, String stringB) {
    List<Integer> indices = new ArrayList<>();
    // indexA: next position of stringA to match; indexB: current position in stringB
    int indexA = 0, indexB = 0;
    for (char character : stringB.toCharArray()) {
        if (character == stringA.charAt(indexA)) {
            if (indexA != stringA.length() - 1) {
                indexA++;
            } else {
                // Matched the last character of stringA: record where the match began
                indices.add(indexB - (stringA.length() - 1));
                indexA = 0;
            }
        } else {
            // Mismatch: start over from the beginning of stringA
            // (the current character is not compared again)
            indexA = 0;
        }
        indexB++;
    }
    return indices;
}

Output

[0, 5, 12]
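One way to sanity-check a single-pass scan like the one above is to compare it against a brute-force reference on patterns with a repeated prefix. A scan that restarts from the beginning of the pattern on a mismatch, without re-examining the current character, can miss such matches: for "aab" in "aaab" the occurrence at index 1 is skipped. The sketch below is only a test helper; the names `BruteForceCheck` and `naiveIndexes` are hypothetical, and returning an empty list for an empty pattern is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class BruteForceCheck {
    // Hypothetical reference implementation: tries every start position.
    // Worst case is O(n * k), so it is meant for testing, not production use.
    static List<Integer> naiveIndexes(String pattern, String text) {
        List<Integer> indexes = new ArrayList<>();
        int k = pattern.length();
        if (k == 0) {
            return indexes; // assumption: an empty pattern yields no matches
        }
        for (int start = 0; start + k <= text.length(); start++) {
            // Compare text[start, start + k) with the whole pattern
            if (text.regionMatches(start, pattern, 0, k)) {
                indexes.add(start);
            }
        }
        return indexes;
    }

    public static void main(String[] args) {
        System.out.println(naiveIndexes("abc", "abckdabcgfacabc")); // [0, 5, 12]
        System.out.println(naiveIndexes("aab", "aaab"));            // [1]
    }
}
```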

Answer 2

Score: 1

Yes, the Knuth–Morris–Pratt algorithm can be of use here.

The Wikipedia article on the Knuth–Morris–Pratt algorithm provides pseudocode for the algorithm. Here is that pseudocode ported to Java:

    static int[] kmpTable(String pattern) {
        int n = pattern.length();
        int[] partialMatchTable = new int[n + 1];
        int j = 0; // index of the next character of the current candidate prefix
        partialMatchTable[0] = -1;
        for (int i = 1; i < n; i++, j++) {
            if (pattern.charAt(i) == pattern.charAt(j)) {
                partialMatchTable[i] = partialMatchTable[j];
            } else {
                partialMatchTable[i] = j;
                while (j >= 0 && pattern.charAt(i) != pattern.charAt(j)) {
                    j = partialMatchTable[j];
                }
            }
        }
        partialMatchTable[n] = j;
        return partialMatchTable;
    }

    static List<Integer> kmpSearch(String needle, String haystack) {
        List<Integer> matches = new ArrayList<>();
        int m = haystack.length();
        int n = needle.length();
        // Added this so the table (O(n) work) is not built for a needle that cannot match.
        // The n == 0 guard is an extra assumption: an empty needle yields no matches.
        if (n == 0 || n > m) {
            return matches; // Return empty list when needle is too large
        }
        int[] partialMatchTable = kmpTable(needle);
        int j = 0, k = 0; // j: position in haystack, k: position in needle
        while (j < m) {
            if (needle.charAt(k) == haystack.charAt(j)) {
                j++;
                k++;
                if (k == n) {
                    matches.add(j - k); // full match ending just before j
                    k = partialMatchTable[k];
                }
            } else {
                k = partialMatchTable[k];
                if (k < 0) {
                    j++;
                    k++;
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println("matches: " + kmpSearch("abc", "abckdabcgfacabc"));
    }

This outputs:

matches: [0, 5, 12]

Wikipedia says about the time complexity:

> the Knuth–Morris–Pratt algorithm has complexity O(𝑛), where 𝑛 is the length of 𝑆.

And when taking into account the time needed for building the partial-match table for a pattern of length 𝑘:

> Since the two portions of the algorithm have, respectively, complexities of O(𝑘) and O(𝑛), the complexity of the overall algorithm is O(𝑛 + 𝑘).

However, we can say it is O(𝑛) when 𝑘 ≤ 𝑛 (in the code above, 𝑛 corresponds to `m`, the haystack length, and 𝑘 to `n`, the needle length). When the pattern is longer than the string being searched, the algorithm can skip building the partial-match table and return an empty list immediately (see the commented check in the code, which is not present in Wikipedia's pseudocode). That way it is always O(𝑛).
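To make the linear bound more concrete, here is a small sketch that counts character comparisons during the search phase. It uses the common prefix-function ("failure function") formulation of KMP rather than the table from the Wikipedia pseudocode, and all names (`KmpCount`, `failure`, `search`, `same`, `comparisons`) are hypothetical. In this particular formulation the search performs at most 3𝑛 comparisons for a text of length 𝑛, regardless of the pattern.

```java
import java.util.ArrayList;
import java.util.List;

final class KmpCount {
    // Number of character comparisons made by the most recent search() call
    static long comparisons;

    // f[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i]
    static int[] failure(String pattern) {
        int[] f = new int[pattern.length()];
        for (int i = 1, len = 0; i < pattern.length(); i++) {
            while (len > 0 && pattern.charAt(i) != pattern.charAt(len)) {
                len = f[len - 1];
            }
            if (pattern.charAt(i) == pattern.charAt(len)) {
                len++;
            }
            f[i] = len;
        }
        return f;
    }

    static List<Integer> search(String pattern, String text) {
        List<Integer> matches = new ArrayList<>();
        comparisons = 0;
        int k = pattern.length();
        if (k == 0 || k > text.length()) {
            return matches; // assumption: empty or over-long patterns yield no matches
        }
        int[] f = failure(pattern);
        int len = 0; // number of pattern characters currently matched
        for (int i = 0; i < text.length(); i++) {
            while (len > 0 && !same(text.charAt(i), pattern.charAt(len))) {
                len = f[len - 1]; // fall back without moving i backwards
            }
            if (same(text.charAt(i), pattern.charAt(len))) {
                len++;
            }
            if (len == k) {
                matches.add(i - k + 1);
                len = f[len - 1]; // keep going so overlapping matches are found
            }
        }
        return matches;
    }

    // Compares two characters and records that a comparison happened
    static boolean same(char a, char b) {
        comparisons++;
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(search("abc", "abckdabcgfacabc")); // [0, 5, 12]
        System.out.println("comparisons: " + comparisons);
    }
}
```

The text index `i` never moves backwards, and every fallback step undoes an earlier increment of `len`, which is why the total work stays proportional to the text length.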
