Implementation of Rabin-Karp string matching (Rolling hash)
Question
I am trying to implement the Rabin-Karp string matching algorithm to find the String needle in the String haystack (returning the index in haystack where a match for needle was found). I get a wrong result when searching for the needle "c" in the haystack "abc".
Here is the output of my code when searching for "c" in "abc":
hayHash: 0 and needleHash: 2
at i: 0 remove: a and add: b and new hash is before checking for negatives: 1
hayHash: 1 and needleHash: 2
at i: 1 remove: b and add: c and new hash is before checking for negatives: -50
hayHash: 51 and needleHash: 2
I can't figure out why the last hash update yields -50 and not 2 for hayHash. At that point both hayHash and needleHash should be the hash of the single char 'c', so they should have the same value. But my code recalculates hayHash to 51 (-50 before correcting the negative value) instead of 2.
Any suggestions on what might be wrong in my code?
Here is my code:
private fun find(haystack: String, needle: String): Int {
    if (needle.length > haystack.length) return -1
    val q = 101
    val d = 256
    var needleHash = 0
    var hayHash = 0
    var hash = 1
    for (i in 0..needle.length)
        hash = (hash * d) % q
    for (i in 0..needle.lastIndex) {
        needleHash = (d * needleHash + (needle[i] - 'a')) % q
        hayHash = (d * hayHash + (haystack[i] - 'a')) % q
    }
    for (i in 0..(haystack.length - needle.length)) {
        println("hayHash: $hayHash and needleHash: $needleHash")
        if (hayHash == needleHash) {
            for (j in 0..needle.lastIndex) {
                if (haystack[i + j] != needle[j])
                    break
                if (j == needle.lastIndex)
                    return i
            }
        }
        if (i == haystack.length - needle.length)
            break
        print("at i: $i remove: ${haystack[i]} and add: ${haystack[i + needle.length]}")
        hayHash = (d * (hayHash - (haystack[i] - 'a') * hash) + (haystack[i + needle.length] - 'a')) % q
        println(" and new hash is before checking for negatives: $hayHash")
        if (hayHash < 0)
            hayHash += q
    }
    return -1
}
Answer 1
Score: 2
The loop for (i in 0..needle.length) that initializes hash contains an off-by-one error: it runs needle.length + 1 times. You want for (i in 0..needle.lastIndex).
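To make the off-by-one concrete, here is a small sketch that shows which power of d each range produces. The helper name powerOfD is hypothetical; d = 256 and q = 101 are the values from the question. Note that the exponent has to match the rolling update that uses it: needle.length multiplications suit an update of the form d * hayHash - leaving * hash + entering, whereas the question's original update, which multiplies the whole difference by d, would need needle.length - 1 multiplications instead.

```kotlin
// Hypothetical helper: multiplies by d once per element of the range, modulo q.
fun powerOfD(range: IntRange, d: Int = 256, q: Int = 101): Int {
    var hash = 1
    for (i in range) hash = (hash * d) % q
    return hash
}

fun main() {
    val needle = "c"
    // 0..needle.length has needle.length + 1 elements, giving d^2 mod q here.
    println(powerOfD(0..needle.length))    // prints 88
    // 0..needle.lastIndex has needle.length elements, giving d^1 mod q here.
    println(powerOfD(0..needle.lastIndex)) // prints 54
}
```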
Answer 2
Score: 1
One part is pre-computing the wrong power of d.
The other is that the update multiplies the whole difference hayHash - (haystack[i] - 'a') * hash by d, instead of multiplying only the current hayHash by d.
Try hayHash = (d * hayHash - ((haystack[i] - 'a') * hash) + (haystack[i + needle.length] - 'a')) % q (with hash holding d raised to the power needle.length, modulo q).
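Putting both answers together, a corrected version might look like the sketch below. It keeps the question's q = 101, d = 256 and the c - 'a' character mapping. The names findRabinKarp and nonNegativeMod, the empty-needle early return, the startsWith verification, and normalizing every hash to a non-negative value are my own assumptions rather than anything from the original posts.

```kotlin
// Assumed helper: remainder in 0 until q even when x is negative.
private fun nonNegativeMod(x: Int, q: Int): Int = ((x % q) + q) % q

fun findRabinKarp(haystack: String, needle: String): Int {
    val m = needle.length
    if (m > haystack.length) return -1
    if (m == 0) return 0 // assumption: an empty needle matches at index 0

    val q = 101
    val d = 256

    // hash = d^m mod q: exactly m multiplications (the off-by-one fix).
    var hash = 1
    repeat(m) { hash = (hash * d) % q }

    // Hash the needle and the first window of the haystack.
    var needleHash = 0
    var hayHash = 0
    for (i in 0 until m) {
        needleHash = nonNegativeMod(d * needleHash + (needle[i] - 'a'), q)
        hayHash = nonNegativeMod(d * hayHash + (haystack[i] - 'a'), q)
    }

    for (i in 0..haystack.length - m) {
        // Equal hashes only suggest a match, so verify the characters.
        if (hayHash == needleHash && haystack.startsWith(needle, i)) return i
        if (i == haystack.length - m) break
        val leaving = haystack[i] - 'a'
        val entering = haystack[i + m] - 'a'
        // Shift the window: multiply only hayHash by d, then drop the
        // leaving character (scaled by d^m) and add the entering one.
        hayHash = nonNegativeMod(d * hayHash - leaving * hash + entering, q)
    }
    return -1
}
```

With this sketch, findRabinKarp("abc", "c") returns 2: the last update computes 256 * 1 - 1 * 54 + 2 = 204, and 204 mod 101 is 2, which equals needleHash.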