Golang regexp MatchString() isn't idempotent
Question
I'm not sure what's happening. The same function with the same input returns different results when using Go's regexp library.
package main

import (
	"fmt"
	"regexp"
)

type PaymentNetworkData struct {
	Regex string
	Name  string
}

var PAYMENT_NETWORKS = map[string]PaymentNetworkData{
	"Mastercard": {
		Regex: "^5[1-5][0-9]{14}|^(222[1-9]|22[3-9]\\d|2[3-6]\\d{2}|27[0-1]\\d|2720)[0-9]{12}$",
		Name:  "Mastercard",
	},
	"VisaMaster": {
		Regex: "^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14})$",
		Name:  "VisaMaster",
	},
}

func resolvePaymentNetwork(cardIn string) string {
	payNet := "Unknown"
	for _, v := range PAYMENT_NETWORKS {
		regex := regexp.MustCompile(v.Regex)
		if regex.MatchString(cardIn) {
			payNet = v.Name
		}
	}
	return payNet
}

func main() {
	in := "5103901404433835"
	for i := 1; i < 100; i++ {
		payNet := resolvePaymentNetwork(in)
		fmt.Println("Payment Network is: ", payNet)
	}
}
Input: 5103901404433835
Regex:
Mastercard: ^5[1-5][0-9]{14}|^(222[1-9]|22[3-9]\\d|2[3-6]\\d{2}|27[0-1]\\d|2720)[0-9]{12}$
VisaMaster: ^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14})$
Golang Output:
Payment Network is: VisaMaster
Payment Network is: Mastercard
Payment Network is: Mastercard
Payment Network is: VisaMaster
Payment Network is: VisaMaster
Payment Network is: VisaMaster
Payment Network is: VisaMaster
Payment Network is: VisaMaster
Payment Network is: VisaMaster
I tested the same code with NodeJS and in this case the result was always the same.
JS Output:
Payment Network is: VisaMaster
Payment Network is: VisaMaster
Payment Network is: VisaMaster
Payment Network is: VisaMaster
Payment Network is: VisaMaster
Payment Network is: VisaMaster
Payment Network is: VisaMaster
Payment Network is: VisaMaster
Payment Network is: VisaMaster
Payment Network is: VisaMaster
- Go Code demo: https://go.dev/play/p/uP4G0jOwLV_L
- JS Code demo: https://playcode.io/1450178
Answer 1
Score: 6
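Your code has a couple of problems:
- it uses a map for no apparent reason,
- both regexes match the supplied card number.
Because the loop keeps overwriting payNet on every match, the last matching entry wins. Combined with the fact that iteration over a map is not guaranteed to produce the same order from one loop to the next, this makes the function non-deterministic (what the question calls non-idempotent).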
Here is corrected code:
package main

import (
	"fmt"
	"regexp"
)

type PaymentNetworkData struct {
	Regex *regexp.Regexp
	Name  string
}

var PAYMENT_NETWORKS = [2]PaymentNetworkData{
	{
		Regex: regexp.MustCompile("^(?:5[1-5][0-9]{14}|(?:222[1-9]|22[3-9]\\d|2[3-6]\\d{2}|27[0-1]\\d|2720)[0-9]{12})$"),
		Name:  "Mastercard",
	},
	{
		Regex: regexp.MustCompile("^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14})$"),
		Name:  "VisaMaster",
	},
}

func resolvePaymentNetwork(cardIn string) string {
	for _, v := range PAYMENT_NETWORKS {
		if v.Regex.MatchString(cardIn) {
			return v.Name
		}
	}
	return "Unknown"
}

func main() {
	in := "5103901404433835"
	for i := 1; i < 100; i++ {
		payNet := resolvePaymentNetwork(in)
		fmt.Println("Payment Network is: ", payNet)
	}
}
It uses an array instead of a map to guarantee the order.
Also, I've changed your structure to compile the regexes only once.
It outputs Payment Network is: Mastercard every time.
Demo here.
Note that it still uses the same regexes (with the correction recommended by @WiktorStribiżew in the comments). They don't look very good, especially this part: (?:4[0-9]{12}(?:[0-9]{3})? - it will match 13-digit numbers too.
You'd better check the expected formats for card numbers and correct the expressions accordingly.