Golang regexp MatchString() isn't idempotent

Question

I'm not sure what's happening. The same function with the same input returns different results when using Go's regexp library.

  • Go code demo: https://go.dev/play/p/uP4G0jOwLV_L
  • JS code demo: https://playcode.io/1450178

package main

import (
	"fmt"
	"regexp"
)

type PaymentNetworkData struct {
	Regex string
	Name  string
}

var PAYMENT_NETWORKS = map[string]PaymentNetworkData{
	"Mastercard": {
		Regex: "^5[1-5][0-9]{14}|^(222[1-9]|22[3-9]\\d|2[3-6]\\d{2}|27[0-1]\\d|2720)[0-9]{12}$",
		Name:  "Mastercard",
	},
	"VisaMaster": {
		Regex: "^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14})$",
		Name:  "VisaMaster",
	},
}

func resolvePaymentNetwork(cardIn string) string {
	payNet := "Unknown"
	for _, v := range PAYMENT_NETWORKS {
		regex := regexp.MustCompile(v.Regex)

		if regex.MatchString(cardIn) {
			payNet = v.Name
		}
	}
	return payNet
}

func main() {

	in := "5103901404433835"

	for i := 1; i < 100; i++ {
		payNet := resolvePaymentNetwork(in)
		fmt.Println("Payment Network is: ", payNet)
	}
}

Input: 5103901404433835

Regex:

Mastercard: ^5[1-5][0-9]{14}|^(222[1-9]|22[3-9]\\d|2[3-6]\\d{2}|27[0-1]\\d|2720)[0-9]{12}$
VisaMaster: ^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14})$

Golang Output:

Payment Network is:  VisaMaster
Payment Network is:  Mastercard
Payment Network is:  Mastercard
Payment Network is:  VisaMaster
Payment Network is:  VisaMaster
Payment Network is:  VisaMaster
Payment Network is:  VisaMaster
Payment Network is:  VisaMaster
Payment Network is:  VisaMaster

I tested the same code with Node.js, and in that case the result was always the same.

JS Output:

Payment Network is:  VisaMaster
Payment Network is:  VisaMaster
Payment Network is:  VisaMaster
Payment Network is:  VisaMaster
Payment Network is:  VisaMaster
Payment Network is:  VisaMaster
Payment Network is:  VisaMaster
Payment Network is:  VisaMaster
Payment Network is:  VisaMaster
Payment Network is:  VisaMaster

Answer 1

Score: 6

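Your code has a couple of problems:

  1. it uses a map for no apparent reason,
  2. both regexes match the supplied card number.

These problems, combined with the fact that Go does not guarantee the iteration order of a map, make the function non-deterministic (what the question calls non-idempotent): the loop keeps the last match, so the result depends on which entry happens to be visited last. MatchString itself returns the same result every time.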
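To check that only the visiting order matters, here is a minimal sketch that replaces the map with two explicit orderings. The names network and lastMatch are made up for illustration; the patterns are the two from the question, written as raw strings.

```go
package main

import (
	"fmt"
	"regexp"
)

// The two patterns from the question, written as raw strings.
var (
	mastercard = regexp.MustCompile(`^5[1-5][0-9]{14}|^(222[1-9]|22[3-9]\d|2[3-6]\d{2}|27[0-1]\d|2720)[0-9]{12}$`)
	visaMaster = regexp.MustCompile(`^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14})$`)
)

// network is a hypothetical helper type pairing a name with its pattern.
type network struct {
	name string
	re   *regexp.Regexp
}

// lastMatch mimics the loop in the question: it never stops early,
// so the last matching entry in the visiting order wins.
func lastMatch(order []network, card string) string {
	result := "Unknown"
	for _, n := range order {
		if n.re.MatchString(card) {
			result = n.name
		}
	}
	return result
}

func main() {
	card := "5103901404433835"
	mc := network{name: "Mastercard", re: mastercard}
	vm := network{name: "VisaMaster", re: visaMaster}

	// Each pattern gives the same answer on every call.
	fmt.Println(mastercard.MatchString(card), visaMaster.MatchString(card)) // true true

	// Only the visiting order changes the result.
	fmt.Println(lastMatch([]network{mc, vm}, card)) // VisaMaster
	fmt.Println(lastMatch([]network{vm, mc}, card)) // Mastercard
}
```

Ranging over a map is like picking one of these two orderings at random on each call, which would explain the mixed output in the question.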

Here is the corrected code:

package main

import (
	"fmt"
	"regexp"
)

// PaymentNetworkData pairs a network name with its precompiled pattern.
type PaymentNetworkData struct {
	Regex *regexp.Regexp
	Name  string
}

// An array keeps a fixed iteration order; the regexes are compiled once at startup.
var PAYMENT_NETWORKS = [2]PaymentNetworkData{
	{
		Regex: regexp.MustCompile("^(?:5[1-5][0-9]{14}|(?:222[1-9]|22[3-9]\\d|2[3-6]\\d{2}|27[0-1]\\d|2720)[0-9]{12})$"),
		Name:  "Mastercard",
	},
	{
		Regex: regexp.MustCompile("^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14})$"),
		Name:  "VisaMaster",
	},
}

// resolvePaymentNetwork returns the name of the first network whose pattern matches.
func resolvePaymentNetwork(cardIn string) string {
	for _, v := range PAYMENT_NETWORKS {
		if v.Regex.MatchString(cardIn) {
			return v.Name
		}
	}
	return "Unknown"
}

func main() {
	in := "5103901404433835"

	for i := 1; i < 100; i++ {
		payNet := resolvePaymentNetwork(in)
		fmt.Println("Payment Network is: ", payNet)
	}
}

It uses an array instead of a map to guarantee the iteration order, and it returns on the first match.

I've also changed your struct so that each regex is compiled only once.

It outputs Payment Network is: Mastercard every time.
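If the table has to remain a map for other reasons, one way to get the same stable result is to visit the keys in sorted order. The following is only a sketch under that assumption; networks, sortedNames and resolveSorted are hypothetical names, and the patterns are the ones from the corrected code.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// networks is a hypothetical map-based table using the corrected patterns.
var networks = map[string]*regexp.Regexp{
	"Mastercard": regexp.MustCompile(`^(?:5[1-5][0-9]{14}|(?:222[1-9]|22[3-9]\d|2[3-6]\d{2}|27[0-1]\d|2720)[0-9]{12})$`),
	"VisaMaster": regexp.MustCompile(`^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14})$`),
}

// sortedNames returns the map keys in ascending order, which gives a
// fixed visiting order regardless of how the map is iterated.
func sortedNames() []string {
	names := make([]string, 0, len(networks))
	for name := range networks {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// resolveSorted returns the first network, in sorted key order, that matches.
func resolveSorted(card string) string {
	for _, name := range sortedNames() {
		if networks[name].MatchString(card) {
			return name
		}
	}
	return "Unknown"
}

func main() {
	for i := 0; i < 3; i++ {
		fmt.Println(resolveSorted("5103901404433835")) // Mastercard
	}
}
```

Sorting by name is an arbitrary priority rule. When one network should take precedence over another, an explicit ordered list such as the array above states that intent more directly.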


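Note that it still uses the same regexes, with the correction recommended by @WiktorStribiżew in the comments: the Mastercard alternatives are now wrapped in ^(?:...)$, so both anchors apply to each alternative. In the original pattern, $ belonged only to the second alternative, so the first one matched any string that merely starts with 16 suitable digits. The regexes still don't look very good, especially the part 4[0-9]{12}(?:[0-9]{3})?, which also matches 13-digit numbers.
You had better check the expected formats for card numbers and correct the expressions accordingly.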

huangapple
  • Posted on April 25, 2023, 15:55:05
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76098819.html